
Map conversion fails for decimal types in test_all_types Iceberg table #2996

Open
andrewgazelka opened this issue Oct 3, 2024 · 0 comments
Labels
bug Something isn't working p1 Important to tackle soon, but preemptable by p0

Comments

@andrewgazelka
Member

andrewgazelka commented Oct 3, 2024

Map conversion fails for decimal types in test_all_types Iceberg table

Description

The test_daft_iceberg_table_collect_correct test is failing for the test_all_types table. The error occurs when converting a Map with decimal values to an Arrow array, and it appears to stem from a mismatch in decimal precision and scale between the Iceberg schema and the Daft/Arrow representation.

Error Message

pyo3_runtime.PanicException: called `Result::unwrap()` on an `Err` value: InvalidArgumentError("MapArray expects `field.data_type` to match its inner DataType, but found \nStruct([Field { name: \"key\", data_type: Int64, is_nullable: true, metadata: {} }, Field { name: \"value\", data_type: Decimal(32, 32), is_nullable: true, metadata: {} }])\nvs\n\n\nField { name: \"entries\", data_type: Struct([Field { name: \"key\", data_type: Int64, is_nullable: true, metadata: {} }, Field { name: \"value\", data_type: Decimal(10, 2), is_nullable: true, metadata: {} }]), is_nullable: true, metadata: {} }")
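
Written out with pyarrow types (a sketch of the two conflicting types only, not of Daft's internal arrow2 structures), the panic is saying that the map's child array and its declared entries field disagree on the value type:

```python
import pyarrow as pa

# The struct array that was actually built as the map's children:
# struct<key: int64, value: decimal(32, 32)>
built = pa.struct([
    pa.field("key", pa.int64()),
    pa.field("value", pa.decimal128(32, 32)),
])

# The "entries" field declared on the MapArray:
# struct<key: int64, value: decimal(10, 2)>
declared = pa.field(
    "entries",
    pa.struct([
        pa.field("key", pa.int64()),
        pa.field("value", pa.decimal128(10, 2)),
    ]),
)

# arrow2 requires these to match exactly; the differing precision/scale on
# "value" is what triggers the InvalidArgumentError above.
assert not built.equals(declared.type)
```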

Context

  • The test_all_types table was temporarily commented out of the WORKING_SHOW_COLLECT list in a previous PR to work around this failure.
  • The problem appears to lie in the round-trip conversion of decimal types, particularly when they are nested inside Map data structures.
  • It is likely connected to the recent changes around recursively converting nested types.

Steps to Reproduce

  1. Run the Iceberg integration tests with pytest tests/integration/iceberg -m 'integration' (a minimal standalone reproduction is sketched after this list).
  2. Observe the failure in test_daft_iceberg_table_collect_correct for the test_all_types table.
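
For a standalone reproduction outside the test suite, something along these lines should hit the same panic. The catalog name, URI, and table identifier are assumptions about the local Iceberg integration setup and may need adjusting:

```python
# Hypothetical standalone reproduction; the catalog settings and the table
# identifier ("default.test_all_types") are assumptions, not exact values
# from the test suite.
import daft
from pyiceberg.catalog import load_catalog

catalog = load_catalog("default", **{"uri": "http://localhost:8181"})
tbl = catalog.load_table("default.test_all_types")

df = daft.read_iceberg(tbl)
df.collect()  # panics with the MapArray / decimal mismatch described above
```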

Expected Behavior

The test should pass, correctly handling the decimal types within the Map structure.

Actual Behavior

The test fails because the declared and materialized decimal types disagree: the map's entries field declares its values as Decimal(10, 2), while the inner struct array is built with Decimal(32, 32).

Possible Solutions

  1. Investigate the decimal type conversion process in the Iceberg to Daft/Arrow pipeline.
  2. Consider adding a mechanism to preserve the original decimal precision and scale during the conversion (a rough sketch follows below).
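
To illustrate the second option, here is a rough sketch of a precision-preserving type mapping. This is not Daft's actual converter; iceberg_decimal_to_daft is a hypothetical helper:

```python
# Rough illustration of carrying Iceberg decimal precision/scale through to
# the corresponding Daft type; iceberg_decimal_to_daft is a hypothetical
# helper, not part of Daft's conversion code.
from pyiceberg.types import DecimalType
from daft import DataType

def iceberg_decimal_to_daft(t: DecimalType) -> DataType:
    # Keep the declared precision and scale unchanged rather than
    # substituting a default such as Decimal(32, 32).
    return DataType.decimal128(t.precision, t.scale)

# e.g. the value type of a map<long, decimal(10, 2)> column
value_type = iceberg_decimal_to_daft(DecimalType(precision=10, scale=2))
map_type = DataType.map(DataType.int64(), value_type)
```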
@jaychia jaychia added the bug Something isn't working label Oct 7, 2024
@andrewgazelka andrewgazelka added the p1 Important to tackle soon, but preemptable by p0 label Oct 7, 2024