Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Misleading query results due to likely internal parsing differences #12655

Open
davisp opened this issue Sep 27, 2024 · 4 comments
Open

Misleading query results due to likely internal parsing differences #12655

davisp opened this issue Sep 27, 2024 · 4 comments
Assignees
Labels
bug Something isn't working

Comments

@davisp
Copy link
Member

davisp commented Sep 27, 2024

Describe the bug

I ran across a query where select * from foo where col < 1.3 returns values rendered as 1.3. My guess is that this is either a subtle parsing difference or possibly the result of a differently ordered type cast. If you change the column type from float to double the issue no longer occurs.

Also, just to preempt anyone jumping to floating point values are approximations, I'm fully aware. I know that this issue is likely caused either by the miniscule difference between f32::from_str and f64::from_str or f32 as f64 vs f64 as f32 or some combination thereof. However, given that this is all one parser, it feels like a bug that we've parsed two different values for the same string.

To Reproduce

DataFusion CLI v42.0.0
> create table test(col float);
0 row(s) fetched.
Elapsed 0.008 seconds.

> insert into test values (1.1), (1.2), (1.3), (1.4), (1.5);
+-------+
| count |
+-------+
| 5     |
+-------+
1 row(s) fetched.
Elapsed 0.007 seconds.

> select * from test where col > 1.3;
+-----+
| col |
+-----+
| 1.4 |
| 1.5 |
+-----+
2 row(s) fetched.
Elapsed 0.004 seconds.

> select * from test where col < 1.3
;
+-----+
| col |
+-----+
| 1.1 |
| 1.2 |
| 1.3 |
+-----+
3 row(s) fetched.
Elapsed 0.002 seconds.

Expected behavior

I would expect that 1.3 == 1.3.

Additional context

No response

@davisp davisp added the bug Something isn't working label Sep 27, 2024
@Eason0729
Copy link
Contributor

I guess datafusion use f64 by default(you need CAST(1.3 AS float)).

#[tokio::test]
async fn dev_test()->Result<()>{
    let ctx = SessionContext::new();

    ctx.sql("create table test(col float);").await?.show().await?;
    ctx.sql("insert into test values (1.1), (1.2), (1.3), (1.4), (1.5);").await?.show().await?;
    ctx.sql("select * from test where col > CAST(1.3 AS float);").await?.show().await?;
    ctx.sql("select * from test where col < CAST(1.3 AS float)").await?.show().await?;

    Ok(())
}

It print out following

++
++
+-------+
| count |
+-------+
| 5     |
+-------+
+-----+
| col |
+-----+
| 1.4 |
| 1.5 |
+-----+
+-----+
| col |
+-----+
| 1.1 |
| 1.2 |
+-----+

Tell me if it works in datafusion-cli.

@doupache
Copy link
Contributor

@Eason0729
Are you working on this issue? If you are not planning on working on this, I would like to work on it 😊

@Eason0729
Copy link
Contributor

@doupache
I am not planing to work on this, feel free to assign yourself.

@doupache
Copy link
Contributor

take

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants