Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(rust/drivers): adbc driver for datafusion #2267

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

tokoko
Copy link
Contributor

@tokoko tokoko commented Oct 21, 2024

Fixes #2263

An initial driver implementation for datafusion:

  • bunch of todos everywhere
  • options are not supported
  • get_objects ignores filters
  • get_objects takes depth into account, but only when returning results, not for info retrieval
  • minimal testing

P.S. I'm not much of a rust person, so I'd appreciate any kind of pointers

@github-actions github-actions bot added this to the ADBC Libraries 15 milestone Oct 21, 2024
@tokoko tokoko changed the title feat(rust/driver/datafusion): adbc driver for datafusion feat(rust/drivers/datafusion): adbc driver for datafusion Oct 21, 2024
@tokoko tokoko changed the title feat(rust/drivers/datafusion): adbc driver for datafusion feat(rust/drivers): adbc driver for datafusion Oct 21, 2024
Copy link
Contributor

@mbrobbel mbrobbel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very cool! I'll split my review over multiple chunks.

The failing test is known to be flaky.

rust/Cargo.toml Outdated
@@ -34,6 +34,7 @@ categories = ["database"]

[workspace.dependencies]
adbc_core = { path = "./core" }
arrow = { version = "53.1.0", default-features = false }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of adding arrow we can add the arrow subcrates to reduce the size of our dependency tree.
It looks like we need to add:

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks. makes sense, didn't realize most of these functions could be imported from multiple crates. Added both as a dev-dependency.

rust/drivers/datafusion/README.md Outdated Show resolved Hide resolved
rust/drivers/datafusion/src/lib.rs Outdated Show resolved Hide resolved
rust/drivers/datafusion/tests/test_datafusion.rs Outdated Show resolved Hide resolved
@mbrobbel
Copy link
Contributor

To fix the failing CI job we need to install protoc (required by the substrait crate):

You could add

- name: Install Protoc
  uses: arduino/setup-protoc@v3
  with:
    repo-token: ${{ secrets.GITHUB_TOKEN }}

as a step after

- name: Use stable Rust
id: rust
run: |
rustup toolchain install stable --no-self-update
rustup default stable
.

@tokoko
Copy link
Contributor Author

tokoko commented Oct 23, 2024

@mbrobbel thanks, got everything green. I had to temporarily add #![allow(refining_impl_trait)] to lib.rs, couldn't get unimplemented methods that return Result<impl RecordBatchReader + Send> to pass clippy checks otherwise.

Copy link
Contributor

@mbrobbel mbrobbel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this looks good as an initial commit for this driver. The todo!()s and unwraps can be handled in follow-up PRs.

rust/Cargo.toml Outdated Show resolved Hide resolved
rust/drivers/datafusion/Cargo.toml Outdated Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ADBC driver for DataFusion
2 participants