test: test write methods against more varied layouts #2233

Merged: 6 commits, Apr 26, 2024

@wjones127 wjones127 commented Apr 19, 2024

Adds a test utility that generates a more hostile dataset layout, and uses it in one test function. Follow-up PRs will apply it to more tests.

codecov-commenter commented Apr 19, 2024

Codecov Report

Attention: Patch coverage is 93.72760%, with 35 lines in your changes missing coverage. Please review.

Project coverage is 81.42%. Comparing base (dbfb640) to head (dc56e20).

| Files | Patch % | Lines |
|---|---|---|
| rust/lance/src/dataset/fragment/write.rs | 85.27% | 13 Missing and 16 partials ⚠️ |
| rust/lance/src/utils/test.rs | 99.04% | 0 Missing and 3 partials ⚠️ |
| rust/lance/src/dataset/write.rs | 0.00% | 2 Missing ⚠️ |
| rust/lance/src/dataset.rs | 95.65% | 0 Missing and 1 partial ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2233      +/-   ##
==========================================
+ Coverage   81.23%   81.42%   +0.18%     
==========================================
  Files         187      189       +2     
  Lines       54698    55176     +478     
  Branches    54698    55176     +478     
==========================================
+ Hits        44434    44925     +491     
+ Misses       7771     7741      -30     
- Partials     2493     2510      +17     
| Flag | Coverage Δ |
|---|---|
| unittests | 81.42% <93.72%> (+0.18%) ⬆️ |


/// Convert reader to a stream.
///
/// The reader will be called in a background thread.
pub fn reader_to_stream(batches: Box<dyn RecordBatchReader + Send>) -> SendableRecordBatchStream {

wjones127 (Contributor, Author) commented:

I found that we sometimes didn't need to peek at the reader, so I split this into two functions: peek_reader_schema() and reader_to_stream().
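
As a rough illustration of why the split helps, here is a minimal std-only sketch. It is not Lance's implementation: `Batch`, `peek_schema`, and `reader_to_channel` are hypothetical stand-ins, with a plain iterator in place of a `RecordBatchReader` and an `mpsc` channel in place of a `SendableRecordBatchStream`. Peeking inspects the first batch without consuming it, while the conversion drains the reader on a background thread and needs no peek at all.

```rust
use std::iter::Peekable;
use std::sync::mpsc::{self, Receiver};
use std::thread;

/// Hypothetical stand-in for a record batch: a schema description plus rows.
#[derive(Debug, Clone, PartialEq)]
struct Batch {
    schema: String,
    rows: Vec<i32>,
}

/// Look at the first batch to learn the schema without consuming it.
/// Returns `None` when the reader is empty.
fn peek_schema<I: Iterator<Item = Batch>>(reader: &mut Peekable<I>) -> Option<String> {
    reader.peek().map(|batch| batch.schema.clone())
}

/// Drain the reader on a background thread and hand batches over a channel.
/// Callers that already know the schema can use this without peeking first.
fn reader_to_channel<I>(reader: I) -> Receiver<Batch>
where
    I: Iterator<Item = Batch> + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        for batch in reader {
            // Stop early if the receiving side has been dropped.
            if tx.send(batch).is_err() {
                break;
            }
        }
    });
    rx
}
```

A caller that needs the schema up front would wrap the reader in `peekable()`, call `peek_schema`, and then pass the same reader to `reader_to_channel`; the peeked batch is still delivered first.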

Comment on lines +3491 to 3531
let batches = vec![data.slice(0, 50), data.slice(50, 50)];
let mut dataset = TestDatasetGenerator::new(batches)
    .make_hostile(test_uri)
    .await;

wjones127 (Contributor, Author) commented:

This is the only test I've integrated the generator with so far. In follow-up PRs, we can use it in additional tests.
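
The thread does not spell out what makes a layout "hostile". The sketch below assumes two properties for illustration: field ids that are neither zero-based nor contiguous, and a fragment whose columns are spread over several data files. `DataFileLayout`, `hostile_field_ids`, and `split_into_files` are hypothetical names, not part of `TestDatasetGenerator`.

```rust
/// Hypothetical description of one data file: the field ids it stores,
/// in on-disk order.
#[derive(Debug, PartialEq)]
struct DataFileLayout {
    field_ids: Vec<i32>,
}

/// Assign ids that are neither zero-based nor contiguous. Field `i` gets
/// id `10 * (n - i)`, so ids also run in descending order relative to
/// schema order.
fn hostile_field_ids(num_fields: usize) -> Vec<i32> {
    (0..num_fields)
        .map(|i| 10 * (num_fields - i) as i32)
        .collect()
}

/// Spread the fields of one fragment over several data files, with at most
/// `fields_per_file` fields in each.
fn split_into_files(field_ids: &[i32], fields_per_file: usize) -> Vec<DataFileLayout> {
    assert!(fields_per_file > 0, "fields_per_file must be positive");
    field_ids
        .chunks(fields_per_file)
        .map(|chunk| DataFileLayout {
            field_ids: chunk.to_vec(),
        })
        .collect()
}
```

Code that silently assumes field id equals column position, or one file per fragment, would break on a layout like this, which is the kind of bug such a generator is meant to surface.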

wjones127 (Contributor, Author) commented:

The existing Fragment::create() API didn't allow passing an explicit lance::Schema, which we need in order to generate layouts with specific field ids, so I moved all of this over to a builder.
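
To illustrate the shape of that change, here is a minimal sketch of a builder with an optional explicit schema. All names (`Schema`, `FragmentWriteBuilder`, `resolve_schema`) are hypothetical and simplified; they are not the builder added in this PR. When no schema is supplied, ids are assumed to be assigned sequentially from the column order; when one is supplied, its field ids are used as given.

```rust
/// Hypothetical schema: (field id, field name) pairs.
#[derive(Debug, Clone, PartialEq)]
struct Schema {
    fields: Vec<(i32, String)>,
}

/// Hypothetical builder; the names are illustrative, not Lance's API.
struct FragmentWriteBuilder {
    column_names: Vec<String>,
    schema: Option<Schema>,
}

impl FragmentWriteBuilder {
    fn new(column_names: &[&str]) -> Self {
        Self {
            column_names: column_names.iter().map(|name| name.to_string()).collect(),
            schema: None,
        }
    }

    /// Supply an explicit schema so the caller controls the field ids.
    fn schema(mut self, schema: Schema) -> Self {
        self.schema = Some(schema);
        self
    }

    /// Use the explicit schema if one was given; otherwise infer one with
    /// sequential ids starting at 0.
    fn resolve_schema(&self) -> Schema {
        match &self.schema {
            Some(schema) => schema.clone(),
            None => Schema {
                fields: self
                    .column_names
                    .iter()
                    .enumerate()
                    .map(|(i, name)| (i as i32, name.clone()))
                    .collect(),
            },
        }
    }
}
```

A builder keeps the common call short while leaving room for optional settings such as the schema, without adding more positional parameters to a single `create()` function.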

@wjones127 wjones127 marked this pull request as ready for review April 22, 2024 18:58
@westonpace westonpace left a comment


This looks pretty cool.

Comment on lines +87 to +93
let mut writer = FileWriter::<ManifestDescribing>::try_new(
    &object_store,
    &full_path,
    schema,
    &Default::default(),
)
.await?;

westonpace (Contributor) commented:

This is no different from before, but I suppose we will need to update this path to use the v2 writer as well.

wjones127 (Contributor, Author) commented:

Good point. I've added a TODO to #1929.

@wjones127 wjones127 merged commit d0c313a into lancedb:main Apr 26, 2024
17 checks passed
3 participants