v0.5.0: Deletion and performance improvements
What's Changed
New features
- Delete rows in a dataset with Dataset.delete()
- Add delete() method to Python API by @wjones127 in #953
- feat: handle deletes in count_rows, updater, merge by @wjones127 in #995
- feat: add feature flags to the dataset manifest by @wjones127 in #979
- You can now customize the log level to reduce verbose messages
- Negative numbers are now fully supported in SQL expressions
- [rust] Added support for minus operator for numeric literals by @trueutkarsh in #983
- openblas is now an optional dependency
- Add feature flag for opq by @TevinWang in #998
- Remove a directory (dataset) recursively. by @eddyxu in #1011
Bugfixes
- fix s3 get_range issues: 416, range not satisfiable by @LiWeiJie in #975
- fix: make sure we are always running with a single partition by @wjones127 in #977
- fix: handle NaN values in argmin by @wjones127 in #1000
Performance Improvements
- [Rust] Reduce arrow overhead during kmean training by @eddyxu in #990
- Zero copy during kmean membership computation by @eddyxu in #992
- [Rust] Improve kmean training performance by removing dynamic dispatch by @eddyxu in #996
Other
- upgrade to arrow 40 and datafusion 26 by @chebbyChefNEQ in #1005
- bump arrow versions on duckdb integration as well by @chebbyChefNEQ in #1007
New Contributors
- @trueutkarsh made their first contribution in #983
- @TevinWang made their first contribution in #998
Full Changelog: v0.4.21...v0.5.0