v0.8.19: Stats-aware scanning, faster GPU index training, and prefiltering bug fixes

@westonpace released this 06 Dec 14:18
· 866 commits to main since this release

New Features

  • feat: a tensor dataset that shares the same behavior as the Lance torch Dataset by @eddyxu in #1679
  • feat: add option to pass in precomputed row_id -> ivf partition mapping and compute partition on GPU by @chebbyChefNEQ in #1680
  • feat: add batch buffering and async loading to torch.LanceDataset by @chebbyChefNEQ in #1687
  • feat: optimized pushdown scanner by @wjones127 in #1328

Bug Fixes

  • fix: don't use scalar indices unless we are prefiltering by @westonpace in #1678
  • fix: Lance PyTorch dataset parameter to load with row_id by @eddyxu in #1676
  • fix: make sure to prefilter the flat portion of a combined knn by @westonpace in #1583

Performance Improvements

  • perf: use datafusion to shuffle index partition data by @wjones127 in #1645

Other Changes

  • chore: add utility to compute ground truth for benchmarks by @eddyxu in #1668
  • chore: add new python benchmarks for testing scalar indices by @westonpace in #1658

Full Changelog: v0.8.18...v0.8.19