Releases · lancedb/lance

08 Apr 17:31

wjones127

v0.10.10

ad528e9

v0.10.10: Easier S3 config

Features

feat: easier and consistent S3 configuration by @wjones127 in #2147
- When using AWS S3, you no longer have to specify the region.
- S3 configuration is now picked up more consistently: explicit storage_options are always read first, before environment variables.
feat: add encoders/decoders for basic types by @westonpace in #2142
feat: initial reader/writer for the v2 format by @westonpace in #2153
feat: add take to the v2 schedulers by @westonpace in #2156
feat: add ef as query parameter by @BubbleCal in #2155
feat: fast l2 for uint8 by @eddyxu in #2161
perf: use fast u8 l2 route in hnsw beam search by @eddyxu in #2164
feat: make substrait optional to allow avoiding libgit2 by @westonpace in #2168
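
The S3 lookup order described for #2147 above (explicit storage_options first, environment variables second) can be illustrated with a small stand-alone sketch. This is not Lance's implementation: the function name resolve_s3_config and the ENV_FALLBACKS table are hypothetical, and pairing the storage_options keys with the standard AWS environment variable names is an assumption made for illustration only.

```python
import os
from typing import Mapping, Optional

# Hypothetical mapping from storage_options keys to environment variable names.
# The keys match the storage_options example in the v0.10.7 notes; the pairing
# with these AWS variable names is an assumption, not taken from Lance itself.
ENV_FALLBACKS = {
    "region": "AWS_REGION",
    "access_key_id": "AWS_ACCESS_KEY_ID",
    "secret_access_key": "AWS_SECRET_ACCESS_KEY",
    "session_token": "AWS_SESSION_TOKEN",
}


def resolve_s3_config(
    storage_options: Optional[Mapping[str, str]] = None,
    environ: Optional[Mapping[str, str]] = None,
) -> dict:
    """Merge settings so that explicit options always win over the environment."""
    explicit = dict(storage_options or {})
    env = os.environ if environ is None else environ
    resolved = {}
    for key, env_name in ENV_FALLBACKS.items():
        if key in explicit:
            resolved[key] = explicit[key]  # explicit value takes priority
        elif env_name in env:
            resolved[key] = env[env_name]  # otherwise fall back to the environment
        # If neither source has the key, it stays unset.
    return resolved
```

Passing the environment as a parameter keeps the sketch easy to test; calling it with environ=None reads the real process environment.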

Bug fixes

fix: set prost to minimum 0.12.2 by @albertlockett in #2167

Other changes

chore: apply clippy suggestions newly introduced in latest compiler by @westonpace in #2150

Full Changelog: v0.10.9...v0.10.10

Contributors

eddyxu, westonpace, and 3 other contributors

04 Apr 20:14

westonpace

v0.10.9

7c08c4f

v0.10.9 small robustness fix in cloud storage

What's Changed

feat: support IVF_HNSW_SQ in Python by @BubbleCal in #2149
feat: add outer retry loop to CloudObjectReader::size by @westonpace in #2151
docs: fix typo in lance-arrow by @rgbkrk in #2152

New Contributors

@rgbkrk made their first contribution in #2152

Full Changelog: v0.10.8...v0.10.9

Contributors

rgbkrk, westonpace, and BubbleCal

03 Apr 23:02

eddyxu

v0.10.8

6b30366

v0.10.8 bug fixes, support filter with count rows

What's Changed

fix: may get lower recall from HNSW with quantization by @BubbleCal in #2145
feat: add core traits for encoders & decoders in the v2 format by @westonpace in #2141
feat: support filter in count rows by @eddyxu in #2146
fix: variable file fragments by @wjones127 in #2148

Full Changelog: v0.10.7...v0.10.8

Contributors

eddyxu, westonpace, and 2 other contributors

02 Apr 21:54

wjones127

v0.10.7

34a35ba

v0.10.7: major bugfix for drop_columns, storage_options in Python

Bug fixes

❗ There was a bug in drop_columns(). If you have called it on your dataset, check whether the dataset was affected by running dataset.validate(). If this raises an error, call dataset.delete("false") to force a repair operation on the dataset. Afterward, the dataset will work as expected.

fix: remove data files with all dropped columns by @wjones127 in #2130
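
The check-and-repair steps from the note above can be wrapped in a small helper. This is a minimal sketch: the helper name validate_and_repair is hypothetical, it only relies on the validate() and delete() calls named in the note, and catching a broad Exception is an assumption because the note does not say which error type validate() raises.

```python
def validate_and_repair(dataset) -> bool:
    """Return True if a repair was attempted, False if the dataset was already valid.

    `dataset` is any object exposing validate() and delete(predicate), as the
    release note describes for a Lance dataset.
    """
    try:
        dataset.validate()
    except Exception:  # assumption: the exact error type is not documented here
        # Per the release note, a delete that matches no rows forces a repair.
        dataset.delete("false")
        # Validate again so that an unsuccessful repair still raises an error.
        dataset.validate()
        return True
    return False
```

With the library installed, the helper would be called on the object returned by lance.dataset() for the affected dataset.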

New Features

🚀 You can now configure the object storage connection by passing storage_options as a keyword argument to lance.dataset() and lance.write_dataset(). For example:

import lance
ds = lance.dataset(
    "s3://bucket/path",
    storage_options={
        "region": "us-east-1",
        "access_key_id": "my-access-key",
        "secret_access_key": "my-secret-key",
        "session_token": "my-session-token",
    }
)

feat(python): expose storage options by @wjones127 in #2131
feat: extend datagen to cover more types by @westonpace in #2138
feat: add a protobuf file describing encodings by @westonpace in #2137
feat: add a basic encodings crate by @westonpace in #2139
feat: support IVF_HNSW_SQ by @BubbleCal in #2136

Other Changes

chore: expose dynamic projection on fragment API by @chebbyChefNEQ in #2144

Full Changelog: v0.10.6...v0.10.7

Contributors

westonpace, wjones127, and 2 other contributors

01 Apr 16:22

westonpace

v0.10.6

92a3e9a

v0.10.6 Better fp16 perf in python, fix memory issues in scalar indices

What's Changed

feat: expose migration check by @wjones127 in #2074
chore: loading HNSW levels in parallel by @BubbleCal in #2093
chore: construct the dist table only once while searching by @BubbleCal in #2094
feat: enable fp16kernels in Mac and x86 Linux Python wheels by @wjones127 in #2098
feat: support IVF_HNSW index by @BubbleCal in #2080
perf: create ood dataset in bigann benchmark by @eddyxu in #2084
chore: fix the time unit in logs by @BubbleCal in #2101
chore: dynamically detect the schema while shuffling data by @BubbleCal in #2105
chore: expose find_partition method by @chebbyChefNEQ in #2106
fix: very low recall on IVF_HNSW by @BubbleCal in #2104
fix: load_partition return Error when partition_id out of bounds by @LeoReeYang in #2107
chore: expose query residualization by @chebbyChefNEQ in #2108
chore: move java core api to sub java module by @LuQQiu in #2115
docs: update image_to_tensor to to_tensor by @vipul-maheshwari in #2116
perf: independent parallel building for IVF_HNSW partitions by @BubbleCal in #2109
chore: add example of IVF_HNSW by @BubbleCal in #2112
perf: fully building HNSW partitions in parallel by @BubbleCal in #2117
perf: load HNSW levels in parallel by @BubbleCal in #2111
chore: drop data after copied to reduce memory footprint by @BubbleCal in #2120
fix: populate index cache at the end of loading by @chebbyChefNEQ in #2123
fix: use the fair spill pool instead of the greedy spill pool by @westonpace in #2126
feat: support create IVF_HNSW_PQ index in Python by @BubbleCal in #2127
feat: add scalar quantizer by @BubbleCal in #2134
fix: the HNSW index doesn't respect the refine factor by @BubbleCal in #2122
feat: add sq storage and transformer by @BubbleCal in #2135

New Contributors

@LeoReeYang made their first contribution in #2107
@vipul-maheshwari made their first contribution in #2116

Full Changelog: v0.10.5...v0.10.6

Contributors

eddyxu, westonpace, and 6 other contributors

20 Mar 20:51

westonpace

v0.10.5

117dbb4

v0.10.5 fix panic when reading datasets

Fixes a potential panic when reading a fragment that had multiple data files.

What's Changed

chore: expose internal index APIs by @chebbyChefNEQ in #2082
chore: expose prefilter traits by @chebbyChefNEQ in #2083
feat: add java fragment create by @LuQQiu in #2081
chore: enable codecov by @chebbyChefNEQ in #2088
perf: add a set of benchmark datasets by @eddyxu in #2090
perf: text2image benchmark by @eddyxu in #2091
docs: add llm training example by @tanaymeh in #2087
fix: only read the data file's fields from the page table and not the whole dataset's fields by @westonpace in #2095

New Contributors

@LuQQiu made their first contribution in #2081

Full Changelog: v0.10.4...v0.10.5

Contributors

eddyxu, westonpace, and 3 other contributors

16 Mar 04:37

westonpace

v0.10.4

6c5bc48

v0.10.4 Faster merge insert, fix for compaction race condition

What's Changed

fix: assume default rust toolchain as stable by @kerryeon in #2055
chore: extend JNI to get strings by @eddyxu in #2047
chore: bump datafusion version by @universalmind303 in #2035
chore: skip empty batch while chunking batches by @BubbleCal in #2026
chore: utility to convert RecordBatchStream to FFI_ArrowArrayStream by @eddyxu in #2065
feat: use a scalar index, if available, during a merge insert operation by @westonpace in #1987
chore: write HNSW partitions by @BubbleCal in #2056
chore: add struct for parameters of HNSW by @BubbleCal in #2057
docs: add llm dataset creation example by @tanaymeh in #2060
fix: update merge_insert code to use latest df version by @westonpace in #2071
feat: support filter while searching in HNSW by @BubbleCal in #2058
feat(java): fragment reader by @eddyxu in #2072
fix: force fragments to be stored in the manifest in id-order by @westonpace in #2075
chore: building IVF_HNSW index by @BubbleCal in #2066
fix: fix bug in indexed merge insert where new data could cause merge insert to panic by @westonpace in #2076
feat: emit warnings if f16 kernels not built by @westonpace in #2077

New Contributors

@kerryeon made their first contribution in #2055
@tanaymeh made their first contribution in #2060

Full Changelog: v0.10.3...v0.10.4

Contributors

eddyxu, westonpace, and 4 other contributors

12 Mar 14:40

westonpace

v0.10.3

1a950b4

v0.10.3 Temporal scalar indices and low RAM scalar index training

What's Changed

ci: fix compilers for release by @wjones127 in #2032
fix: use None not zero for limit default by @wjones127 in #2033
fix: stronger numeric guarantees for distance kernels by @wjones127 in #2013
feat(java): dataset get fragments by @eddyxu in #2034
feat: config plumbing for vector benchmark framework by @chebbyChefNEQ in #2036
fix: pin away from pyarrow 15.0.1 as it is causing segmentation fault in tests by @westonpace in #2045
chore: upgrade chrono and fix deprecation warnings by @eddyxu in #2048
feat: add support for scalar indices on temporal columns by @westonpace in #1968
feat: use out-of-core sort to train btree indices by @westonpace in #2043
chore: store ivf in arrow schema by @eddyxu in #2053
chore: on disk pq storage by @eddyxu in #2049

Full Changelog: v0.10.2...v0.10.3

Contributors

eddyxu, westonpace, and 2 other contributors

04 Mar 19:46

eddyxu

v0.10.2

d0df19e

v0.10.2 Cosine, HNSW, f16 bug fixes

What's Changed

chore(java): provide conversion utilities between jni objects and rust objects by @eddyxu in #2009
perf: avoid re-calculating the distances while building HNSW by @BubbleCal in #2010
chore: select less but enough neighbors to establish edges by @BubbleCal in #2011
fix: fp16 kernels computed in f32 by @wjones127 in #1990
chore: add fixture test for IVF index by @chebbyChefNEQ in #2014
perf: greedy search for finding entry point by @BubbleCal in #2012
chore: normalize transform by @eddyxu in #2017
refactor: refactor ivf into transformer by @eddyxu in #2023
docs: fix typo by @krlmlr in #2025
chore: fix cosine residual calculation by @eddyxu in #2015
feat: add recall report notebook by @chebbyChefNEQ in #2018
ci: fix nightly build on apple silicon GHA by @eddyxu in #2024
chore: update HNSW example by @BubbleCal in #2016

New Contributors

@krlmlr made their first contribution in #2025

Full Changelog: v0.10.1...v0.10.2

Contributors

eddyxu, krlmlr, and 3 other contributors

28 Feb 03:07

chebbyChefNEQ

v0.10.1

e9cd804

v0.10.1 jvm support poc and fix bug with selecting nested field

What's Changed

feat: implement java bindings by @beinan in #1928
chore: simplify output schema in scanner by @chebbyChefNEQ in #1999
fix: escape column names correctly by @chebbyChefNEQ in #2007
fix: crate publish by @chebbyChefNEQ in #2008

New Contributors

@beinan made their first contribution in #1928

Full Changelog: v0.10.0...v0.10.1

Contributors

beinan and chebbyChefNEQ
