LanceDB - Remove Orphaned Chunks #1620

Open
wants to merge 145 commits into
base: devel

Commits on Jul 16, 2024

  1. Add tests for LanceDB chunking and merging functionality

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Jul 16, 2024
    68e26a0
  2. Merge branch 'refs/heads/devel' into 1587-lancedb-support-efficient-update-strategy-for-chunked-documents
    Pipboyguy committed Jul 16, 2024
    7c2d031

Commits on Jul 17, 2024

  1. Merge branch 'refs/heads/devel' into 1587-lancedb-support-efficient-update-strategy-for-chunked-documents
    Pipboyguy committed Jul 17, 2024
    4c555e9

Commits on Jul 18, 2024

  1. Merge branch 'refs/heads/devel' into 1587-lancedb-support-efficient-update-strategy-for-chunked-documents
    Pipboyguy committed Jul 18, 2024
    6c734d7
  2. Add TSplitter type alias for LanceDB document splitting function

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Jul 18, 2024
    900c4fa
  3. Refine typing for chunks

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Jul 18, 2024
    16230a7

Commits on Jul 19, 2024

  1. Merge branch 'refs/heads/devel' into 1587-lancedb-support-efficient-update-strategy-for-chunked-documents
    Pipboyguy committed Jul 19, 2024
    d3aeda2
  2. 3f7a82f
  3. 1dda1d5

Commits on Jul 21, 2024

  1. 48e14ab
  2. 32fe174
  3. Refactor LanceDB client and tests for improved readability and type safety
    
    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Jul 21, 2024
    d974962
  4. Linting

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Jul 21, 2024
    e6cdf5d

Commits on Jul 23, 2024

  1. Merge branch 'refs/heads/devel' into 1587-lancedb-support-efficient-update-strategy-for-chunked-documents
    Pipboyguy committed Jul 23, 2024
    c7c2bc6

Commits on Jul 24, 2024

  1. Merge branch 'refs/heads/devel' into 1587-lancedb-support-efficient-update-strategy-for-chunked-documents
    Pipboyguy committed Jul 24, 2024
    bf3c3d8

Commits on Jul 25, 2024

  1. Merge branch 'refs/heads/devel' into 1587-lancedb-support-efficient-update-strategy-for-chunked-documents
    Pipboyguy committed Jul 25, 2024
    9c11964

Commits on Jul 27, 2024

  1. a60737a

Commits on Jul 29, 2024

  1. Merge remote-tracking branch 'origin/devel' into 1587-lancedb-support-efficient-update-strategy-for-chunked-documents
    Pipboyguy committed Jul 29, 2024
    cfe1a6d
  2. Remove resolved comments

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Jul 29, 2024
    518a507
  3. c10bd73

Commits on Jul 30, 2024

  1. Merge branch 'refs/heads/devel' into 1587-lancedb-support-efficient-update-strategy-for-chunked-documents
    Pipboyguy committed Jul 30, 2024
    24ada84
  2. 5b3acb1
  3. Add test for removing orphaned records in LanceDB

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Jul 30, 2024
    cf6d86a
  4. d338586
  5. Set test pipeline as dev mode

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Jul 30, 2024
    2376c6a
  6. 7f6f1cd

Commits on Jul 31, 2024

  1. Merge branch 'refs/heads/devel' into 1587-lancedb-support-efficient-update-strategy-for-chunked-documents
    Pipboyguy committed Jul 31, 2024
    b840f8b
  2. Add FollowupJob trait to LoadLanceDBJob

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Jul 31, 2024
    c276211
  3. Fix file type

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Jul 31, 2024
    dbfd5af
  4. Fix file typing

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Jul 31, 2024
    257fbde
  5. Add test for removing orphaned records in LanceDB root table

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Jul 31, 2024
    0502ddf
  6. 2363b51

Commits on Aug 1, 2024

  1. Merge branch 'refs/heads/devel' into 1587-lancedb-support-efficient-update-strategy-for-chunked-documents
    Pipboyguy committed Aug 1, 2024
    a296c77
  2. Use doc id hint for top level tables

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Aug 1, 2024
    6b363d1
  3. Only join on join columns for orphan removal job

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Aug 1, 2024
    aac7647
  4. Add ollama to supported embedding providers and test orphaned record removal with embeddings
    
    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Aug 1, 2024
    e33b7cf

Commits on Aug 2, 2024

  1. Merge branch 'refs/heads/devel' into 1587-lancedb-support-efficient-update-strategy-for-chunked-documents
    Pipboyguy committed Aug 2, 2024
    afa7573
  2. f2913e9
  3. Formatting

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Aug 2, 2024
    ffe6584
  4. Set default file size to 128MB

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Aug 2, 2024
    0368018

Commits on Aug 3, 2024

  1. Merge branch 'refs/heads/devel' into 1587-lancedb-support-efficient-update-strategy-for-chunked-documents
    Pipboyguy committed Aug 3, 2024
    29fa7fd
  2. Only use parquet loader file formats

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Aug 3, 2024
    02704d5

Commits on Aug 4, 2024

  1. Import pyarrow.parquet

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Aug 4, 2024
    eae056a
  2. dc20a55
  3. Update LanceDB client to use more efficient batch processing methods on loading for Load Jobs
    
    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Aug 4, 2024
    6ed540b

Commits on Aug 5, 2024

  1. Refactor unique identifier handling for LanceDB tables

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Aug 5, 2024
    0a9682f
  2. Optimize UUID column generation for LanceDB tables

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Aug 5, 2024
    a99224a
  3. Refactor LanceDBClient to use string type hints for Table

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Aug 5, 2024
    895331b
  4. Minor refactor

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Aug 5, 2024
    a881e7a
  5. Implement efficient schema update with Nullability support

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Aug 5, 2024
    7f245e2
  6. Optimize orphaned chunks removal for large datasets

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Aug 5, 2024
    4fc73dd

Commits on Aug 6, 2024

  1. Projection pushdown

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Aug 6, 2024
    9378f50
  2. Format

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Aug 6, 2024
    9b14583
  3. e21f61b

Commits on Aug 7, 2024

  1. Add recommended file size for LanceDB destination

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Aug 7, 2024
    9725d0e
  2. Improve comment clarity for projection push-down in LanceDB

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Aug 7, 2024
    5238c11
  3. Merge branch 'devel' into 1587-lancedb-support-efficient-update-strategy-for-chunked-documents
    
    # Conflicts:
    #	dlt/destinations/impl/lancedb/lancedb_client.py
    Pipboyguy committed Aug 7, 2024
    8e74815
  4. Update to new load interface

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Aug 7, 2024
    c8f7468

Commits on Aug 8, 2024

  1. Remove unnecessary LanceDBLoadJob attributes

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Aug 8, 2024
    af56191
  2. Change instance attributes to run method as variables

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Aug 8, 2024
    7e33011

Commits on Aug 9, 2024

  1. Merge branch 'devel' into 1587-lancedb-support-efficient-update-strategy-for-chunked-documents
    Pipboyguy committed Aug 9, 2024
    e24e961
  2. Schedule follow up reference job

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Aug 9, 2024
    ee7dd02

Commits on Aug 10, 2024

  1. Add follow up lancedb remove orphan job skeleton

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Aug 10, 2024
    df498ab
  2. Write empty follow up file

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Aug 10, 2024
    c08f1ba
  3. Write parquet

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Aug 10, 2024
    f9f94e3
  4. Add support for reference file format in LanceDB destination

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Aug 10, 2024
    b374b0b
  5. Handle parent table name resolution if it doesn't exist in Lance db remove orphan job
    
    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Aug 10, 2024
    2ed3301

Commits on Aug 12, 2024

  1. Merge branch 'devel' into 1587-lancedb-support-efficient-update-strategy-for-chunked-documents
    Pipboyguy committed Aug 12, 2024
    cb0ba1f

Commits on Aug 14, 2024

  1. Merge branch 'devel' into 1587-lancedb-support-efficient-update-strategy-for-chunked-documents
    Pipboyguy committed Aug 14, 2024
    99ac100

Commits on Aug 15, 2024

  1. Refactor specialised orphan follow up job back to reference job

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Aug 15, 2024
    0694859

Commits on Aug 16, 2024

  1. Merge branch 'devel' into 1587-lancedb-support-efficient-update-strategy-for-chunked-documents
    Pipboyguy committed Aug 16, 2024
    ad3b750

Commits on Aug 17, 2024

  1. Merge branch 'devel' into 1587-lancedb-support-efficient-update-strategy-for-chunked-documents
    Pipboyguy committed Aug 17, 2024
    4701c6e
  2. Refactor orphan removal for chunked documents

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Aug 17, 2024
    537a2be

Commits on Aug 18, 2024

  1. Fix dlt system table check for name instead of object

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Aug 18, 2024
    3d25306
  2. Implement staging methods

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Aug 18, 2024
    2ee8da1

Commits on Aug 19, 2024

  1. Override staging client methods

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Aug 19, 2024
    2947d55
  2. Docs

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Aug 19, 2024
    ea5914c
  3. Merge branch 'devel' into 1587-lancedb-support-efficient-update-strategy-for-chunked-documents
    Pipboyguy committed Aug 19, 2024
    2e7daed
  4. 5018adf

Commits on Aug 20, 2024

  1. Override staging client methods

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Aug 20, 2024
    1fcce51
  2. Delete with inserts

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Aug 20, 2024
    8849f11
  3. Keep with batch reader

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Aug 20, 2024
    c7098fd

Commits on Aug 21, 2024

  1. 92ba767

Commits on Aug 22, 2024

  1. Merge branch 'devel' into 1587-lancedb-support-efficient-update-strategy-for-chunked-documents
    Pipboyguy committed Aug 22, 2024
    abd9b01
  2. Remove Lancedb client's staging implementation

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Aug 22, 2024
    d8ddcae
  3. Insert in memory arrow table. This will be optimized

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Aug 22, 2024
    17137a6

Commits on Aug 26, 2024

  1. 1b0b7bb
  2. Rename classes to the new job implementation classes

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Aug 26, 2024
    53d896a
  3. Use namedtuple for table chain to improve readability

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Aug 26, 2024
    26ba0f5
  4. Remove orphans by loading all ancestor IDs simultaneously

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Aug 26, 2024
    06e04d9
  5. Fix doc_id adapter

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Aug 26, 2024
    470315e
  6. Fix doc_id adapter

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Aug 26, 2024
    43eb5b4
  7. Revert to previous

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Aug 26, 2024
    40a5e73

Commits on Aug 27, 2024

  1. 04c8489
  2. 8cd6003
  3. Remove doc_id hint

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Aug 27, 2024
    dad103e
  4. Infer merge key if not supplied from provided primary key

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Aug 27, 2024
    15a0cf6
  5. Remove unused utility functions

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Aug 27, 2024
    e9462e3
  6. Remove LanceDB doc ID hints and use schema normalizer

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Aug 27, 2024
    8af98d7
  7. LanceDB writes strange code

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Aug 27, 2024
    4195bb4
  8. Minor Formatting

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Aug 27, 2024
    2573d3a

Commits on Aug 28, 2024

  1. 19e9366
  2. Support compound primary and merge keys

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Aug 28, 2024
    86c198c
  3. Remove old comment

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Aug 28, 2024
    aa03930

Commits on Aug 29, 2024

  1. fb72c03
  2. - Change default vector column name to "vector" to conform with lancedb standard
    
    - Add search tests with tantivy as search engine
    
    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Aug 29, 2024
    d1e4173
  3. Format and fix linting

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Aug 29, 2024
    613f5bc
  4. Add custom embedding function registration test

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Aug 29, 2024
    703c4a8
  5. Spawn process in test to make sure registry can be deserialized from arrow files
    
    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Aug 29, 2024
    c07c8fc
  6. Simplify null string handling

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Aug 29, 2024
    8afa7e1

Commits on Aug 30, 2024

  1. 2395432

Commits on Aug 31, 2024

  1. 2507d22
  2. Update default vector column name in docs

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Aug 31, 2024
    9a347e6

Commits on Sep 2, 2024

  1. 4eda894
  2. Set remove_orphans flag to False on tests that don't require it

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Sep 2, 2024
    c0bedb7
  3. Merge branch '1765-lancedb-destination-cant-query-generated-tables' into remove-lancedb-doc-id-hints
    
    # Conflicts:
    #	dlt/destinations/impl/lancedb/lancedb_client.py
    #	docs/website/docs/dlt-ecosystem/destinations/lancedb.md
    #	tests/load/lancedb/test_pipeline.py
    #	tests/load/lancedb/utils.py
    Pipboyguy committed Sep 2, 2024
    99a4f44
  4. Implement starter arrow string placeholder function

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Sep 2, 2024
    5f0d620
  5. Add test for empty arrow string element vectorised replacement utility function
    
    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Sep 2, 2024
    b7f3076
  6. Handle NULL values in addition to empty strings in arrow substitution method
    
    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Sep 2, 2024
    e3a4ed0
  7. 4ec894f
  8. Format

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Sep 2, 2024
    9866874
  9. Bump pyarrow version

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Sep 2, 2024
    7099d5f
  10. Use pa.nulls instead of [None]*len

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Sep 2, 2024
    1c770d1
  11. Update tests

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Sep 2, 2024
    0b11ac7
  12. Invert remove orphans flag

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Sep 2, 2024
    e81736e
  13. Implement root table orphan deletion, only integer doc_ids

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Sep 2, 2024
    36abec7
  14. Cater for string ids as well in doc_id removal process

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Sep 2, 2024
    5ceeda9
  15. Fix test with wrong primary key

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Sep 2, 2024
    a8f9c3b
  16. Just send list of ids as is. don't pc.compute on client end

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Sep 2, 2024
    b3baf93
  17. Extract schema matching into utils

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Sep 2, 2024
    589071c
  18. Add utils

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Sep 2, 2024
    a86a13a
  19. Pass all tests

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Sep 2, 2024
    0eba25e
  20. Minor format and cleanup

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Sep 2, 2024
    2b7f4c6
  21. Merge branch 'remove-lancedb-doc-id-hints' into 1587-lancedb-support-efficient-update-strategy-for-chunked-documents

    # Conflicts:
    #	dlt/destinations/impl/lancedb/lancedb_adapter.py
    #	tests/load/lancedb/test_merge.py
    Pipboyguy committed Sep 2, 2024
    105b388

Commits on Sep 3, 2024

  1. Docs

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Sep 3, 2024
    ea36b00

Commits on Sep 5, 2024

  1. Merge branch 'devel' into 1587-lancedb-support-efficient-update-strategy-for-chunked-documents

    # Conflicts:
    #	dlt/destinations/impl/lancedb/lancedb_client.py
    #	docs/website/docs/dlt-ecosystem/destinations/lancedb.md
    #	poetry.lock
    #	tests/load/lancedb/test_pipeline.py
    #	tests/load/lancedb/utils.py
    Pipboyguy committed Sep 5, 2024
    2010722
  2. Amend replace test to test with large number of records to catch race conditions with replace disposition

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Sep 5, 2024
    81eaea9
  3. Fix replace race conditions by delegating truncation to dlt

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Sep 5, 2024
    f6d243a

Commits on Sep 6, 2024

  1. Merge branch 'devel' into 1587-lancedb-support-efficient-update-strategy-for-chunked-documents

    Pipboyguy committed Sep 6, 2024
    3521975

Commits on Sep 8, 2024

  1. Merge branch 'devel' into 1587-lancedb-support-efficient-update-strategy-for-chunked-documents

    # Conflicts:
    #	poetry.lock
    Pipboyguy committed Sep 8, 2024
    e280001
  2. Update lock file

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Sep 8, 2024
    f32d4cd

Commits on Sep 24, 2024

  1. Merge branch 'devel' into 1587-lancedb-support-efficient-update-strategy-for-chunked-documents

    # Conflicts:
    #	dlt/destinations/impl/lancedb/factory.py
    #	dlt/destinations/impl/lancedb/lancedb_client.py
    #	dlt/destinations/impl/lancedb/schema.py
    #	poetry.lock
    #	tests/load/lancedb/test_pipeline.py
    Pipboyguy committed Sep 24, 2024
    a804e03
  2. Refactor type mapping and schema handling in LanceDB client

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Sep 24, 2024
    7bd2e9c
  3. Change 'complex' column type to 'json' in LanceDB client

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Sep 24, 2024
    d8a6b75
  4. update lock file

    Signed-off-by: Marcel Coetzee <[email protected]>
    Pipboyguy committed Sep 24, 2024
    a5a1657