[Bug]: Nessie GC - Dropped table files are not getting deleted #9097

Open
shraddhagrawal opened this issue Jul 15, 2024 · 4 comments

@shraddhagrawal

What happened

Dropped table files are not getting deleted from storage

How to reproduce it

  1. Create a table (table1) and insert a few records across 3-4 commits.
  2. Drop the table created in step 1.
  3. Create a new table and insert a few records across 3-4 commits.
  4. Run the Nessie GC mark-live command with the default cutoff set to 2 commits.
  5. Run the expire command with defer-deletes, then run the list-deferred command to get the list of files. No file from table1 is listed. Since this table has already been dropped, all of its files, including data files, should be deleted.
  6. Run the deferred-deletes command. The table1 files are not deleted.
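
Steps 1 to 3 above could be scripted roughly as follows. This is a minimal sketch, not the reporter's actual script: the catalog name `nessie`, the namespace `gc_repro`, the column layout, and the helper `repro_statements` are all assumptions made for illustration. It only builds standard Spark SQL strings; running them requires a Spark session already configured with an Iceberg catalog backed by Nessie.

```python
def repro_statements(catalog="nessie", namespace="gc_repro", inserts=3):
    """Build Spark SQL statements for steps 1-3 of the reproduction.

    All names (catalog, namespace, table columns) are hypothetical.
    """

    def table_lifecycle(table):
        # One CREATE TABLE followed by `inserts` single-row INSERTs.
        name = f"{catalog}.{namespace}.{table}"
        yield f"CREATE TABLE {name} (id BIGINT, label STRING) USING iceberg"
        for i in range(inserts):
            yield f"INSERT INTO {name} VALUES ({i}, 'row-{i}')"

    statements = [f"CREATE NAMESPACE IF NOT EXISTS {catalog}.{namespace}"]
    statements += table_lifecycle("table1")                          # step 1
    statements.append(f"DROP TABLE {catalog}.{namespace}.table1")    # step 2
    statements += table_lifecycle("table2")                          # step 3
    return statements


if __name__ == "__main__":
    for statement in repro_statements():
        print(statement + ";")
```

Each statement can be passed to `spark.sql(...)` in an existing session. Steps 4 to 6 are then run with the Nessie GC tool as described in the list above.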

Nessie server type (docker/uber-jar/built from source) and version

docker

Client type (Ex: UI/Spark/pynessie ...) and version

Spark

Additional information

No response

@shraddhagrawal changed the title from "[Bug]: Nessie GC - Deleted table files are not getting deleted" to "[Bug]: Nessie GC - Dropped table files are not getting deleted" on Jul 15, 2024

@snazy
Member

snazy commented Jul 15, 2024

Outlined in the docs here

@shraddhagrawal
Author

shraddhagrawal commented Jul 15, 2024

@snazy Could you help me with the steps to replicate the second point?
"the content references (aka Iceberg snapshots) are no longer used. This information can be used to no longer expose the affected e.g. Iceberg snapshots in any table metadata."

@snazy
Member

snazy commented Jul 15, 2024

The docs list this under "Future Enhancements"; it is not implemented yet.

@loicalleyne

Has anyone found a workflow for deleting orphaned, completely unreferenced files from the object store?
