chunks index caching #8403
Code is a bit less pretty now, but more efficient. Also fewer stats.
Codecov Report
Attention: Patch coverage is
Additional details and impacted files

@@            Coverage Diff             @@
##           master    #8403      +/-   ##
==========================================
+ Coverage   81.44%   81.54%   +0.10%
==========================================
  Files          70       70
  Lines       12739    12791      +52
  Branches     2311     2318       +7
==========================================
+ Hits        10375    10431      +56
+ Misses       1707     1703       -4
  Partials      657      657
Is this (also) addressing the issue of |
Yes!
Even if a new client works with a repo for the first time, it will fetch the cached index from the repo and use it if it is valid.
borg 1.x had to do a chunks cache sync in that case, building per-archive chunks indexes from all archives and then merging them all into the main chunks index.
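To illustrate why the borg 1.x style sync was expensive, here is a minimal sketch of "build one index per archive, then merge them all". It is not borg code: `Counter` stands in for the real index structure, and the archive names, chunk ids, and both function names are hypothetical. The point is only that the work grows with the number of archives, which a cached index in the repo avoids.

```python
from collections import Counter


def build_archive_index(archive_chunk_ids):
    """Count how often each chunk id is referenced by a single archive."""
    return Counter(archive_chunk_ids)


def sync_chunks_index(archives):
    """Merge all per-archive indexes into one main index (chunk id -> count)."""
    main_index = Counter()
    for chunk_ids in archives.values():
        # One pass per archive: this is the part that scales badly.
        main_index.update(build_archive_index(chunk_ids))
    return main_index


# Hypothetical archives, each listing the chunk ids it references.
archives = {
    "monday": ["c1", "c2", "c3"],
    "tuesday": ["c1", "c2", "c4"],
}
main_index = sync_chunks_index(archives)
```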
borg compact now uses ChunkIndex (a specialized, memory-efficient data structure), so it needs less memory. It also saves that chunks index to cache/chunks in the repository.
When the chunks index is needed, borg first tries to load it from cache/chunks. If that fails, it falls back to building the chunks index via repository.list(), which can be rather slow, and immediately caches the resulting ChunkIndex in the repo.
borg check --repair currently just deletes the chunks cache, because it might have deleted some invalid chunks in the repo.
cache.close now saves the chunks index to cache/chunks in the repo if it was modified. Thus, borg create updates the cached chunks index with new chunks.
cache/chunks_hash can be used to validate cache/chunks (and also to validate or invalidate locally cached copies of it).
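The role of cache/chunks_hash can be sketched roughly as follows. This is an illustration under assumptions, not borg's implementation: plain dicts stand in for the repository store and the local cache directory, SHA-256 and the JSON serialization are arbitrary choices made here, and `serialize_index`, `index_hash`, and `load_validated` are hypothetical names.

```python
import hashlib
import json


def serialize_index(chunk_ids):
    # Hypothetical serialization (sorted JSON list); not borg's on-disk format.
    return json.dumps(sorted(chunk_ids)).encode()


def index_hash(data):
    # Assumption: SHA-256 is used here only for illustration.
    return hashlib.sha256(data).hexdigest()


def load_validated(repo_store, local_cache):
    """Return (data, source); source is 'local', 'repo', or 'miss'."""
    expected = repo_store.get("cache/chunks_hash")
    if expected is None:
        return None, "miss"
    # A local copy is only trusted if its hash matches the repo's hash.
    local = local_cache.get("chunks")
    if local is not None and index_hash(local) == expected:
        return local, "local"
    # Otherwise fetch the repo copy, validate it, and refresh the local copy.
    remote = repo_store.get("cache/chunks")
    if remote is not None and index_hash(remote) == expected:
        local_cache["chunks"] = remote
        return remote, "repo"
    return None, "miss"


data = serialize_index({"c1", "c2"})
repo_store = {"cache/chunks": data, "cache/chunks_hash": index_hash(data)}
local_cache = {}
first = load_validated(repo_store, local_cache)   # fetched from the repo
second = load_validated(repo_store, local_cache)  # served from the local copy
local_cache["chunks"] = b"stale"
third = load_validated(repo_store, local_cache)   # stale local copy is replaced
```

A "miss" result is where a caller would fall back to rebuilding the index from the repository listing.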
@mirko Merged this. I also found another issue: it was doing one full repo.list too many. A PR for that is incoming soon. So the master branch should be quite a bit faster now. Only check and compact are expected to always do the repository.list(), just to be on the safe side and not rely on caches.
borg compact uses ChunkIndex (a specialized, memory-efficient data structure), so it needs less memory now. Also, it saves that chunks index to cache/chunks in the repository.
When the chunks index is needed, borg first tries to load it from cache/chunks and only falls back to building the chunks index via repository.list() (which can be rather slow) if that fails.
borg check --repair currently just invalidates the chunks cache.
borg create updates the chunks cache.
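The lifecycle described above (try the cache, fall back to a full listing, invalidate on repair, update on create) might look roughly like this. It is a sketch only: `FakeRepo` is an in-memory stand-in and not borg's Repository class, a Python `set` replaces ChunkIndex, and `get_chunks_index`, `invalidate_chunks_cache`, and `add_chunk` are hypothetical helpers.

```python
class FakeRepo:
    """In-memory stand-in for a repository; not borg's Repository class."""

    def __init__(self, chunk_ids):
        self.objects = set(chunk_ids)  # chunks actually present in the repo
        self.store = {}                # holds the cached index under "cache/chunks"
        self.list_calls = 0            # counts how often the slow path runs

    def list(self):
        self.list_calls += 1
        return sorted(self.objects)


def get_chunks_index(repo):
    """Use the cached index if present, else rebuild it and cache it at once."""
    cached = repo.store.get("cache/chunks")
    if cached is not None:
        return set(cached)
    index = set(repo.list())  # slow fallback: full repository listing
    repo.store["cache/chunks"] = sorted(index)
    return index


def invalidate_chunks_cache(repo):
    """Roughly what a repair would do here: drop the cached index."""
    repo.store.pop("cache/chunks", None)


def add_chunk(repo, index, chunk_id):
    """Roughly what a create would do: store a chunk and save the updated index."""
    repo.objects.add(chunk_id)
    index.add(chunk_id)
    repo.store["cache/chunks"] = sorted(index)


repo = FakeRepo(["c1", "c2"])
index = get_chunks_index(repo)   # cache miss: one full listing
index = get_chunks_index(repo)   # cache hit: no listing
calls_after_hit = repo.list_calls
add_chunk(repo, index, "c3")     # cached index now includes the new chunk
invalidate_chunks_cache(repo)
rebuilt = get_chunks_index(repo) # cache was dropped: listing runs again
```

Counting `list_calls` makes the saving visible: the second lookup costs no listing, and only the invalidation forces a rebuild.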