feat(storage): add binary to train dictionary compression #1668

Closed
wants to merge 2 commits into from

@DvirYo-starkware DvirYo-starkware commented Feb 4, 2024

Pull Request type

Please check the type of change your PR introduces:

  • Bugfix
  • Feature
  • Code style update (formatting, renaming)
  • Refactoring (no functional changes, no API changes)
  • Build-related changes
  • Documentation content changes
  • Other (please describe):

What is the current behavior?

Issue Number: N/A

What is the new behavior?

Does this introduce a breaking change?

  • Yes
  • No

Other information



@DvirYo-starkware DvirYo-starkware left a comment


Reviewable status: 0 of 6 files reviewed, 1 unresolved discussion (waiting on @dan-starkware)

a discussion (no related file):
I need to make sure that the training works well, that the dictionary we get is a good one, and that the training scales to dozens of GB of training data.
To do that I need a lot of storage. The next time I have a good internet connection, I will sync a node locally and test this. For the first few thousand blocks, it works fine.
There is some weirdness in the training function. I opened an issue about that:
gyscos/zstd-rs#260
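
One way to check that a trained dictionary is "a good one" is to hold out part of the samples, train on the rest, and compare compression ratios on the held-out part with and without the dictionary. The sketch below is a minimal, std-only illustration of that measurement, not the code in this PR. The names `compression_ratio` and `split_holdout` are hypothetical, and the `compress` closure is a stand-in for a real compressor (for example, one built on the zstd crate with or without the trained dictionary).

```rust
/// Ratio of total raw bytes to total compressed bytes over `samples`.
/// `compress` is a stand-in (assumption): in practice it would wrap a
/// real compressor, with or without the trained dictionary.
fn compression_ratio<F>(samples: &[Vec<u8>], compress: F) -> f64
where
    F: Fn(&[u8]) -> Vec<u8>,
{
    let raw: usize = samples.iter().map(|s| s.len()).sum();
    let packed: usize = samples.iter().map(|s| compress(s.as_slice()).len()).sum();
    if packed == 0 {
        // Empty sample set (or empty outputs): no meaningful ratio.
        return 0.0;
    }
    raw as f64 / packed as f64
}

/// Splits samples into (training, held_out); every `k`-th sample is held out
/// so the dictionary is evaluated on data it was not trained on.
fn split_holdout(samples: Vec<Vec<u8>>, k: usize) -> (Vec<Vec<u8>>, Vec<Vec<u8>>) {
    assert!(k > 0, "k must be positive");
    let mut train = Vec::new();
    let mut held_out = Vec::new();
    for (i, sample) in samples.into_iter().enumerate() {
        if i % k == k - 1 {
            held_out.push(sample);
        } else {
            train.push(sample);
        }
    }
    (train, held_out)
}
```

Calling `compression_ratio` twice on the same held-out set, once with a plain compressor and once with a dictionary-backed one, gives two numbers to compare. A dictionary that does not raise the ratio on held-out data is probably not worth shipping.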



codecov bot commented Feb 4, 2024

Codecov Report

Attention: 126 lines in your changes are missing coverage. Please review.

Comparison is base (0937db1) 73.39% compared to head (1c01fd6) 72.55%.

❗ Current head 1c01fd6 differs from pull request most recent head 0b5ac5c. Consider uploading reports for the commit 0b5ac5c to get more accurate results

| Files | Patch % | Lines |
|---|---|---|
| ...orage/src/bin/train_compression_dictionary/main.rs | 26.31% | 126 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1668      +/-   ##
==========================================
- Coverage   73.39%   72.55%   -0.84%     
==========================================
  Files         119      120       +1     
  Lines       15781    15951     +170     
  Branches    15781    15951     +170     
==========================================
- Hits        11582    11574       -8     
- Misses       2423     2609     +186     
+ Partials     1776     1768       -8     


auto-merge was automatically disabled February 8, 2024 16:15

Merge queue setting changed

@github-actions github-actions bot locked and limited conversation to collaborators Feb 14, 2024