Online Data-Reduction / Compression #3

Open
besnardjb opened this issue Feb 17, 2022 · 1 comment
Labels: enhancement (New feature or request), help wanted (Extra attention is needed)

Comments

@besnardjb

Tracing

Currently, the trace data is kept entirely in memory, so memory might be exhausted. You may consider:

  1. Compression
  2. Flush to disk (when needed)
  3. Periodic aggregation
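To make the third option concrete, here is a minimal sketch of periodic aggregation, assuming it means folding buffered events into per-function summaries once the buffer reaches a limit. The `Event`, `Summary` and `Aggregator` types and their fields are hypothetical and are not taken from Interpol.

```rust
use std::collections::HashMap;

/// Hypothetical trace event; the fields are illustrative, not Interpol's.
#[derive(Debug, Clone)]
pub struct Event {
    pub function: String,
    pub duration_ns: u64,
}

/// Running summary kept per function in place of the raw events.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct Summary {
    pub calls: u64,
    pub total_ns: u64,
}

/// Hypothetical in-memory store: raw events plus aggregated summaries.
#[derive(Default)]
pub struct Aggregator {
    pending: Vec<Event>,
    summaries: HashMap<String, Summary>,
}

impl Aggregator {
    /// Record an event; aggregate once the buffer holds `limit` events.
    pub fn record(&mut self, event: Event, limit: usize) {
        self.pending.push(event);
        if self.pending.len() >= limit {
            self.aggregate();
        }
    }

    /// Fold every pending event into its function's summary and empty the buffer.
    pub fn aggregate(&mut self) {
        for event in self.pending.drain(..) {
            let summary = self.summaries.entry(event.function).or_default();
            summary.calls += 1;
            summary.total_ns += event.duration_ns;
        }
    }

    /// Number of raw events still held in memory.
    pub fn pending_len(&self) -> usize {
        self.pending.len()
    }

    /// Aggregated summary for one function, if any event was folded in.
    pub fn summary(&self, function: &str) -> Option<&Summary> {
        self.summaries.get(function)
    }
}
```

The trade-off is that individual events are lost once aggregated, so this only fits use cases where per-function totals are enough.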

As a second step:

You currently write multiple files at the end. Would it be possible to write only one, sorting the data in parallel?

@dssgabriel
Contributor

Compression

Interpol now outputs JSON, a more compact but less readable format. This enables a compression rate of up to 30%. Ideas for reducing trace sizes further include compressing the JSON output (zlib?) or changing the output format altogether.
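As one illustration of what "changing the output format altogether" could look like, here is a minimal sketch of a fixed-width binary record, using only the standard library. The `Event` fields, the 24-byte layout and the little-endian choice are all assumptions made for the example and do not reflect Interpol's actual schema.

```rust
/// Hypothetical trace event; the fields are assumptions, not Interpol's schema.
#[derive(Debug, Clone, PartialEq)]
pub struct Event {
    pub rank: u32,
    pub function_id: u32,
    pub start_ns: u64,
    pub duration_ns: u64,
}

/// Size in bytes of one encoded event: 4 + 4 + 8 + 8.
pub const RECORD_SIZE: usize = 24;

/// Append the event to `out` as a fixed-width little-endian record.
pub fn encode(event: &Event, out: &mut Vec<u8>) {
    out.extend_from_slice(&event.rank.to_le_bytes());
    out.extend_from_slice(&event.function_id.to_le_bytes());
    out.extend_from_slice(&event.start_ns.to_le_bytes());
    out.extend_from_slice(&event.duration_ns.to_le_bytes());
}

fn read_u32(bytes: &[u8]) -> u32 {
    let mut buf = [0u8; 4];
    buf.copy_from_slice(&bytes[..4]);
    u32::from_le_bytes(buf)
}

fn read_u64(bytes: &[u8]) -> u64 {
    let mut buf = [0u8; 8];
    buf.copy_from_slice(&bytes[..8]);
    u64::from_le_bytes(buf)
}

/// Decode one record; returns `None` if fewer than `RECORD_SIZE` bytes are given.
pub fn decode(bytes: &[u8]) -> Option<Event> {
    if bytes.len() < RECORD_SIZE {
        return None;
    }
    Some(Event {
        rank: read_u32(&bytes[0..4]),
        function_id: read_u32(&bytes[4..8]),
        start_ns: read_u64(&bytes[8..16]),
        duration_ns: read_u64(&bytes[16..24]),
    })
}

/// The same event as compact JSON text, for size comparison only.
pub fn to_json(event: &Event) -> String {
    format!(
        "{{\"rank\":{},\"function_id\":{},\"start_ns\":{},\"duration_ns\":{}}}",
        event.rank, event.function_id, event.start_ns, event.duration_ns
    )
}
```

With this layout every event takes 24 bytes, whereas the JSON form repeats the key names in each record. The cost is that the file is no longer human-readable and needs a versioned reader.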

Flush to disk

Flush to disk has not been implemented yet; we still need to think about how to do it efficiently.
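As a starting point for discussion, here is a minimal sketch of a size-triggered flush, assuming events are already serialized to one string per event and that a simple byte threshold decides when to spill. `SpillBuffer` and its methods are hypothetical names, not existing Interpol code.

```rust
use std::fs::File;
use std::io::{self, BufWriter, Write};
use std::path::Path;

/// Holds serialized events in memory and spills them to a file once the
/// buffered size reaches `max_bytes`.
pub struct SpillBuffer {
    lines: Vec<String>,
    buffered_bytes: usize,
    max_bytes: usize,
    writer: BufWriter<File>,
}

impl SpillBuffer {
    /// Create (or truncate) the spill file at `path`.
    pub fn create(path: &Path, max_bytes: usize) -> io::Result<Self> {
        let file = File::create(path)?;
        Ok(SpillBuffer {
            lines: Vec::new(),
            buffered_bytes: 0,
            max_bytes,
            writer: BufWriter::new(file),
        })
    }

    /// Buffer one serialized event; flush when the threshold is reached.
    pub fn push(&mut self, line: String) -> io::Result<()> {
        self.buffered_bytes += line.len();
        self.lines.push(line);
        if self.buffered_bytes >= self.max_bytes {
            self.flush()?;
        }
        Ok(())
    }

    /// Write every buffered event as one line and empty the in-memory buffer.
    pub fn flush(&mut self) -> io::Result<()> {
        for line in self.lines.drain(..) {
            self.writer.write_all(line.as_bytes())?;
            self.writer.write_all(b"\n")?;
        }
        self.buffered_bytes = 0;
        self.writer.flush()
    }

    /// Number of events currently held in memory.
    pub fn buffered(&self) -> usize {
        self.lines.len()
    }
}
```

A caller would still need to call `flush` once at the end of the run to write any remaining events. Writing one event per line also means the spill file is not a single JSON document, so the final single-file output would have to be assembled from it afterwards.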

Parallel sorting

Traces are now sorted in parallel using the rayon crate and written to a single JSON file (implemented in #14).
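For readers who want to see the general idea, here is a dependency-free sketch of sorting per-rank traces in parallel and merging them into one ordered sequence. It is not the code from #14: it swaps rayon for `std::thread::scope` so that it compiles without external crates, and the `Event` type and `sort_and_merge` function are hypothetical. With rayon, the per-slice sort could instead use one of its parallel slice methods such as `par_sort_unstable_by_key`.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;
use std::thread;

/// Hypothetical trace event; the fields are illustrative, not Interpol's.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Event {
    pub rank: u32,
    pub start_ns: u64,
}

/// Sort each rank's events on its own thread, then merge them into one
/// globally ordered trace.
pub fn sort_and_merge(mut per_rank: Vec<Vec<Event>>) -> Vec<Event> {
    // Scoped threads can borrow disjoint elements of `per_rank` mutably;
    // all of them are joined before `thread::scope` returns.
    thread::scope(|scope| {
        for events in per_rank.iter_mut() {
            scope.spawn(move || events.sort_by_key(|e| e.start_ns));
        }
    });

    // K-way merge: the min-heap holds (start_ns, source index, position)
    // for the next unmerged event of each non-empty source.
    let mut heap = BinaryHeap::new();
    for (src, events) in per_rank.iter().enumerate() {
        if let Some(first) = events.first() {
            heap.push(Reverse((first.start_ns, src, 0usize)));
        }
    }

    let total: usize = per_rank.iter().map(Vec::len).sum();
    let mut merged = Vec::with_capacity(total);
    while let Some(Reverse((_, src, pos))) = heap.pop() {
        merged.push(per_rank[src][pos].clone());
        if let Some(next) = per_rank[src].get(pos + 1) {
            heap.push(Reverse((next.start_ns, src, pos + 1)));
        }
    }
    merged
}
```

This spawns one thread per rank, which is fine for a sketch but would need a thread pool (which is what rayon provides) for large rank counts.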

@dssgabriel added the enhancement and help wanted labels on May 13, 2022
@dssgabriel self-assigned this on May 13, 2022