
ENH: Use brotli compression by default if possible for nextstrain.org requests #214

Open · corneliusroemer opened this issue Aug 2, 2022 · 1 comment
Labels: enhancement (New feature or request)

corneliusroemer (Member) commented:

(This is probably not the right repo for this issue, but I couldn't find a better one; please move it if you know of a better place, @tsibley)

Context

I noticed that brotli compresses auspice trees much better than gzip, and checked whether we were using brotli when downloading resources from AWS. It turns out we don't.
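For anyone who wants to reproduce the check, here's a minimal sketch of inspecting the negotiated encoding (assuming the `requests` package; the dataset URL is only illustrative):

```python
# Ask for brotli explicitly and see which Content-Encoding comes back.
import requests

url = "https://nextstrain.org/charon/getDataset?prefix=ncov/open/global/6m"
resp = requests.get(url, headers={"Accept-Encoding": "br, gzip"})
# Prints "gzip" today; it would print "br" if brotli were supported end to end.
print(resp.headers.get("Content-Encoding"))
```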

Description

It would be great if we supported brotli as the default compression for trees downloaded from AWS (ncov-data, etc.).

Examples

Compression with brotli is much better; see the comparison below for the Nextclade reference build with 4k tips:

[Figure: file sizes of the same tree compressed with gzip vs. brotli]

Brotli compresses 4x better than gzip.
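As a reproducible stand-in for the screenshot, here's a quick local comparison (assuming the third-party `brotli` package and an Auspice JSON on disk; the filename is hypothetical):

```python
# Compare gzip vs. brotli on the same Auspice JSON, each at its max setting.
import gzip
import brotli  # pip install brotli

data = open("tree.json", "rb").read()  # any Auspice dataset JSON
gz = gzip.compress(data, compresslevel=9)
br = brotli.compress(data, quality=11)
print(f"raw:    {len(data):>12,} bytes")
print(f"gzip:   {len(gz):>12,} bytes")
print(f"brotli: {len(br):>12,} bytes  ({len(gz) / len(br):.1f}x smaller than gzip)")
```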

Possible solution

Apparently it's not too hard to enable brotli compression on the AWS end: https://aws.amazon.com/about-aws/whats-new/2020/09/cloudfront-brotli-compression/

We may also need to change the charon request headers, though; I'm not sure where those are set.

I think we should try to use brotli wherever possible, including for things like Auspice JSONs; it generally does better than gzip.

Finally, we could also consider using brotli compression for nextstrain remote download, though the need there is smaller, I think.

corneliusroemer added the enhancement (New feature or request) label on Aug 2, 2022
tsibley (Member) commented Aug 22, 2022:

A couple thoughts:

It should be possible to swap gzip for brotli, but we'll have to support a mix of the two for a long time (potentially ~forever) because it will be impossible to coordinate all sources.
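To make that concrete, every client (charon, the CLI, etc.) would need a decode path that accepts either encoding indefinitely. A minimal sketch of such a path, assuming the third-party `brotli` package (the function name is mine, not from any codebase):

```python
import gzip
import brotli  # pip install brotli

def decode_body(body: bytes, content_encoding: str | None) -> bytes:
    """Decode a fetched object based on its Content-Encoding header."""
    if content_encoding == "br":
        return brotli.decompress(body)
    if content_encoding == "gzip":
        return gzip.decompress(body)
    return body  # stored/served uncompressed
```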

While the compression benchmarks in isolation show clear benefits, it's not clear to me that swapping is worth it once the full effort is considered: the time to engineer the swap (plan, write, test, etc.), the ongoing complexity of supporting both, and the opportunity cost of working on this instead of something else.

We don't use CloudFront's dynamic compression, since not all access goes through CloudFront: a lot goes directly to S3. So we pre-compress and store compressed objects on S3. IIRC, CloudFront's dynamic compression also has (or at least used to have) fairly low upper limits on the uncompressed size it supports.
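In that pre-compression setup, adding brotli would mean writing objects with the matching metadata at upload time. A rough sketch of what that could look like with boto3 (bucket, key, and filename are illustrative, not our actual layout):

```python
import boto3
import brotli  # pip install brotli

s3 = boto3.client("s3")
data = open("global.json", "rb").read()  # hypothetical dataset file

# Store the pre-compressed object with Content-Encoding metadata so that
# S3 (and CloudFront in front of it) serves it with the right header.
s3.put_object(
    Bucket="example-data-bucket",
    Key="ncov_global.json",
    Body=brotli.compress(data, quality=11),
    ContentType="application/json",
    ContentEncoding="br",
)
```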
