Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce wasted memory in ChunkedSliceOutput #23707

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

chenyangfb
Copy link
Contributor

@chenyangfb chenyangfb commented Sep 24, 2024

Description

Currently the minimal buffer size of ChunkedSliceOutput is 8kb, this leads to lots of wasted memory with large number of output buffer, each with a few bytes. For example, with 10k output stream with 10 bytes, the total size of ChunkedSliceOutput is 80MB while the logical size is 100kb.

This PR reduce wasted memory in ChunkedSliceOutput by allowing smaller buffer size in ChunkedSliceOutput. The default value is still 8kb.

Impact

Improve memory usage and reduce wasted memory

Test Plan

Tested with Spark workload writing flat map column.

Control use 8kB chunk buffer size while test use 1kB chunk buffer.
Total memory usage from ChunkedSliceOutput reduced from 2GB to 700MB and memory usage per object reduced from ~8kB to ~2kB.

Before
Screenshot 2024-09-24 at 2 40 09 PM
After
Screenshot 2024-09-24 at 2 40 24 PM

Contributor checklist

  • Please make sure your submission complies with our development, formatting, commit message, and attribution guidelines.
  • PR description addresses the issue accurately and concisely. If the change is non-trivial, a GitHub Issue is referenced.
  • Documented new properties (with its default value), SQL syntax, functions, or other functionality.
  • If release notes are required, they follow the release notes guidelines.
  • Adequate tests were added if applicable.
  • CI passed.

Release Notes

General change
Allow setting chunk size in ChunkedSliceOutput, it can reduce memory in the cases of large number of small output buffer

@chenyangfb chenyangfb force-pushed the orc_output_buffer_chunk_size branch 2 times, most recently from 530f9ee to 2c1426b Compare September 24, 2024 00:54
@chenyangfb chenyangfb marked this pull request as ready for review September 24, 2024 21:47
@chenyangfb chenyangfb requested review from sdruzkin and a team as code owners September 24, 2024 21:47
@sdruzkin
Copy link
Collaborator

sdruzkin commented Sep 25, 2024

  1. How does it impact writer performance? Smaller initial chunks will result in additional 2-3 mem copy operations.
  2. The change you made will result in smaller initial size, but what would happen after a few batches are fed into the writer? In my understanding there will be no impact on reducing the memory consumption, except the perf regression because of additional mem copy.
  3. Won't it make more sense to set the initial size to 0, but then grow to 4k at the first write?

Please update the PR Release Notes.

@chenyangfb
Copy link
Contributor Author

chenyangfb commented Sep 25, 2024

Thanks @sdruzkin
This diff added support for setting chunk size, but the default size is the same as before (8kB).
Smaller initial chunks such as 1kB is aimed for cases with large number of output stream such as flat map with 10k output stream and typical bytes per output stream range from 10 bytes to a few kB.

So for those cases, since the typical bytes per output stream is small (but not zero), the writer performance impact is minimal but the memory saving is big and set the initial size to 0 won't reduce memory.

I have a few other changes for setting the config in Spark and propagate it to presto-orc, not sure if I should share it here.

I plan to do more Spark shadow testing to measure the performance impact with different chunk size.

buffer = new byte[currentSize];
currentSize = min(multiplyExact(currentSize, 2), maxChunkSize);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Revert this unnecessary change

@@ -53,10 +53,9 @@ public class OrcOutputBuffer
private static final int INSTANCE_SIZE = ClassLayout.parseClass(OrcOutputBuffer.class).instanceSize();
private static final int PAGE_HEADER_SIZE = 3; // ORC spec 3 byte header
private static final int INITIAL_BUFFER_SIZE = 256;
private static final int MINIMUM_OUTPUT_BUFFER_CHUNK_SIZE = 4 * 1024;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused. You are claiming that the old min chunk size was 8kb, but here in the Orc compression output slice it's 4kb. Can you please clarify this?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants