Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use boxed slice for lookup table to avoid stack overflow #5212

Merged
merged 2 commits into from
Oct 23, 2024

Conversation

YgorSouza
Copy link
Contributor

In practice it would not have caused problems since the 255 case is
handled separately, but we may as well fix it.
Copy link

github-actions bot commented Oct 2, 2024

Preview available at https://egui-pr-preview.github.io/pr/5212-box-color32-lut
Note that it might take a couple seconds for the update to show up after the preview_build workflow has completed.

@YgorSouza
Copy link
Contributor Author

@lucasmerlin can you test?

This is slightly slower according to the benchmark (about 25%), but it should hopefully avoid stack problems. I'm actually not sure where the compiler was putting the array before. I heard something about objects always being created in the stack and then moved to the heap or to where they are supposed to be (although sometimes this is optimized away), so I went with a boxed slice rather than a boxed array to be sure. I think this is called the "placement new" issue?

@YgorSouza YgorSouza marked this pull request as ready for review October 2, 2024 17:00
Copy link
Collaborator

@lucasmerlin lucasmerlin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, just tried it and it fixes the problem!

The following also worked, and I think it's a bit cleaner (and closer to your original implementation):

    static LOOKUP_TABLE: OnceLock<Box<[u8; 256 * 256]>> = OnceLock::new();
    let lut = LOOKUP_TABLE.get_or_init(|| {
        use crate::{gamma_u8_from_linear_f32, linear_f32_from_gamma_u8};
        Box::new(core::array::from_fn(|i| {
            let [value, alpha] = (i as u16).to_ne_bytes();
            let value_lin = linear_f32_from_gamma_u8(value);
            let alpha_lin = linear_f32_from_linear_u8(alpha);
            gamma_u8_from_linear_f32(value_lin * alpha_lin)
        }))
    });

Is there a reason you don't use a [u8; 256 * 256] anymore?

@YgorSouza
Copy link
Contributor Author

Is there a reason you don't use a [u8; 256 * 256] anymore?

I wasn't sure it would work, as I mentioned above. I thought the array would be constructed in the stack and then moved to the heap without optimizations, so it would be the same problem in the end.

@lucasmerlin
Copy link
Collaborator

I wasn't sure it would work, as I mentioned above. I thought the array would be constructed in the stack and then moved to the heap without optimizations, so it would be the same problem in the end.

Ah I see, makes sense to me! In my tests it also worked with a boxed slice but I guess the way you implemented should be the safer one.

@lucasmerlin lucasmerlin added bug Something is broken ecolor labels Oct 7, 2024
@emilk emilk added this to the Next Patch Release milestone Oct 23, 2024
@emilk emilk merged commit f75a235 into emilk:master Oct 23, 2024
24 of 25 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something is broken ecolor
Projects
None yet
Development

Successfully merging this pull request may close these issues.

EXC_BAD_ACCESS in Color32::from_rgba_unmultiplied on iOS
3 participants