-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use boxed slice for lookup table to avoid stack overflow #5212
Conversation
In practice it would not have caused problems since the 255 case is handled separately, but we may as well fix it.
Preview available at https://egui-pr-preview.github.io/pr/5212-box-color32-lut |
@lucasmerlin can you test? This is slightly slower according to the benchmark (about 25%), but it should hopefully avoid stack problems. I'm actually not sure where the compiler was putting the array before. I heard something about objects always being created in the stack and then moved to the heap or to where they are supposed to be (although sometimes this is optimized away), so I went with a boxed slice rather than a boxed array to be sure. I think this is called the "placement new" issue? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome, just tried it and it fixes the problem!
The following also worked, and I think it's a bit cleaner (and closer to your original implementation):
static LOOKUP_TABLE: OnceLock<Box<[u8; 256 * 256]>> = OnceLock::new();
let lut = LOOKUP_TABLE.get_or_init(|| {
use crate::{gamma_u8_from_linear_f32, linear_f32_from_gamma_u8};
Box::new(core::array::from_fn(|i| {
let [value, alpha] = (i as u16).to_ne_bytes();
let value_lin = linear_f32_from_gamma_u8(value);
let alpha_lin = linear_f32_from_linear_u8(alpha);
gamma_u8_from_linear_f32(value_lin * alpha_lin)
}))
});
Is there a reason you don't use a [u8; 256 * 256]
anymore?
I wasn't sure it would work, as I mentioned above. I thought the array would be constructed in the stack and then moved to the heap without optimizations, so it would be the same problem in the end. |
Ah I see, makes sense to me! In my tests it also worked with a boxed slice but I guess the way you implemented should be the safer one. |
Color32::from_rgba_unmultiplied
on iOS #5210