
ASTC weights SIMD encoding #298

Open · wants to merge 1 commit into master
Conversation

@ronanbel commented May 6, 2022

ssse3 (i5-6300): 163 => 136 ms
arm (A53): 340 => 282 ms

I moved the block weight transform code into a single function: pack_astc_block_weights. You can enable/disable the SIMD code with the BASISD_ASTC_SIMD define. All the SIMD code is annotated.

Tested x86_64 on Windows, compiled with VS2019 and clang 11. Tested arm and arm64 on Android, compiled with the latest NDK (clang 11).

If needed, you can get in touch at:
[email protected]
[email protected]

@richgel999 (Contributor)
Thank you - this is great. I normally shy away from merging code that I can't easily maintain, but let me see what I can do. How much does this help encoding performance?

@richgel999 added the "enhancement" (New feature or request) label on May 12, 2022
Commit: fix previous issue + optimize unpack
@ronanbel (Author) commented May 12, 2022 via email
@ronanbel (Author) commented May 12, 2022 via email
@ronanbel (Author) commented May 12, 2022 via email

Review comment on lines +12580 to +12581:
uint16x8_t bitMask0 = vshlq_u16( vdupq_n_u16(1), bitNum0 ); // bitMask = (1U << n) - 1U
uint16x8_t bitMask1 = vshlq_u16( vdupq_n_u16(1), bitNum1 ); // bitMask = (1U << n) - 1U


This fails on GCC without -flax-vector-conversions:

../subprojects/basis_universal/transcoder/basisu_transcoder.cpp: In function 'void basist::pack_astc_block_weights(uint8_t*, const uint8_t*, int, int)':
../subprojects/basis_universal/transcoder/basisu_transcoder.cpp:12010:80: note: use '-flax-vector-conversions' to permit conversions between vectors with differing element types or numbers of subparts
12010 |                                 uint8x8_t       rev8 = vqmovn_u16( vcombine_u16( rev8lohi, vdup_n_u8(0) ) );                    //      8bits in 4 u8 (clear lower 32)
      |                                                                    ~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~
../subprojects/basis_universal/transcoder/basisu_transcoder.cpp:12010:101: error: cannot convert 'uint8x8_t' to 'uint16x4_t'
12010 |                                 uint8x8_t       rev8 = vqmovn_u16( vcombine_u16( rev8lohi, vdup_n_u8(0) ) );                    //      8bits in 4 u8 (clear lower 32)
      |                                                                                            ~~~~~~~~~^~~
      |                                                                                                     |
      |                                                                                                     uint8x8_t

An explicit cast fixes it:

Suggested change:
- uint16x8_t bitMask0 = vshlq_u16( vdupq_n_u16(1), bitNum0 ); // bitMask = (1U << n) - 1U
- uint16x8_t bitMask1 = vshlq_u16( vdupq_n_u16(1), bitNum1 ); // bitMask = (1U << n) - 1U
+ uint16x8_t bitMask0 = vshlq_u16( vdupq_n_u16(1), (int16x8_t)bitNum0 ); // bitMask = (1U << n) - 1U
+ uint16x8_t bitMask1 = vshlq_u16( vdupq_n_u16(1), (int16x8_t)bitNum1 ); // bitMask = (1U << n) - 1U

@ronanbel (Author) commented Apr 26, 2023 via email

Labels: enhancement (New feature or request)
3 participants