Fix memory benchmarks for unexpected gl_SubgroupSize #44

Merged: 1 commit, Nov 27, 2023

Commits on Nov 24, 2023

  1. Fix memory benchmarks for unexpected gl_SubgroupSize

    Some Intel GPUs support a range of subgroup sizes: subgroupSize
    can be 32 while minSubgroupSize is smaller. In that case, unless
    the subgroup size is forced at pipeline creation time,
    gl_SubgroupSize reports 32 but the subgroup may actually contain
    only 8 invocations.

    In the memory benchmarks, use a bit count of the subgroup ballot
    to compute the dynamic (actual) size of the subgroup. The
    alternative is the much more recent (and less portable) subgroup
    size control extension (VK_EXT_subgroup_size_control).
    
    Fixes: #43
    dneto0 committed Nov 24, 2023
    Commit f44be0e
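
The ballot-bitcount idea from the commit message can be illustrated with a small CPU-side model. This is only a sketch in Python, not the shader code from this commit: the helper names `subgroup_ballot` and `subgroup_ballot_bit_count` are hypothetical stand-ins that mimic the GLSL built-ins `subgroupBallot(true)` and `subgroupBallotBitCount()` from `GL_KHR_shader_subgroup_ballot`, and the values 32 and 8 are the example figures from the commit message.

```python
def subgroup_ballot(active_lanes):
    """Model of a subgroup ballot: one bit per participating invocation,
    packed into four 32-bit words (like the uvec4 GLSL returns)."""
    ballot = [0, 0, 0, 0]
    for lane in active_lanes:
        ballot[lane // 32] |= 1 << (lane % 32)
    return ballot


def subgroup_ballot_bit_count(ballot):
    """Count the set bits across all four ballot words."""
    return sum(bin(word).count("1") for word in ballot)


# Assumed scenario from the commit message: the nominal size is 32,
# but only 8 invocations actually run in the subgroup.
NOMINAL_SUBGROUP_SIZE = 32   # what gl_SubgroupSize would report
actual_lanes = range(8)      # invocations really present

ballot = subgroup_ballot(actual_lanes)
dynamic_size = subgroup_ballot_bit_count(ballot)

print(NOMINAL_SUBGROUP_SIZE, dynamic_size)  # prints "32 8"
```

Because every invocation that reaches the ballot contributes exactly one bit, the bit count reflects the number of invocations actually participating, regardless of the nominal size. In a real shader the count also drops if some invocations are inactive at that point, so the ballot would normally be taken in uniform control flow.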