Fix memory benchmarks for unexpected gl_SubgroupSize #44
Some Intel GPUs have flexible subgroup sizes.
The reported subgroupSize can be 32 while minSubgroupSize is smaller. In that case, unless the subgroup size is explicitly forced at pipeline creation time, gl_SubgroupSize reports 32 even though the subgroup may actually contain only 8 invocations.
In the memory benchmarks, use a bitcount of the ballot to compute the dynamic (actual) size of the subgroup. The alternative is to use the much more recent (and less portable) subgroup size control extension.
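For reviewers who want the idea in isolation: in GLSL the count would come from something like `subgroupBallotBitCount(subgroupBallot(true))` (from `GL_KHR_shader_subgroup_ballot`). The sketch below is only a CPU-side illustration in C of that computation, not the shader code in this change. The `Ballot` struct and the helpers `make_ballot`, `popcount32` and `ballot_bit_count` are hypothetical names made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* CPU-side stand-in for the uvec4 that a subgroup ballot returns:
   four 32-bit words, one bit per invocation (up to 128 lanes).
   Hypothetical type, for illustration only. */
typedef struct { uint32_t word[4]; } Ballot;

/* Hypothetical helper: the ballot a subgroup would produce if
   `active_lanes` invocations all voted true. */
static Ballot make_ballot(unsigned active_lanes) {
    assert(active_lanes <= 128);
    Ballot b = {{0, 0, 0, 0}};
    for (unsigned lane = 0; lane < active_lanes; ++lane)
        b.word[lane / 32] |= (uint32_t)1 << (lane % 32);
    return b;
}

/* Count set bits in one 32-bit word (clears the lowest set bit each pass). */
static unsigned popcount32(uint32_t x) {
    unsigned n = 0;
    while (x) {
        x &= x - 1;
        ++n;
    }
    return n;
}

/* Dynamic subgroup size: the total number of set bits in the ballot. */
static unsigned ballot_bit_count(Ballot b) {
    unsigned total = 0;
    for (int i = 0; i < 4; ++i)
        total += popcount32(b.word[i]);
    return total;
}
```

With a nominal size of 32 but only 8 invocations present, `ballot_bit_count(make_ballot(8))` yields 8, which is the value the benchmarks should use instead of the reported 32. Note that a ballot counts active invocations, so this assumes it is taken where the whole subgroup is active.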
Fixes: #43