
vendor agnostic #5

Open · wants to merge 5 commits into master
Conversation

@bjarthur (Collaborator) commented Apr 25, 2023

Fixes #1.

Not merged yet because benchmarks are slower by ~10%:

[Screenshot: benchmark results, 2023-04-25]

The large regression in batched_dot can be partially fixed by specifying CUDABackend(prefer_blocks=true), but that is not vendor agnostic. See https://discourse.julialang.org/t/kernelabstractions-get-backend-keyword-arguments/97895
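For readers following along, a minimal sketch of the two backend-selection paths being contrasted here; the array `x` is a placeholder, not code from this PR:

```julia
using KernelAbstractions

# Vendor-agnostic path: infer the backend from the array that holds the data
# (CUDABackend(), ROCBackend(), CPU(), ... depending on where `x` lives).
backend = get_backend(x)

# CUDA-only workaround that helps batched_dot: prefer launching more blocks.
# This requires a direct dependency on CUDA.jl, which defeats the vendor-agnostic goal.
using CUDA
backend = CUDABackend(prefer_blocks=true)
```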

@bjarthur (Collaborator, Author) commented:

Second pass at KernelAbstractions (KA):

[Screenshot: benchmark results, 2024-06-11]

Hard-coding the number of threads at 32 in the first (and only) dimension to maximize block utilization mostly alleviates the regression in batched_dot (sketch below).

See JuliaGPU/KernelAbstractions.jl#479.
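A minimal sketch of how a fixed workgroup size of 32 can be passed when instantiating a KernelAbstractions kernel while staying vendor agnostic; the kernel below is an illustrative placeholder, not the actual batched_dot kernel from this PR:

```julia
using KernelAbstractions

# Illustrative element-wise kernel; stands in for the real batched_dot kernel.
@kernel function scale!(y, a, @Const(x))
    i = @index(Global)
    @inbounds y[i] = a * x[i]
end

backend = get_backend(y)          # still vendor agnostic
kernel! = scale!(backend, 32)     # workgroup (thread-block) size fixed at 32
kernel!(y, a, x; ndrange=length(y))
KernelAbstractions.synchronize(backend)
```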

Successfully merging this pull request may close these issues: refactor to be vendor agnostic