christam96/CUDA
===============

Parallelizing matrix multiplication to invert lower triangular matrices, as well as the Floyd-Warshall algorithm to solve the all-pairs shortest paths problem. These projects saw a speedup factor of 500x over their serial counterparts for 2048x2048 matrices.
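The Floyd-Warshall parallelization described above can be sketched roughly as follows: the relaxation over a fixed intermediate vertex k has no dependencies between (i, j) pairs, so each thread can update one matrix entry while the host loops over k. This is a minimal illustrative sketch, not the repository's actual code; the names `fw_step`, `floyd_warshall`, `dist`, and the 16x16 block shape are assumptions.

```cuda
// One Floyd-Warshall relaxation step for a fixed intermediate vertex k.
// Each thread owns one (i, j) entry of the n x n distance matrix.
__global__ void fw_step(float *dist, int n, int k) {
    int i = blockIdx.y * blockDim.y + threadIdx.y;
    int j = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n && j < n) {
        float through_k = dist[i * n + k] + dist[k * n + j];
        if (through_k < dist[i * n + j])
            dist[i * n + j] = through_k;
    }
}

// Host side: d_dist must already live in device memory.
// The k loop stays serial (each step depends on the previous one),
// so the kernel is launched once per intermediate vertex.
void floyd_warshall(float *d_dist, int n) {
    dim3 block(16, 16);
    dim3 grid((n + block.x - 1) / block.x, (n + block.y - 1) / block.y);
    for (int k = 0; k < n; ++k)
        fw_step<<<grid, block>>>(d_dist, n, k);
    cudaDeviceSynchronize();
}
```

For a 2048x2048 matrix this launches 2048 kernels of roughly four million threads each, which is where the large speedup over a triple-nested serial loop comes from.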
simple_examples/README
========================

Subdirectories
---------------

1/ vector_add.cu  -> vector addition
2/ matrix_mul.cu  -> matrix multiplication
3/ moveArrays.cu
   CUDA, Supercomputing for the Masses: Part 1
4/ incrementArray.cu
   CUDA, Supercomputing for the Masses: Part 2
5/ reverseArray_multiblock.cu
   CUDA, Supercomputing for the Masses: Part 3
   Error handling and global memory performance limitations
6/ arrayReversal_multiblock_fast.cu
   CUDA, Supercomputing for the Masses: Part 3
   Error handling and global memory performance limitations
7/ memset.cu      -> memory bandwidth test
8/ simpleCUDA.cu
   Demonstrates a simple linear algebra operation using CUDA:
   single-precision axpy, y[i] = alpha*x[i] + y[i] for x, y in R^N
   and a scalar alpha.
   http://mags.acm.org/queue/20080304/
9/ atomic2.cu     -> compute the index of the first nonzero entry of an array
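The axpy operation listed under 8/ maps naturally onto one thread per element. A minimal sketch of that kernel, using the formula from the README (the name `saxpy` and the 256-thread block size are illustrative assumptions, not necessarily what simpleCUDA.cu uses):

```cuda
// Single-precision axpy: y[i] = alpha * x[i] + y[i], one thread per element.
__global__ void saxpy(int n, float alpha, const float *x, float *y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)  // guard: the last block may extend past the end of the arrays
        y[i] = alpha * x[i] + y[i];
}

// Launch with enough blocks of 256 threads to cover all n elements,
// where d_x and d_y point to device memory:
//   saxpy<<<(n + 255) / 256, 256>>>(n, alpha, d_x, d_y);
```

The bounds check is needed because the grid size is rounded up to a whole number of blocks, so the last block usually contains a few threads with no element to process.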