
Parallelizing matrix multiplication to invert lower triangular matrices, as well as the Floyd-Warshall algorithm to solve the all-pairs shortest paths problem. These projects achieved a speedup of roughly 500x over their serial counterparts for 2048x2048 matrices.
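As a rough illustration of the parallelization strategy described above, a Floyd-Warshall pass can launch one kernel per intermediate vertex k, with one thread per (i, j) pair. This is a minimal sketch of that idea; the kernel name, launch configuration, and data layout are my assumptions, not taken from this repository:

```cuda
#include <cuda_runtime.h>

// Relax all (i, j) pairs through intermediate vertex k.
// dist is an n x n row-major distance matrix in device memory.
__global__ void fw_relax(float *dist, int n, int k) {
    int i = blockIdx.y * blockDim.y + threadIdx.y;
    int j = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n && j < n) {
        float through_k = dist[i * n + k] + dist[k * n + j];
        if (through_k < dist[i * n + j])
            dist[i * n + j] = through_k;
    }
}

// Host side: the k-loop stays serial, but each of its O(n^2)
// relaxations runs in parallel on the GPU.
void floyd_warshall(float *d_dist, int n) {
    dim3 block(16, 16);
    dim3 grid((n + block.x - 1) / block.x, (n + block.y - 1) / block.y);
    for (int k = 0; k < n; ++k)
        fw_relax<<<grid, block>>>(d_dist, n, k);
    cudaDeviceSynchronize();
}
```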

christam96/CUDA


simple_examples/README
========================


Subdirectories 
---------------

1/  vector_add.cu  -> vector addition
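Element-wise vector addition is the canonical one-thread-per-element CUDA pattern. The sketch below is illustrative and not the repository's vector_add.cu:

```cuda
// c[i] = a[i] + b[i]; each thread handles one element.
__global__ void vecAdd(const float *a, const float *b, float *c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // global thread index
    if (i < n)                                      // guard the final partial block
        c[i] = a[i] + b[i];
}

// launch example: vecAdd<<<(n + 255) / 256, 256>>>(d_a, d_b, d_c, n);
```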

2/  matrix_mul.cu  -> matrix multiplication 
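A naive CUDA matrix multiply assigns one thread per output element, with each thread computing one dot product. This is a sketch of that approach under my own naming and layout assumptions (row-major, square matrices), not the repository's matrix_mul.cu:

```cuda
// C = A * B for n x n row-major matrices; one thread per C element.
__global__ void matMul(const float *A, const float *B, float *C, int n) {
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    if (row < n && col < n) {
        float sum = 0.0f;
        for (int k = 0; k < n; ++k)
            sum += A[row * n + k] * B[k * n + col];
        C[row * n + col] = sum;
    }
}
```

Production versions typically tile A and B through shared memory to cut global-memory traffic, but the version above shows the basic thread mapping.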

3/  moveArrays.cu 
    CUDA, Supercomputing for the Masses: Part 1

4/  incrementArray.cu
    CUDA, Supercomputing for the Masses: Part 2


5/  reverseArray_multiblock.cu
    CUDA, Supercomputing for the Masses: Part 3
    Error handling and global memory performance limitations

6/ arrayReversal_multiblock_fast.cu
   CUDA, Supercomputing for the Masses: Part 3
   Error handling and global memory performance limitations


7/  memset.cu -> Memory bandwidth test

8/  simpleCUDA.cu
    This code sample demonstrates a basic linear algebra
    operation in CUDA, single-precision axpy:
    y[i] = alpha*x[i] + y[i] for x, y in R^N and a scalar alpha
    http://mags.acm.org/queue/20080304/
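The axpy formula above maps directly onto one thread per element. A minimal sketch (mine, not the contents of simpleCUDA.cu):

```cuda
// Single-precision axpy: y[i] = alpha * x[i] + y[i].
__global__ void saxpy(int n, float alpha, const float *x, float *y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        y[i] = alpha * x[i] + y[i];
}
```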

9/ atomic2.cu -> compute the index of first nonzero entry of an array
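Finding the index of the first nonzero entry in parallel is commonly done by having every qualifying thread atomicMin a shared result. This sketch shows that technique; the names and the sentinel convention are my assumptions, not necessarily what atomic2.cu does:

```cuda
// Each thread checks one element; atomicMin keeps the smallest
// index whose entry is nonzero.
__global__ void firstNonzero(const int *a, int n, int *result) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n && a[i] != 0)
        atomicMin(result, i);
}

// Host side: initialize *result to n before the launch, so a
// final value of n means "no nonzero entry found".
```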
