
Parallelizing matrix multiplication to invert lower triangular matrices, as well as the Floyd-Warshall algorithm to solve the all-pairs shortest paths problem. These projects achieved a speedup of roughly 500x over their serial counterparts for 2048x2048 matrices.
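As a rough illustration of the parallelization strategy described above, a Floyd-Warshall pass can launch one kernel per intermediate vertex k, with one thread per (i, j) pair. This is a minimal sketch of that idea; the kernel name, launch configuration, and data layout are my assumptions, not taken from this repository:

```cuda
#include <cuda_runtime.h>

// Relax all (i, j) pairs through intermediate vertex k.
// dist is an n x n row-major distance matrix in device memory.
__global__ void fw_relax(float *dist, int n, int k) {
    int i = blockIdx.y * blockDim.y + threadIdx.y;
    int j = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n && j < n) {
        float through_k = dist[i * n + k] + dist[k * n + j];
        if (through_k < dist[i * n + j])
            dist[i * n + j] = through_k;
    }
}

// Host side: the k-loop stays serial, but each of its O(n^2)
// relaxations runs in parallel on the GPU.
void floyd_warshall(float *d_dist, int n) {
    dim3 block(16, 16);
    dim3 grid((n + block.x - 1) / block.x, (n + block.y - 1) / block.y);
    for (int k = 0; k < n; ++k)
        fw_relax<<<grid, block>>>(d_dist, n, k);
    cudaDeviceSynchronize();
}
```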

christam96/CUDA


simple_examples/README
========================


Subdirectories 
---------------

1/  vector_add.cu  -> vector addition
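Element-wise vector addition is the canonical one-thread-per-element CUDA pattern. The sketch below is illustrative and not the repository's vector_add.cu:

```cuda
// c[i] = a[i] + b[i]; each thread handles one element.
__global__ void vecAdd(const float *a, const float *b, float *c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // global thread index
    if (i < n)                                      // guard the final partial block
        c[i] = a[i] + b[i];
}

// launch example: vecAdd<<<(n + 255) / 256, 256>>>(d_a, d_b, d_c, n);
```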

2/  matrix_mul.cu  -> matrix multiplication 
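A naive CUDA matrix multiply assigns one thread per output element, with each thread computing one dot product. This is a sketch of that approach under my own naming and layout assumptions (row-major, square matrices), not the repository's matrix_mul.cu:

```cuda
// C = A * B for n x n row-major matrices; one thread per C element.
__global__ void matMul(const float *A, const float *B, float *C, int n) {
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    if (row < n && col < n) {
        float sum = 0.0f;
        for (int k = 0; k < n; ++k)
            sum += A[row * n + k] * B[k * n + col];
        C[row * n + col] = sum;
    }
}
```

Production versions typically tile A and B through shared memory to cut global-memory traffic, but the version above shows the basic thread mapping.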

3/  moveArrays.cu 
    CUDA, Supercomputing for the Masses: Part 1

4/  incrementArray.cu
    CUDA, Supercomputing for the Masses: Part 2


5/  reverseArray_multiblock.cu
    CUDA, Supercomputing for the Masses: Part 3
    Error handling and global memory performance limitations

6/ arrayReversal_multiblock_fast.cu
   CUDA, Supercomputing for the Masses: Part 3
   Error handling and global memory performance limitations


7/  memset.cu -> Memory bandwidth test

8/  simpleCUDA.cu
    This code sample demonstrates a basic linear algebra
    operation in CUDA, single-precision axpy:
    y[i] = alpha*x[i] + y[i] for x, y in R^N and a scalar alpha
    http://mags.acm.org/queue/20080304/
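The axpy formula above maps directly onto one thread per element. A minimal sketch (mine, not the contents of simpleCUDA.cu):

```cuda
// Single-precision axpy: y[i] = alpha * x[i] + y[i].
__global__ void saxpy(int n, float alpha, const float *x, float *y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        y[i] = alpha * x[i] + y[i];
}
```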

9/ atomic2.cu -> compute the index of first nonzero entry of an array
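Finding the index of the first nonzero entry in parallel is commonly done by having every qualifying thread atomicMin a shared result. This sketch shows that technique; the names and the sentinel convention are my assumptions, not necessarily what atomic2.cu does:

```cuda
// Each thread checks one element; atomicMin keeps the smallest
// index whose entry is nonzero.
__global__ void firstNonzero(const int *a, int n, int *result) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n && a[i] != 0)
        atomicMin(result, i);
}

// Host side: initialize *result to n before the launch, so a
// final value of n means "no nonzero entry found".
```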
