sc3260s17/vectorization

Autovectorization

This simple code demonstrates autovectorization for a vector addition routine.

Building

Ensure that you have the Intel compiler available in your environment:

setpkgs -a intel_cluster_studio_compiler 

To build, run:

make

To delete the executable:

make clean

Running

The program accepts one command line argument, the length of the vector. For example:

./vec_add_icc 1000
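For orientation, here is a minimal sketch of what a vector addition routine with a command-line length might look like. It is not the repository's source: the names `parse_length` and `vec_add` are hypothetical, and the `restrict` qualifiers assume a C99 (or later) build in which the three arrays never overlap.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: parse the vector length from a string such as argv[1].
 * Returns 0 on success and -1 if the text is not a positive decimal integer. */
int parse_length(const char *text, size_t *n_out)
{
    char *end = NULL;
    unsigned long long value;

    /* Require a leading digit so that signs and whitespace are rejected. */
    if (text == NULL || text[0] < '0' || text[0] > '9') {
        return -1;
    }

    errno = 0;
    value = strtoull(text, &end, 10);
    if (errno != 0 || *end != '\0' || value == 0 || value > SIZE_MAX) {
        return -1;
    }

    *n_out = (size_t)value;
    return 0;
}

/* Element-wise addition c = a + b. Each iteration touches only index i,
 * and restrict promises the compiler that the arrays do not alias. */
void vec_add(const double *restrict a, const double *restrict b,
             double *restrict c, size_t n)
{
    assert(a != NULL && b != NULL && c != NULL);

    for (size_t i = 0; i < n; i++) {
        c[i] = a[i] + b[i];
    }
}
```

A `main` function built on this sketch would call `parse_length(argv[1], &n)`, allocate three arrays of `n` doubles with `malloc`, fill `a` and `b`, and then call `vec_add(a, b, c, n)`.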

A bash script is included that times the execution and passes in a vector length of 1000000000 by default. Just type:

bash run.sh

Exercises

  1. Build the binary with and without vectorization enabled. Building with level-three optimization (-O3) will ensure autovectorization is enabled. Verify that the for loop is vectorized by reading the vectorization report generated by the compiler. Time the execution of both versions of the binary for a vector length of 1000000000.

  2. Compare the walltime of the vectorized binary with that of a binary built without vectorization. Using optimization level zero (-O0) will ensure that vectorization is disabled.

  3. Convert the for loop to a while loop. Does it still vectorize?

  4. Move the actual vector addition step (c = a + b) into a second for loop. With level-three optimization enabled, how does the autovectorization behavior compare (read the vectorization report carefully)? How does the walltime compare?

  5. Revert to the original code. Now start the loop at i=1 (skipping the first element in the vector addition) and add this line to the end of the loop body:

c[i] = c[i-1] + 82.3;

What changes? Why?
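As a starting point for exercises 3 and 5, the loop variants might be written as in the sketch below. The function names `add_while` and `add_with_recurrence` are hypothetical, and the loop bodies assume a plain element-wise addition, which may differ from the repository's source. Build each variant and read the vectorization report to see how the compiler treats it.

```c
#include <assert.h>
#include <stddef.h>

/* Exercise 3 sketch: the same element-wise addition written as a while loop. */
void add_while(const double *a, const double *b, double *c, size_t n)
{
    size_t i = 0;

    assert(a != NULL && b != NULL && c != NULL);

    while (i < n) {
        c[i] = a[i] + b[i];
        i++;
    }
}

/* Exercise 5 sketch: the loop starts at i = 1, and the added line reads
 * c[i - 1], the value written by the previous iteration. As the exercise
 * is literally stated, the added line also overwrites the sum just stored
 * in c[i]. c[0] is left untouched and must be set by the caller. */
void add_with_recurrence(const double *a, const double *b, double *c, size_t n)
{
    assert(a != NULL && b != NULL && c != NULL);

    for (size_t i = 1; i < n; i++) {
        c[i] = a[i] + b[i];
        c[i] = c[i - 1] + 82.3;
    }
}
```

In the second function, the value of `c[i]` cannot be computed until `c[i - 1]` is known, which is the property to keep in mind when reading the compiler's report for that loop.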
