Timing CUDA Operations

This module was created by Joel Adams of Calvin College and was extended and adapted for CSInParallel by Jeffrey Lyman (JLyman@macalester.edu) in 2014.

The purpose of this document is to teach students the basics of CUDA programming and to give them an understanding of when it is appropriate to offload work to the GPU.

By completing vector addition, multiplication, square root, and squaring programs, students will learn when the GPU speedup outweighs the overhead of creating threads and copying memory.
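One way to make this trade-off concrete is a rough cost model. The symbols below are illustrative assumptions, not notation from the original module: T_cpu is the time to perform the operation on the CPU, T_h2d and T_d2h are the host-to-device and device-to-host copy times, T_launch is the overhead of launching the kernel and creating its threads, and T_kernel is the time the GPU spends on the computation itself.

```latex
T_{\text{gpu}} = T_{\text{h2d}} + T_{\text{launch}} + T_{\text{kernel}} + T_{\text{d2h}},
\qquad
\text{offloading pays off when } T_{\text{gpu}} < T_{\text{cpu}}
```

Under this sketch, a cheap per-element operation such as addition can lose on the GPU even when T_kernel is much smaller than T_cpu, because the two copy terms may dominate the total. Timing each term separately is what shows where the break-even point lies on a given machine.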


Prerequisites:

  • Some knowledge of C programming and of using makefiles.
  • The ability to create directories and use the command line in Unix.
  • Access to a computer with a reasonably capable CUDA-enabled GPU card.


This activity contains three parts, linked below. First, there is a short introduction to setting up CUDA code to run on a GPU. Next, you will run vector addition code on your GPU machine. Finally, you will experiment with different types of operations and large array sizes to determine when it is worthwhile to use a GPU for general-purpose computing.