This is an introductory graduate course on parallel computing.
Upon completion, you should:
- be able to design and analyze parallel algorithms for a variety of
problems and computational models,
- be familiar with the hardware and software organization of
high-performance parallel computing systems, and
- have experience implementing parallel applications
on high-performance computing systems, and be able to measure,
tune, and report on their performance.
Questions, answers, and discussions outside of lectures
We will use Piazza for asynchronous discussion outside of class. The service has already been paid for, so you do not need to make a contribution.
I have uploaded your email address of record with the registrar as your login. If you prefer to use another login,
I believe you can add one for our discussion group using this link:
http://piazza.com/unc/fall2021/comp633.
Be sure to sign up using a UNC email address.
(For Tue Nov 9) Skim the
MPI tutorial by Blaise Barney, LLNL.
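As a preview of what the tutorial covers, below is a minimal MPI program in C. It is an illustrative sketch only, not course-supplied code; the mpicc and mpirun commands in the comments assume a typical MPI installation such as the one on dogwood.

    /* hello_mpi.c -- minimal MPI example (illustrative sketch).
     * Compile:  mpicc -O2 hello_mpi.c -o hello_mpi
     * Run:      mpirun -np 4 ./hello_mpi
     */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char *argv[]) {
        int rank, size;
        MPI_Init(&argc, &argv);                /* start the MPI runtime */
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* this process's id */
        MPI_Comm_size(MPI_COMM_WORLD, &size);  /* number of processes */
        printf("hello from rank %d of %d\n", rank, size);
        MPI_Finalize();                        /* shut down cleanly */
        return 0;
    }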
(For Tue Oct 26)
Skim the Questions and Answers about BSP,
pp. 1-25. We will not use BSPLib directly; rather, we will use the BSP model together
with communication operations from the MPI library.
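To make that combination concrete, the following sketch expresses one BSP superstep, local computation followed by global communication, using an MPI collective. The per-process partial sum is an invented placeholder computation; the collective also acts as the barrier that ends the superstep.

    /* bsp_sum.c -- one BSP-style superstep using MPI (illustrative sketch).
     * A superstep is local computation followed by global communication;
     * the collective below also separates this superstep from the next.
     */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char *argv[]) {
        int rank;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        /* computation phase: each process produces a partial result
           (a placeholder value here) */
        double local = (double)(rank + 1);

        /* communication phase: combine the partial results everywhere */
        double global;
        MPI_Allreduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);

        if (rank == 0) printf("sum over all processes = %g\n", global);
        MPI_Finalize();
        return 0;
    }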
(For Tue Sep 21)
Look through section 8 of the OpenMP Tutorial.
(For Tue Sep 14) Look through the OpenMP Tutorial,
sections 3-5, and section 6 only up to the first exercise. Most examples are
shown in both C/C++ and Fortran, so read them in whichever language you prefer.
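For orientation, here is a small example in the spirit of sections 3-5: a parallel region with a worksharing loop and a reduction clause. It is a sketch, not course-supplied code; the compile line assumes gcc with -fopenmp, as on phaedra.

    /* dot.c -- OpenMP worksharing example (illustrative sketch).
     * Compile:  gcc -fopenmp -O2 dot.c -o dot
     */
    #include <omp.h>
    #include <stdio.h>

    #define N 1000000

    static double a[N], b[N];

    int main(void) {
        for (int i = 0; i < N; i++) { a[i] = 1.0; b[i] = 2.0; }

        double sum = 0.0;
        /* split loop iterations across threads; the reduction clause
           combines the per-thread partial sums at the end */
        #pragma omp parallel for reduction(+:sum)
        for (int i = 0; i < N; i++)
            sum += a[i] * b[i];

        printf("dot = %g using up to %d threads\n", sum, omp_get_max_threads());
        return 0;
    }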
phaedra is an Intel Xeon E5-2650v4 compute server dedicated to this class. It has 20 cores
and an attached Nvidia V100 accelerator. The OpenMP, Cilk, and Cuda programming models
are supported. Log in with your onyen (rather than your CS login).
longleaf is a research computing cluster with ~350 Intel Xeon E5-2643 nodes
providing 24 cores per node.
OpenMP and Cilk programming models are supported on individual nodes.
Compute jobs are submitted using Slurm.
dogwood is a research computing cluster with ~240 Intel Xeon E5-2699A nodes
providing 44 cores per node.
The MPI programming model is supported to coordinate and communicate among nodes.
A subset of the nodes have Intel Xeon Phi (KNL) accelerators.
The individual nodes support MPI, OpenMP, OpenACC, and Intel offload (Intel Xeon Phi)
programming models.
Compute jobs are submitted using Slurm.
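As a rough illustration, a Slurm batch script for an MPI job on a cluster like dogwood might look like the sketch below. The partition and module names and the resource limits are placeholders; consult the Research Computing documentation for the settings that actually apply.

    #!/bin/bash
    # Illustrative Slurm job script; all names and limits below are
    # placeholders, not the actual dogwood configuration.
    #SBATCH --job-name=mpi_test
    #SBATCH --nodes=2                 # two 44-core nodes
    #SBATCH --ntasks-per-node=44      # one MPI rank per core
    #SBATCH --time=00:10:00           # wall-clock limit
    #SBATCH --partition=general       # placeholder partition name

    module add openmpi                # placeholder module name
    mpirun ./hello_mpi                # e.g., the MPI example above

Submit the script with sbatch, e.g. "sbatch job.sh", and check its status with squeue.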
Available Tools
All students in COMP 633 can log in to phaedra.cs.unc.edu using their onyen.
Compilers
GNU compiler (gcc/g++ 11.2.0)
fully supports OpenMP 4.5; support for OpenMP 5.0/5.1 is partial
To use gcc/g++ on phaedra, make sure you have /usr/local/gcc/ on your PATH.
Intel C/C++ compiler (icc/icpc 2020) [Note: not available yet, use gcc for the time being]
supports OpenMP 4.5 with tasking and accelerator offload.
On phaedra, run "source /opt/intel/bin/compilervars.sh intel64" (bash) or
"source /opt/intel/bin/compilervars.csh intel64" (csh) to access the Intel compilers and tools.
On research computing clusters use "module add icc" to access Intel compilers.
OpenMP
Shared-memory parallel programming. Specification of the
OpenMP 4.5 API for C/C++.
For a more accessible introduction, see the tutorial for OpenMP 3.1 in the Bibliography below.
Accelerators
Nvidia GPUs: programmed using Cuda C (compute capability 7.0 for the V100 on phaedra).
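For a flavor of the model, here is a minimal Cuda C sketch that launches a vector-add kernel. The use of unified memory and the nvcc flags are illustrative choices, not course requirements; sm_70 is the architecture flag for Volta GPUs such as the V100.

    /* vadd.cu -- minimal Cuda C example (illustrative sketch).
     * Compile:  nvcc -arch=sm_70 -O2 vadd.cu -o vadd
     */
    #include <cstdio>

    __global__ void vadd(const float *a, const float *b, float *c, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;  /* global thread index */
        if (i < n) c[i] = a[i] + b[i];                  /* guard the tail */
    }

    int main() {
        const int n = 1 << 20;
        size_t bytes = n * sizeof(float);
        float *a, *b, *c;
        /* unified memory keeps the sketch short; cudaMalloc + cudaMemcpy
           is the more traditional host/device pattern */
        cudaMallocManaged(&a, bytes);
        cudaMallocManaged(&b, bytes);
        cudaMallocManaged(&c, bytes);
        for (int i = 0; i < n; i++) { a[i] = 1.0f; b[i] = 2.0f; }

        int threads = 256;
        int blocks = (n + threads - 1) / threads;   /* cover all n elements */
        vadd<<<blocks, threads>>>(a, b, c, n);
        cudaDeviceSynchronize();                    /* wait for the kernel */

        printf("c[0] = %g (expect 3)\n", c[0]);
        cudaFree(a); cudaFree(b); cudaFree(c);
        return 0;
    }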