Introduction to Parallel Programming with OpenCL
Duration: 5 Days
Course Background
Open Computing Language (OpenCL) is a framework for writing programs that run across heterogeneous platforms, which may combine central processing units (CPUs), graphics processing units (GPUs), digital signal processors (DSPs), field-programmable gate arrays (FPGAs) and other processors. OpenCL defines a language (based on C99) for writing kernels (functions that execute on OpenCL devices), and application programming interfaces (APIs) that are used to define and then control the platforms. OpenCL supports parallel computing through both task-based and data-based parallelism. OpenCL is a widely used open standard. OpenCL programs can be compiled into application-specific processors running on FPGAs. OpenCL can also be used as an intermediate language for directives-based programming using OpenACC. This course aims to provide an intensive practical introduction to OpenCL.
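To make the idea of a kernel and of data-based parallelism concrete, the following is a minimal sketch in plain C rather than real OpenCL host code. The names `vec_add_work_item` and `run_vec_add` are hypothetical and are not part of the OpenCL API. The sketch assumes a one-dimensional index space and runs the work-items one after another in a loop, whereas an OpenCL runtime would schedule them in parallel on the device.

```c
#include <assert.h>
#include <stddef.h>

/* For comparison, an OpenCL C kernel for the same task would look
 * roughly like this (shown as a comment only, not compiled here):
 *
 *   __kernel void vec_add(__global const float *a,
 *                         __global const float *b,
 *                         __global float *c)
 *   {
 *       size_t i = get_global_id(0);
 *       c[i] = a[i] + b[i];
 *   }
 */

/* Hypothetical stand-in for the kernel body: the work done by ONE
 * work-item, identified by its global index gid. */
void vec_add_work_item(size_t gid, const float *a, const float *b, float *c)
{
    c[gid] = a[gid] + b[gid];
}

/* Hypothetical stand-in for launching the kernel over a 1-D index
 * space of global_size work-items. This emulation is sequential;
 * a real device would run many work-items at the same time. */
void run_vec_add(size_t global_size, const float *a, const float *b, float *c)
{
    assert(a != NULL && b != NULL && c != NULL);
    for (size_t gid = 0; gid < global_size; ++gid) {
        vec_add_work_item(gid, a, b, c);
    }
}
```

Calling `run_vec_add(4, a, b, c)` on three four-element arrays applies the same function to every index, which is the essence of data-based parallelism: one small function, many independent data elements.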
Course Prerequisites and Target Audience
Attendees should be experienced C/C++ programmers with a sound knowledge of:
- Pointers and pointer operations
- Computing using multidimensional arrays
- File I/O and file and directory manipulation
- Data structures and classes
- Collection classes such as linked lists and vectors
- Multithreading
- Memory allocation and memory management
- Computer architectures and instruction sets
- Basic techniques for code profiling and code optimisation
- Code debugging
Course Outline
- Overview of GPU Computing
- Introduction to the OpenCL Framework
- Data-Parallel Architectures and the OpenCL Programming Model
- Work-item Cooperation and the OpenCL Memory Model
- Getting started with OpenCL: buffer allocation and buffer transfers, simple kernels, and local and constant memory
- Task Concurrency and Synchronization
- Programming asynchronous operations
- Debugging GPU programs
- Performance
  - Latency and the OpenCL + GPU Execution Model
  - Arithmetic Optimizations
  - Memory Optimizations
  - Identifying and avoiding memory bank conflicts
- Case studies
  - Pattern Matching
  - Simulation
  - 3D Image processing
- OpenCL and Hadoop - HadoopCL
- Running OpenCL on CUDA-capable GPUs