This is a common area for all 2020 Summer School course registrants. Everyone who is registered in a Summer School 2020 course is automatically granted access to this common area.
- SHARCNET Staff: Ed Armstrong
- SHARCNET Staff: Tyler Collins
- SHARCNET Staff: James Desjardins
- SHARCNET Staff: Baolai Ge
- SHARCNET Staff: Jose Sergio Hleap
- SHARCNET Staff: Jemmy Hu
- SHARCNET Staff: Sergey Mashchenko
- SHARCNET Staff: Pawel Pomorski
- SHARCNET Staff: Paul Preney
- SHARCNET Staff: Jinhui Qin
- SHARCNET Staff: Armin Sobhani
- SHARCNET Staff: Tyson Whitehead
- SHARCNET Staff: Isaac Ye
This course is a series of short live-demo sessions interwoven with hands-on exercises. The goal is to give you the skills required to use Canada's research High-Performance Computing (HPC) resources.
Specific topics that will be covered (time allowing) are:
- finding information on Canadian supercomputers
- getting an account
- transferring files
- starting a command-line session
- moving around and looking at things
- accessing software
- scheduling programs to run
- reading and writing files
- using wildcards and pipes
- automating tasks with loops
- writing scripts (collections of commands)
- searching for things
- writing and using regular expressions
Unlike a graphical user interface, the command line is well suited for automation, so these skills are also useful outside of HPC.
- Teacher: Paul Preney
- Teacher: Tyson Whitehead
This two-day workshop will focus on understanding and using modern C++ concurrency features to facilitate writing parallel (threaded) code. It will give an overview of the C++ memory model, parallel algorithms, synchronization, and other concurrency constructs, with a strong focus on keeping code straightforward to design, write, and reason about.
Target audience: Researchers who are or will be using C++.
Instructor: Paul Preney, SHARCNET / University of Windsor
Prerequisite knowledge: The attendee is expected to know how to write serial procedural code in C/C++.
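For a taste of the task-based style the workshop builds toward, here is a minimal sketch; it uses Python's concurrent.futures purely because the task/future pattern fits in a few lines, and it is not course material (the workshop itself works with C++ facilities such as std::thread and std::async).

```python
# A tiny task-parallel sketch, for flavour only -- not course material.
# The workshop itself uses C++ concurrency features; Python is used here
# because the task/future pattern is visible in a few lines.
from concurrent.futures import ProcessPoolExecutor
import math

def partial_sum(bounds):
    """Compute one chunk of a larger reduction."""
    lo, hi = bounds
    return sum(math.sqrt(i) for i in range(lo, hi))

if __name__ == "__main__":
    # Split the work into independent tasks...
    chunks = [(i, i + 250_000) for i in range(0, 1_000_000, 250_000)]
    with ProcessPoolExecutor(max_workers=4) as pool:
        # ...farm them out to worker processes and combine the results.
        total = sum(pool.map(partial_sum, chunks))
    print(total)
```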
- Teacher: Paul Preney
Python has become one of the most popular programming languages in scientific computing. It is high-level enough that learning it is easy, and coding in it is significantly faster than in many other programming languages. However, the performance of pure Python programs is often suboptimal, which might hinder your research. In this course, we will show you ways to identify performance bottlenecks, improve slow blocks of code, and extend Python with compiled code. You'll learn various ways to optimise and parallelise Python programs, particularly in the context of scientific and high-performance computing.
Requirements: You must be comfortable with plain Python. That includes:
1. knowing what classes and functions are
2. being familiar with Jupyter notebooks and a basic console
3. being comfortable with the Python Software Carpentry material
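As a flavour of one optimisation pattern the course touches on, here is an illustrative sketch (not course material) contrasting a pure-Python reduction with a NumPy-vectorised equivalent:

```python
# Illustrative only: one common speed-up pattern is replacing interpreted
# Python loops with vectorised NumPy operations.
import timeit
import numpy as np

def slow_norm(xs):
    # Pure Python: one interpreted iteration per element.
    return sum(x * x for x in xs) ** 0.5

def fast_norm(xs):
    # NumPy: the loop runs in compiled code.
    return float(np.sqrt(np.dot(xs, xs)))

data = np.random.rand(1_000_000)
print(timeit.timeit(lambda: slow_norm(data), number=3))
print(timeit.timeit(lambda: fast_norm(data), number=3))
```

On typical hardware the vectorised version is far faster, because the per-element work happens in compiled code rather than in the interpreter.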
- Teacher: Tyler Collins
- Teacher: Jose Sergio Hleap
This is an introductory course covering programming and computing on GPUs (graphics processing units), which are an increasingly common presence in massively parallel computing architectures. The basics of GPU programming will be covered, and students will work through a number of hands-on examples. The structuring of data and computations that makes full use of the GPU will be discussed in detail. The course covers some new features available on the GPUs installed on Graham and Cedar. Students should leave the course with the knowledge necessary to begin developing their own GPU applications.
Prerequisites: C/C++ scientific programming, and experience editing and compiling code in a Linux environment. Some experience with CUDA and/or OpenMP is a plus.
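Purely as an illustration of the thread-grid execution model such a course covers, the sketch below uses Numba's CUDA JIT in Python; the package and hardware noted in the comments are assumptions on our part, and the course's own examples may well be in CUDA C/C++.

```python
# Illustration of the GPU execution model only (not course material):
# each GPU thread handles one array element, indexed via the thread grid.
# Assumes the numba package and an NVIDIA GPU with CUDA drivers.
import numpy as np
from numba import cuda

@cuda.jit
def scale(x, out, a):
    i = cuda.grid(1)          # this thread's global index
    if i < x.size:            # guard: the grid may overshoot the array
        out[i] = a * x[i]

n = 1_000_000
x = np.random.rand(n).astype(np.float32)
out = np.zeros_like(x)
threads = 256
blocks = (n + threads - 1) // threads  # enough blocks to cover all n elements
scale[blocks, threads](x, out, 2.0)    # numba copies arrays to/from the device
print(out[:4], 2.0 * x[:4])
```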
- Teacher: Sergey Mashchenko
- Teacher: Pawel Pomorski
In this course, students will learn the high-level, high-performance language Julia over several sessions, from basic programming features to advanced topics in parallel and distributed computing.
Julia is increasingly popular for scientific computing. One may use it for productive prototyping, as with MATLAB, R, and Python, while gaining the same performance as compiled languages such as C/C++ and Fortran. The language is designed for prototyping and performance alike, as well as simplicity.
This is an introductory course on Julia. Students will get started quickly with the basics, drawing comparisons with similar languages such as MATLAB, R, Python, and Fortran, and then move on, through examples, to writing code that runs in parallel on multi-core and cluster systems.
This course is delivered in several sessions, through live classes via Zoom and self-paced learning. Students who miss a live class at its scheduled time may learn at their own pace, and may ask questions and get answers in the course forum.
Students who complete the course activities, including the quiz and exercises, will receive a certificate of completion. Attendance and completion of the course activities are recorded.
- Teacher: Ed Armstrong
- Teacher: Baolai Ge
This is a two-day course on Machine Learning. It consists of two major subjects: Machine Learning in Data Science and Deep Learning towards Artificial Intelligence.
Day 1 will provide an introduction to Machine Learning in Data Analytics. Following an overview of Machine Learning in Data Science, we will work through examples covering a full Machine Learning cycle with real data, including data cleansing, data exploration, and a typical Machine Learning workflow. We will use the Pandas and Scikit-Learn APIs in Python, with hands-on practice in Jupyter notebooks.
Day 2 will provide an overview of Artificial Intelligence with a focus on Deep Learning (DL) and Deep Neural Networks (DNN). We will have hands-on tutorials on two major DL frameworks, TensorFlow and PyTorch, by running a CIFAR-10 image classification example on a Compute Canada system. The hands-on labs will focus on how to run the sample case with different hardware resources, such as a single GPU, multiple GPUs, and multiple GPU nodes, while basic TensorFlow/PyTorch programming is briefly introduced.
Instructors: Jinhui Qin, SHARCNET / Western University (day 1) and Isaac Ye, SHARCNET / York University (day 2)
Prerequisites: Some programming experience in Python and background in statistics would be helpful but not mandatory.
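As a small preview of the day-1 workflow, the sketch below shows a minimal Scikit-Learn pipeline; it is an illustration only, not one of the course notebooks.

```python
# Illustrative sketch of a minimal Scikit-Learn workflow (not the course
# notebooks): load data, split it, fit a model inside a preprocessing
# pipeline, and evaluate on held-out data.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=200))
model.fit(X_train, y_train)         # train on the training split
print(model.score(X_test, y_test))  # accuracy on unseen data
```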
- Teacher: Jinhui Qin
- Teacher: Isaac Ye
MPI is a standardized and portable message-passing interface for parallel computing clusters. The standard defines the syntax and semantics of a core of library routines useful to a wide range of scientific programs in C/C++ and Fortran. MPI's goals are high performance, scalability, and portability. It is the dominant parallel programming model used in high-performance computing today.
In this two-day session, through lectures interspersed with hands-on labs, students will learn the basics of MPI programming. Examples and exercises will be based on parallelizing common computing problems.
Instructors: Jemmy Hu, SHARCNET / University of Waterloo, and Baolai Ge, SHARCNET / Western University.
Prerequisites: Basic C/C++ and/or Fortran knowledge; experience editing and compiling code in a Linux environment.
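The course itself works in C/C++ and Fortran; purely to show the message-passing model in a few lines, here is a sketch using the mpi4py Python bindings (an illustration, not course material).

```python
# Message-passing illustration only (the course itself uses C/C++ and
# Fortran). Run with, e.g.: mpirun -n 4 python reduce_demo.py
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()   # this process's id, 0..size-1
size = comm.Get_size()   # total number of processes

# Each rank computes a partial sum of its own slice of 0..99 ...
local = sum(range(rank * 100 // size, (rank + 1) * 100 // size))
# ... and rank 0 combines the pieces with a collective reduction.
total = comm.reduce(local, op=MPI.SUM, root=0)
if rank == 0:
    print(f"{size} ranks computed sum(0..99) = {total}")
```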
Bioinformatics is a multidisciplinary field that develops methods for turning so-called biological big data into knowledge. Although most problems in biology are not as embarrassingly parallel as the physics codes that HPC systems are usually designed for, this has begun to change in recent years. Metagenomics is an emerging field within bioinformatics that uses sequencing technologies to investigate the microbes inhabiting oceans, soils, the human body, and other environments.
In this two-day hands-on session, a typical metagenomics pipeline will be explored to introduce common tools used in bioinformatics analysis and to show how they can be used in an HPC environment.
Instructors:
- DAY ONE: Jose Sergio Hleap, SHARCNET, University of Guelph
- DAY TWO: Armin Sobhani, SHARCNET, Ontario Tech University
Prerequisites: Basic Unix competence and basic knowledge of bioinformatics applications are required.
- Teacher: Jose Sergio Hleap
- Teacher: Armin Sobhani
The majority of production work on Compute Canada systems like Graham is dispatched to the compute nodes via the Slurm scheduler. The Compute Canada General Purpose (GP) systems (Beluga, Cedar, and Graham) are heterogeneous in that they have various node types, with hardware options ranging in core count, memory availability, and GPUs, not to mention several model generations. The systems were also designed to accommodate the highly varied workload structures of the Canadian academic research community, including varied job run times; CPU, GPU, and memory requirements; and interactive workloads. The Slurm scheduler configuration is designed to maximize fairness, responsiveness, and overall utilization. Beyond understanding how to request the appropriate resources for a job, an understanding of the system-specific configurations can have a significant impact on the time to result by minimizing wait times in the queue. After covering job submission techniques, this course provides information about monitoring jobs.
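As a minimal illustration of the resource-request side, the sketch below composes and submits a toy job script from Python; the allocation name and program are placeholders, and choosing appropriate directives for a real workload is exactly what the course addresses.

```python
# Sketch only: compose and submit a minimal Slurm job, then check the
# queue. The allocation name and program are placeholders; the #SBATCH
# options shown are standard Slurm directives.
import getpass
import subprocess

job_script = """\
#!/bin/bash
#SBATCH --account=def-someuser   # placeholder allocation name
#SBATCH --time=0-01:00           # D-HH:MM; tighter limits tend to queue sooner
#SBATCH --cpus-per-task=4
#SBATCH --mem-per-cpu=2G
srun ./my_program                # placeholder executable
"""

with open("job.sh", "w") as f:
    f.write(job_script)

subprocess.run(["sbatch", "job.sh"], check=True)                 # submit the job
subprocess.run(["squeue", "-u", getpass.getuser()], check=True)  # monitor the queue
```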
- Teacher: James Desjardins
The focus of this course is on Graham usage other than batch computing via the scheduler. Although the majority of production work on the Compute Canada clusters is executed as unsupervised compute jobs dispatched via the Slurm scheduler, large-scale research projects involve several other computational tasks that precede and follow the batch production work, including file management, software development and prototyping, performance profiling, data sharing, and visualization. This course demonstrates common ways of accessing the cluster interactively to carry out the various pre- and post-production tasks associated with computationally demanding research projects.
- Teacher: James Desjardins