This is the first in a series of courses/workshops/information sessions taking place over two semesters (fall 2022 and winter 2023). All of these cover a wide range of topics pertaining to advanced research computing (ARC). The format of each includes live sessions and labs using Zoom, plus self-paced learning content.
This course introduces users to supercomputing, discussing the concepts and aspects involved in running jobs in parallel on clusters, using clouds, etc. There will be three live sessions taking place on Mon., Sept. 26; Wed., Sept. 28; and Thurs., Sept. 29 from 2pm to 3pm Eastern Time.
Recordings of live classes will be available for those who can't attend or would like to study the course materials in their own time.
Oct. 6, 2022 update: This course was renamed from "2022-2023 Introduction to Advanced Research Computing (ARC)" to "2022-2023 Introduction to Supercomputing", and each topic/module in the original course has been moved into its own course rather than being a section in one large course. While this means one will have to register in each course one is interested in, it also means, for example, that one won't receive emails, etc., for courses one is not interested in. (To see a listing of all courses, see the course listing in the 2022-2023 Courses category.)
- Teacher: Baolai Ge
- Teacher: Paul Preney
Running programs on the supercomputers is done via the Bash shell. This course consists of three one-hour lectures on using Bash. No prior familiarity with Bash is assumed. In addition to the basics of getting around, globbing, regular expressions, redirection, pipes, and scripting will be covered.
Live classes will take place on Mon., Oct. 3, Wed., Oct. 5, and Thurs., Oct. 6 from 2pm-3pm Eastern Time. Recordings and exercises are available for self-paced learning.
Oct. 6, 2022 update: Originally there was a single course called "2022-2023 Introduction to Advanced Research Computing (ARC)". Each topic/module in the original course has been moved into its own course rather than being a section in one large course. This course is one of those courses. (Information has been preserved concerning enrolment and attendance.) To see a listing of all courses, see the course listing in the 2022-2023 Courses category.
- Teacher: Tyson Whitehead
This two-session workshop will provide an overview of running Jupyter notebooks on the remote compute clusters of the Digital Research Alliance of Canada. The first session will introduce the available options and features for working with Jupyter on the clusters. The second session will cover how to visualize job properties and perform analysis using Jupyter notebooks that interact with the Slurm scheduler on the cluster.
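For a taste of the second session's theme, here is a minimal, hypothetical sketch (not taken from the course materials) of querying the Slurm scheduler from a notebook cell, assuming a cluster node where the squeue command is available:

```python
# Minimal sketch: pull the current user's Slurm queue into a pandas
# DataFrame for inspection or plotting inside a Jupyter notebook.
# Assumes Slurm's `squeue` command is on the PATH (i.e., a cluster node).
import io
import subprocess

import pandas as pd

out = subprocess.run(
    ["squeue", "--me", "--noheader", "--format=%i|%j|%T|%M|%D"],
    capture_output=True, text=True, check=True,
).stdout

cols = ["JobID", "Name", "State", "Elapsed", "Nodes"]
jobs = pd.read_csv(io.StringIO(out), sep="|", names=cols)
print(jobs["State"].value_counts())  # e.g., counts of RUNNING vs PENDING jobs
```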
Live classes will take place on Wed., Oct. 12 and Thurs., Oct. 13 from 4pm-5pm Eastern Time. Recordings and exercises will also be available for self-paced learning and for enrolments after the live classes have taken place.
- Teacher: James Desjardins
- Teacher: Jinhui Qin
Python has become one of the most popular programming languages in scientific computing. It is high level enough that learning it is easy, and coding with it is significantly faster than with other programming languages. However, the performance of pure Python programs is often suboptimal, which might hinder your research. This course will demonstrate ways to identify performance bottlenecks, improve slow blocks of code, and extend Python with compiled code. You'll learn various ways to optimise and parallelise Python programs, particularly in the context of scientific and high-performance computing.
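As a small illustration of the kind of optimisation the course covers (not the course's own code), compare a pure-Python loop with a NumPy-vectorized equivalent:

```python
# Minimal sketch: the same reduction written as a pure-Python loop and
# as a NumPy-vectorized call; timing both shows the interpreter overhead.
import timeit

import numpy as np

def slow_norm(xs):
    # Pure Python: one interpreted iteration per element.
    total = 0.0
    for x in xs:
        total += x * x
    return total ** 0.5

def fast_norm(arr):
    # NumPy pushes the loop into compiled code.
    return float(np.sqrt(np.dot(arr, arr)))

data = list(range(100_000))
arr = np.arange(100_000, dtype=np.float64)

print("pure Python:", timeit.timeit(lambda: slow_norm(data), number=100))
print("NumPy:      ", timeit.timeit(lambda: fast_norm(arr), number=100))
```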
Knowledge prerequisites: knowing what classes and functions are, familiarity with Jupyter Notebook and basic console use, and comfort with the Software Carpentry Python material.
Live classes will take place on Mon., Oct. 17, Wed., Oct. 19, Thurs., Oct. 20 and Mon., Oct. 24 from 4pm-5pm Eastern Time. Recordings and exercises will also be available for self-paced learning.
- Teacher: Tyler Collins
- Teacher: Pawel Pomorski
This unit will cover modern C++ programming including: using regular expressions and <filesystem> operations; interacting with C and Fortran code and using libraries; using <chrono> to obtain timings at run time; using <algorithm> and <numeric> as well as vectorized operations; using const, constexpr, etc. effectively to improve run-time performance; and writing parallel (multithreaded) code in modern C++. Sessions will have (optional-to-do) follow-up activities to further one's knowledge, know-how, and/or experience with these topics.
There will be six one-hour live sessions along with accompanying materials and exercises. Recordings of the live sessions will be posted, so those who are not able to attend, or who enrol in the course during or after those sessions, will be able to engage with this unit's materials and activities.
The live sessions will take place on Wed., Oct. 26, Thurs., Oct. 27, Mon., Oct. 31, Wed., Nov. 2, Thurs., Nov. 3, and Mon., Nov. 7 from 4pm to 5pm Eastern Time.
Prerequisite knowledge: Experience writing C or C++ programs.
- Teacher: Baolai Ge
- Teacher: Paul Preney
Although Fortran is the oldest computer language, born in the 1950s, and numerous modern languages have evolved since, Fortran as it is spelled today has been rejuvenated and is still used by many people, having undergone significant modifications and improvements in the past two decades that are rarely seen in any other language.
Modern Fortran has evolved into a form significantly different from its ancestor, with many new modern language features. Inheriting from old FORTRAN standards and enhanced by newer ones, Fortran is still the language of choice for scientific computing, thanks to its intrinsic support for real and complex floating-point operations as well as its natural expression of multidimensional arrays. Its array handling, enhanced by index slicing, makes array processing similar or identical to that of counterparts such as MATLAB, and hence superior to other programming languages. The unified interface of intrinsic functions across data types makes translating mathematical expressions into computer programs much cleaner and easier. Further, since the 2008 standard, the introduction of co-arrays has made it trivial to write parallel code that can run on multi-core computers and clusters: one need not know the traditional message passing interface (MPI) library, nor threading; these capabilities are built into the syntax of the language itself.
This module is a companion to the Parallel Programming with C++ module, where you will find features missing from Fortran that are more essential to general-purpose computing than number crunching. We will show some practices of inter-language programming, through which the shortcomings of Fortran can be complemented by interfacing with C/C++.
This module contains live classes and homework assignments.
The live sessions will take place on Wed., Nov. 9; Thurs., Nov. 10; Mon., Nov. 14; Wed., Nov. 16; Thurs., Nov. 17; and Mon., Nov. 21 from 4pm to 5pm Eastern Time.
- Teacher: Baolai Ge
This course is an introduction to GPU programming, focusing on OpenACC. The course will consist of two lectures and two labs (hands-on practice on advanced NVIDIA Tesla GPUs). We will start by discussing GPU architectures and the basic principles of programming GPUs. Then we will introduce basic OpenACC constructs using simple code examples: SAXPY, Julia set, reduction, and a Jacobi solver. Significant attention will be given to making the code efficient. The course will have two homework assignments.
Prerequisites: some experience with C/C++ programming.
Live classes will take place on Wed., Nov. 23 (lecture), Thurs., Nov. 24 (lab), Mon., Nov. 28 (lecture), and Wed., Nov. 30 (lab), from 4pm-5pm Eastern Time. Recordings and exercises will also be available for self-paced learning.
- Teacher: Sergey Mashchenko
- Teacher: Pawel Pomorski
Message Passing Interface (MPI) is the dominant programming model for parallel programming on clusters and distributed-memory architectures. This module gives an introduction to the basics of MPI. Learners will learn the fundamentals of MPI, how to write simple to intermediate-level MPI programs, and gain exposure to some advanced topics. The learning materials and exercises are in C and Fortran. Learners are expected to be proficient in either of the two languages and in the Linux environment.
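The course materials themselves are in C and Fortran; purely as an illustration of the message-passing pattern the module teaches, here is the same point-to-point idea expressed with the third-party mpi4py package:

```python
# Illustration only (mpi4py, not the course's C/Fortran materials):
# rank 0 sends a message, rank 1 receives and prints it.
# Run with, e.g.: mpirun -n 2 python hello_mpi.py
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

if rank == 0:
    comm.send({"greeting": "hello from rank 0"}, dest=1, tag=7)
elif rank == 1:
    msg = comm.recv(source=0, tag=7)
    print(msg)
```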
The live sessions will take place on Thurs., Dec. 1; Mon., Dec. 5; Wed., Dec. 7; Thurs., Dec. 8; Mon., Dec. 12; and Wed., Dec. 14 from 4pm to 5pm Eastern Time.
- Teacher: Baolai Ge
- Teacher: Paul Preney
Apptainer (formerly called Singularity) is a container technology suitable for use on compute clusters. Within an Apptainer container, one can run programs within the environment of another Linux distribution, make use of persistent filesystem overlays/images that appear as single large files to the cluster outside the container (e.g., to keep file-count quotas lower and increase the efficiency of storing lots of small files, or to install and use Conda), and more. This course will discuss how to use Apptainer on our compute clusters, including some use cases such as using Apptainer to install and use Conda.
The live sessions will take place on Mon., Jan. 9; Wed., Jan. 11; and Thurs., Jan. 12 from 4pm to 5pm Eastern Time.
- Teacher: Paul Preney
Some common libraries for data science in Python, such as NumPy, Pandas, Scikit-Learn, etc., usually work well if the dataset fits into the RAM of a single machine. When dealing with large datasets, it can be a challenge to work around memory constraints. This course module provides an introduction to scalable and accelerated data science with Dask and RAPIDS. Dask provides a framework and libraries that can handle large datasets on a single multi-core machine or across multiple machines on a cluster, while RAPIDS can offload analytics workloads to GPUs to accelerate your data science and analytics toolchain with minimal code changes.
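As a flavour of the Dask approach (the file path here is hypothetical), a dataframe larger than RAM can be processed partition by partition:

```python
# Minimal sketch: build a lazy Dask dataframe over many CSV files that
# need not fit in memory, then trigger the computation explicitly.
import dask.dataframe as dd

# One logical dataframe backed by many on-disk partitions (hypothetical path).
df = dd.read_csv("data/events-*.csv")

# Operations are lazy; nothing has been read yet.
mean_by_group = df.groupby("category")["value"].mean()

# .compute() executes the task graph, streaming partitions through memory.
print(mean_by_group.compute())
```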
The live sessions will take place on Mon., Jan. 16; Wed., Jan. 18; and Thurs., Jan. 19 from 4pm to 5pm Eastern Time.
- Teacher: Jinhui Qin
This workshop will introduce tools for debugging and profiling programs, with a focus on parallel programs. Through a series of hands-on exercises we will learn to find and fix common bugs in MPI, OpenMP, and CUDA programs, and to improve the efficiency of parallel code.
The live sessions will take place on Mon., Jan. 30; Wed., Feb. 1; Thurs., Feb. 2; and Mon., Feb. 6 from 4pm to 5pm Eastern Time.
Recordings and exercises will also be available for self-paced learning.
- Teacher: Sergey Mashchenko
ParaView is a very powerful temporal 3D visualization program. The Graham VDI system is a remote desktop interface for doing graphical pre- and post-processing on files stored on Graham. This course consists of a general introduction to the Graham VDI system and ParaView via a live demonstration that you can follow along with. There will be small exercises where you get a chance to see whether you have actually understood the concepts.
The live sessions will take place on Wed., Feb. 8 and Thurs., Feb. 9 from 4pm to 5pm Eastern Time.
Recordings and exercises will also be available for self-paced learning.
- Teacher: Tyson Whitehead
This is an introductory course on deep learning, a revolution in AI that has made breakthroughs in many applications, ranging from image recognition and voice recognition to natural language understanding. The course is aimed at helping AI practitioners at all levels gain a deeper understanding of the fundamentals of this technique through various examples. Whenever possible, visualizations are used to help the audience visually understand the training process and what is learned. We will also demonstrate how to use the debugger that comes with TensorBoard to debug neural networks that do not converge in training.
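As a tiny, self-contained sketch of the workflow (not the course's own examples), the following trains a toy Keras model while writing logs that TensorBoard can visualize:

```python
# Minimal sketch: fit y = 2x + 1 with a one-neuron Keras model and log
# the run for TensorBoard (view with: tensorboard --logdir logs).
import numpy as np
import tensorflow as tf

x = np.random.rand(256, 1).astype("float32")
y = 2 * x + 1 + 0.05 * np.random.randn(256, 1).astype("float32")

model = tf.keras.Sequential([
    tf.keras.Input(shape=(1,)),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")

tb = tf.keras.callbacks.TensorBoard(log_dir="logs")
model.fit(x, y, epochs=20, batch_size=32, callbacks=[tb], verbose=0)
print(model.get_weights())  # should approach weight ~2 and bias ~1
```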
Prerequisites: Basic Python knowledge.
The live sessions will take place on Mon., Feb. 13; Wed., Feb. 15; and Thurs., Feb. 16 from 4pm to 5pm Eastern Time.
Recordings and exercises will also be available for self-paced learning.
- Teacher: Weiguang Guan
NOTE: Due to a family emergency of one of the instructors, the three live classes for the Julia module next week are cancelled. This module will be rescheduled for this summer's summer school. We sincerely apologize for the inconvenience and look forward to meeting you in the future.
Julia is becoming increasingly popular for scientific computing. One may use it for prototyping, as with MATLAB, R, and Python, for productivity, while gaining the same performance as compiled languages such as C/C++ and Fortran. The language is designed for prototyping and performance, as well as simplicity. This is an introductory module on Julia. Students will be able to get started quickly with the basics, with comparisons to similar languages such as MATLAB, R, Python, and Fortran, and then move on to learn, through examples, how to write code that can run in parallel on multi-core and cluster systems.
The live sessions will take place on Mon., Feb. 27; Wed., Mar. 1; and Thurs., Mar. 2 from 4pm to 5pm Eastern Time.
- Teacher: Ed Armstrong
- Teacher: Baolai Ge
This is an introductory mini course for MATLAB on Alliance clusters, with the focus on Parallel Computing Toolbox (PCT) and MATLAB Compiler and Runtime libraries (MCR).
In the lecture, we will talk about the current status of MATLAB, license issues, and job files on the clusters, followed by approaches to parallel computing with PCT, including parfor for parallel for-loops and spmd for parallel tasks, and the procedure for using MCR, with examples. Some screenshots will be used in the slides to show the step-by-step approaches. The second session will be lab-style, with demos, self-practice, Q&A, etc.
The goal of this course is to help you run MATLAB more efficiently on the clusters.
The live sessions will take place on Mon., Mar. 20 and Thurs., Mar. 23 from 4pm to 5pm Eastern Time.
- Teacher: Jemmy Hu
This "course" is a type of commons area for all 2022-2023 SHARCNET Training courses. Its purposes include facilitating the posting of messages, etc. to all persons enrolled in any of these courses. Normally there will be little to no activity in this "course".
- Teacher: Ed Armstrong
- Teacher: Tyler Collins
- Teacher: James Desjardins
- Teacher: Baolai Ge
- Teacher: Weiguang Guan
- Teacher: Jemmy Hu
- Teacher: Sergey Mashchenko
- Teacher: Pawel Pomorski
- Teacher: Paul Preney
- Teacher: Jinhui Qin
- Teacher: Tyson Whitehead