Overview

At the start of 2021, SHARCNET hosted its first Introduction to Advanced Research Computing course. This offering is an expanded version of that course. It runs over two semesters and finishes by the summer semester, in time for SHARCNET's annual summer school courses.

In this offering, each topic is its own course (see below), so you can enrol only in the topics that interest you and need not scroll through material that does not.

IMPORTANT: If you are also interested in SHARCNET Summer School courses, we recommend taking the relevant courses below before attending them. Some or all summer school courses may not cover introductory material or may not leave enough time to learn it.


Course Offerings and Schedule

Unless otherwise noted below, each week has two live sessions, on Mondays and Thursdays, starting at 2pm Eastern Time. (Live sessions vary between instruction-based and workshop/lab-based.) Each live session typically lasts between 1h and 1.5h. Each course is listed below in chronological order with a description and, after you click the course's title, a button that lets you enrol in that course.

Enrolled early? Should you enrol before the first live session, know that course content might not be visible until just before the course starts.

No longer interested? To unenrol from a course, go to that course's main page, click the gear icon at the top right, then click "Unenrol me" in the drop-down menu.

Have questions or need help? Please send an email to help@sharcnet.ca.

This is not a course. Enrolment is automatic for everyone enrolled in any 2021-2022 Introduction to Advanced Research Computing (ARC) course.

Access is restricted to Compute Canada authenticated users only: Yes

Shells on Linux, Unix, and other POSIX operating systems are fundamental tools that provide access to and control of software. A shell presents a command-line-driven text interface through which one interacts with a computer by typing commands. Tasks such as searching files and directories; creating, modifying, renaming, and deleting files and directories; and many others can be achieved easily. Shells also support a programming language, so commands can be executed either by typing them in manually or by placing them in a file called a shell script and running that script.

This course introduces the user to the BASH shell, which is the default shell for all accounts on Compute Canada's compute clusters.

Live Session Dates: Sept. 27, 30; Oct. 4 & 7

Access is restricted to Compute Canada authenticated users only: Yes

Scheduling software on a compute cluster is how one runs programs/jobs on that cluster. When jobs are submitted to the scheduler, the scheduler directs where and when those jobs run within the cluster. The order in which jobs run is based on each job's priority, which is calculated using the system's fair share policy. Finally, the resources each job requests are tied to the submitting user's account so that the job can use them while it runs.

This course discusses Compute Canada's compute cluster scheduling software (SLURM), aspects of how jobs are scheduled, how to use interactive jobs for live, hands-on work, and how to estimate job resource requirements.

Live Session Dates: Oct. 12, 14, 18, & 21

NOTE: There is no live session on Mon. Oct. 11 due to the Thanksgiving holiday. Instead, there will be a live session on Tues., Oct. 12.


Access is restricted to Compute Canada authenticated users only: Yes

Python is a popular, general-purpose programming language that is concise, easy to read, and easy to learn. It can be used for everything from web development to software development, including scientific applications. In this introductory Python course, we demonstrate some of the basics of Python programming, enough to get started. The course focuses on using Python on Compute Canada clusters.

Live Session Dates: Oct. 25, 28; Nov. 1 & 4.

Access is restricted to Compute Canada authenticated users only: Yes

This introductory course introduces Machine Learning (ML) with the aim of running ML code on Compute Canada's clusters. There will be some hands-on exercises, some discussion of how to submit ML jobs to the Graham compute cluster, an overview of ML frameworks such as PyTorch, TensorFlow, and Scikit-Learn, and a brief introduction to an ML linear regression model.

Live Session Dates: Nov. 8 & 11.

Access is restricted to Compute Canada authenticated users only: Yes

Common Python libraries for data analytics, such as NumPy, Pandas, and Scikit-Learn, usually work well if the dataset fits into the RAM available on a single machine. With large datasets, however, working around such memory constraints can be a significant challenge. This is where Dask can help. Dask provides a framework and libraries that can handle large datasets on a single multi-core machine or on a cluster.

This course provides an introduction to Dask.
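As a rough illustration of the memory constraint described above, the sketch below computes a mean by reading the data one chunk at a time, so that only one chunk is held in memory at once. It uses only the Python standard library and is not Dask's API; the names chunked and chunked_mean are hypothetical and chosen for this example only.

```python
def chunked(values, chunk_size):
    """Yield successive lists of at most chunk_size items from an iterable."""
    chunk = []
    for value in values:
        chunk.append(value)
        if len(chunk) == chunk_size:
            yield chunk
            chunk = []
    if chunk:
        # Emit the final, possibly shorter, chunk.
        yield chunk


def chunked_mean(values, chunk_size=1000):
    """Compute the mean while holding only one chunk in memory at a time."""
    total = 0.0
    count = 0
    for chunk in chunked(values, chunk_size):
        total += sum(chunk)
        count += len(chunk)
    if count == 0:
        raise ValueError("cannot take the mean of an empty dataset")
    return total / count


# range() stands in for a dataset too large to load all at once.
mean = chunked_mean(range(1, 1_000_001), chunk_size=10_000)
print(mean)
```

Each chunk is reduced to a running total and a count before the next chunk is read, which is why the full dataset never has to fit in RAM.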

Live Session Dates: Nov. 15 & 18

Access is restricted to Compute Canada authenticated users only: Yes

Writing a concurrent or parallel program that can run on multiple cores or on multiple nodes/computers in a cluster is not the same as writing a sequential program for a single CPU core. Typically, a sequential program is concerned only with some aspects of how information is stored in RAM and with the step-by-step (sequential) correctness of its code. A concurrent program is concerned with non-sequential reasoning about how code executes: whether it can see values in RAM written by other streams of execution, how reads and writes are ordered across those streams, and the time it takes for data to be transferred across buses (e.g., CPU to GPU RAM, CPU to machine RAM, etc.).

This course will provide a practical overview of important concepts and details involved with concurrent programming in order to further understanding of the benefits, costs, and issues involved with exploiting and writing parallel computing programs.
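To make the point about ordering reads and writes concrete, here is a minimal sketch in Python (the course itself may use other languages and tools). Several threads increment one shared counter; the lock makes each read-modify-write step happen as a unit, so the final total is predictable. The names add_many and run_counter are hypothetical and exist only for this illustration.

```python
import threading


def add_many(counter, lock, n):
    """Increment the shared counter n times, one locked step at a time."""
    for _ in range(n):
        # Without the lock, two threads could read the same old value
        # and one of the two increments could be lost.
        with lock:
            counter["value"] += 1


def run_counter(num_threads=4, n=10_000):
    """Start num_threads threads that each add n to a shared counter."""
    counter = {"value": 0}
    lock = threading.Lock()
    threads = [
        threading.Thread(target=add_many, args=(counter, lock, n))
        for _ in range(num_threads)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()  # Wait until every thread has finished.
    return counter["value"]


total = run_counter()
print(total)
```

The lock also illustrates a cost: the increments are serialized, so correctness here is bought at the expense of parallel speed-up.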

Live Session Dates: Nov. 22 & 25

Access is restricted to Compute Canada authenticated users only: Yes

Many programs that claim to be C++ programs turn out, upon examination, to use mostly or exclusively C constructs. While this is a valid way to use C++, it is better to write programs that leverage the features of the C++ language and the C++ Standard Library. Such code is shorter, faster, more robust, and easier to maintain and update.

This course will (briefly) introduce the C++ language and will discuss modern C++ concurrent programming. (Some will find aspects of this course relevant to other IntroARC courses as well, e.g., CMake (which is commonly used to build larger C++ programs easily), Fortran (which is commonly used with C++ in mixed-language programs), and MPI (which is commonly used to run (parallel) programs across a number of nodes/computers).)

Live Session Dates: Nov. 30; Dec. 2, 6, 9, 13, & 16.

NOTE: This course starts on Tuesday, November 30. (There is no live session on Mon. Nov. 29.)

Access is restricted to Compute Canada authenticated users only: Yes

Since the milestone Fortran 90 standard, Fortran has become very different from its ancestor, gaining many modern language features. Building on the older FORTRAN standards and enhanced by the newer ones, Fortran is still the language of choice for scientific computing because of its intrinsic support for real and complex floating-point operations. Its array handling, enhanced by index slicing, makes array processing similar or identical to that of counterparts such as MATLAB and, arguably, superior to that of other programming languages. The unified interface of intrinsic functions for all data types makes translating mathematical expressions into computer programs much cleaner and easier. Further, since the Fortran 2008 standard, coarrays have made it much simpler to write parallel code that runs on multi-core computers and clusters: one need not know the traditional Message Passing Interface (MPI) library or threading, because coarrays are built into the syntax of the language itself.

This course is a companion to the C++ course, in which you will find the features missing from Fortran that matter more for general-purpose computing than for number crunching. We will show some of the practice of interlanguage programming, through which the shortcomings of Fortran can be offset by interfacing with C/C++.

Live Session Dates: Jan. 10, 13, 17, 20, 24, 27

Access is restricted to Compute Canada authenticated users only: Yes

Graphics processing units (GPUs) are commonly used for high performance computing (HPC). In this introductory course, we will consider the simplest way to program GPUs, using the OpenACC framework. We will discuss both the strengths and the weaknesses of this approach compared with full-fledged GPU programming using CUDA. The course will have some hands-on exercises.

Prerequisites: some knowledge of C/C++ programming languages.

Live Session Dates: Jan. 31; Feb. 3, 7, & 10.

Access is restricted to Compute Canada authenticated users only: Yes

CMake is a cross-platform, open-source software tool used to automate the building, testing, packaging, and installation of software. It is independent of the programming language and tools used, although it is most commonly used with C, C++, and, more recently, Fortran code. CMake is particularly useful because, for a number of programming languages, it determines the dependencies between files (e.g., header and source files) in order to build the software/documentation properly and to minimize the amount of work required to do so.

This course will introduce CMake as well as how to write, run, and test CMake scripts. Especially if you are building your programs by hand or are experiencing difficulties using shell scripts, it is worth exploring how tools such as CMake can make this much easier and much less error-prone.

Live Session Dates: Feb. 14, 17, 22 & 24.

NOTE: There is no live session on Mon. Feb. 21 due to the Family Day holiday. Instead, there will be a live session on Tues., Feb. 22.

Access is restricted to Compute Canada authenticated users only: Yes

This course's description will be posted soon. (It will likely cover how to use Graham's VDI nodes and include an introduction to ParaView.)

Live Session Dates: Feb. 28 & Mar. 3

Access is restricted to Compute Canada authenticated users only: Yes

Singularity is a container solution, similar to Docker, designed for secure use on multi-user high performance and advanced computing (HPC and AC) systems. Singularity has evolved significantly since it was first introduced in 2015 and now supports features such as overlay files, image encryption, exposing GPGPU devices to the container software, etc. Features aside, using containers is valuable for many: the installed software remains exactly the same and can be used by everyone in a research team on a number of clusters and personal computers running Linux.

This course will cover commonly used aspects of Singularity, e.g., building an image from DockerHub, building a custom image of your own, using an overlay file to have read-write access to an image, and building and using Conda software installations inside a Singularity image.

NOTE: On Compute Canada's clusters, we recommend installing and using Conda entirely within a Singularity container, and we recommend against using Conda directly in your account, e.g., see this link. If you need Conda or already use it, enrolling in this course will teach you how to use Conda inside Singularity.

Live Session Dates: Mar. 7 & 10

Access is restricted to Compute Canada authenticated users only: Yes

Perhaps you have learnt that you need to write some parallel code using the Message Passing Interface (MPI) for your research? Don't worry, no prior MPI knowledge or experience is needed. MPI programs can be written in C, C++, or Fortran, so prior programming experience in at least one of these languages is needed.

This introductory MPI course will present an overview of what MPI is and what it enables one to do; demonstrate and discuss various MPI programs, starting with a simple "Hello World!" program; demonstrate how to compile and run MPI programs; discuss some of the MPI communication API calls; and discuss how to explore MPI further beyond this course.

Live Session Dates: Mar. 14, 17, 21, 24.

Access is restricted to Compute Canada authenticated users only: Yes