This is a common area for all 2021 Summer School course registrants. Everyone who is registered in a Summer School 2021 course is automatically granted access to this common area.
The common area contains:
- a link to the Summer School 2021 Slack used during courses,
- an email link to submit a ticket to help@sharcnet.ca concerning Summer School 2021, and
- a Summer School 2021 announcements forum.
- Staff: Marie-Helene Burle
- Staff: Tyler Collins
- Staff: James Desjardins
- Staff: Baolai Ge
- Staff: Jose Sergio Hleap
- Staff: Sergey Mashchenko
- Staff: Pawel Pomorski
- Staff: Paul Preney
- Staff: Armin Sobhani
- Staff: Tyson Whitehead
- Staff: Isaac Ye
Description: This course features two sessions, focusing on first principles of interacting with the shell and on job submission through the Slurm scheduler. The first session will be dedicated to the shell and will cover topics such as starting a command-line session, moving around and looking at things, accessing software, and more. The second session will introduce the concepts needed to understand job scheduling in an HPC environment. By the end of both sessions, attendees will be able to connect to a Canadian supercomputer, create or upload their own scripts, and submit them to the scheduler.
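A job script of the kind built in the second session can be as small as a few Slurm directives followed by the commands to run. This is only a sketch: the account name, module version, and script name below are placeholders, not course material.

```shell
#!/bin/bash
#SBATCH --account=def-someuser   # replace with your allocation account
#SBATCH --time=00:10:00          # wall-clock limit (HH:MM:SS)
#SBATCH --mem=1G                 # memory for the job
#SBATCH --cpus-per-task=1        # number of CPU cores

module load python/3.8           # access software through the module system
python my_script.py              # my_script.py stands in for your own code
```

On a cluster, such a script would be submitted with `sbatch job.sh` and monitored with `squeue -u $USER`.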
- Teacher: Tyler Collins
- Teacher: James Desjardins
- Teacher: Paul Preney
- Teacher: Tyson Whitehead
Description: Modern C++ not only yields shorter, faster, and more robust code; that code is also easier to maintain and update. Writing parallel code using modern C++ should result in code that is straightforward and sane to maintain and debug. This course will cover what comprises modern C++ programming, and will then present and discuss how to use modern C++ to write parallel code.
Length: 3 days
- Teacher: Paul Preney
Description: This course responds to a recurring question from the research community: is FORTRAN (still) supported? To many, FORTRAN sounds like Latin: a language with a glorious past, but ancient and dead today. The reality is quite the opposite. Fortran, as it is spelled today, is still alive and still used by many people, having undergone modifications and improvements over the past two decades that are rarely seen in other languages.
Since the introduction of the milestone Fortran 90 standard, Fortran has become very different from its ancestor, with many modern language features. Inheriting from the old FORTRAN standards and enhanced by the newer ones, Fortran is still a language of choice for scientific computing thanks to its intrinsic support for real and complex floating-point operations. Its array handling, enhanced by index slicing, makes array processing similar or identical to that of counterparts such as MATLAB, and hence superior to that of most other programming languages. The unified interface of intrinsic functions across data types makes translating mathematical expressions into computer programs cleaner and easier. Further, since the 2008 standard, the introduction of coarrays has made it trivial to write parallel code that runs on multicore computers and clusters: one need not know the traditional message passing interface (MPI) library, nor threading, as these capabilities are built into the syntax of the language itself.
This course is a companion to the Modern C++ course, which covers features missing from Fortran that matter more for general-purpose computing than for number crunching. We will demonstrate some interlanguage programming practices through which the shortcomings of Fortran can be complemented by interfacing with C/C++.
Length: 3 days
Activities: In-class exercises and homework.
Prerequisite knowledge: Some experience with scientific computing in any language.
- Teacher: Baolai Ge
Description: Python has become one of the most popular programming languages in scientific computing. It is high level enough that learning it is easy, and coding with it is significantly faster than other programming languages. However, the performance of pure Python programs is often sub-optimal, and might hinder your research. In this course, we will show you some ways to identify performance bottlenecks, improve slow blocks of code, and extend Python with compiled code. You’ll learn various ways to optimise and parallelise Python programs, particularly in the context of scientific and high performance computing.
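As a minimal, stdlib-only sketch of the "improve slow blocks of code" idea (the functions here are illustrative, not course material), memoization removes the repeated work hidden in a naive recursive function:

```python
import functools
import timeit

def fib(n):
    # Naive recursion: exponentially many repeated calls for the same n.
    return n if n < 2 else fib(n - 1) + fib(n - 2)

@functools.lru_cache(maxsize=None)
def fib_cached(n):
    # Same algorithm; lru_cache memoizes results, so each n is computed once.
    return n if n < 2 else fib_cached(n - 1) + fib_cached(n - 2)

assert fib(20) == fib_cached(20) == 6765

# Time both versions to confirm where the bottleneck was.
t_naive = timeit.timeit(lambda: fib(20), number=100)
t_cached = timeit.timeit(lambda: fib_cached(20), number=100)
print(f"naive: {t_naive:.4f}s  cached: {t_cached:.4f}s")
```

Timing before and after a change, as above, is the simplest form of the measure-then-optimize workflow the course builds on.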
Length: 3 days
Prerequisite knowledge:
- Knowing what classes and functions are.
- Familiarity with Jupyter Notebook and basic console use.
- Comfort with the Python Software Carpentry material.
- Teacher: Tyler Collins
- Teacher: Jose Sergio Hleap
- Teacher: Pawel Pomorski
Description: This is an introductory course covering programming and computing on GPUs (graphics processing units), which are an increasingly common presence in massively parallel computing architectures. The basics of GPU programming will be covered, and students will work through a number of hands-on examples. How to structure data and computations to make full use of the GPU will be discussed in detail. The course covers some new features available on the GPUs installed on Graham and Cedar. Students should leave the course with the knowledge necessary to begin developing their own GPU applications.
Length: 3 days
Prerequisite knowledge: C/C++ scientific programming, experience editing and compiling code in a Linux environment. Some experience with CUDA and/or OpenMP a plus.
- Teacher: Sergey Mashchenko
- Teacher: Pawel Pomorski
Description: MPI is a standardized and portable message-passing interface for parallel computing clusters. The standard defines the syntax and semantics of a core of library routines useful to a wide range of scientific programs in C/C++ and Fortran. MPI's goals are high performance, scalability, and portability, and it is the dominant parallel programming model in high-performance computing today. Through lectures interspersed with hands-on labs, students will learn how to program with MPI. Examples and exercises will be based on the parallelization of common computing problems.
Length: 3 days
Prerequisite knowledge: Basic C/C++ and/or Fortran knowledge; experience editing and compiling code in a Linux environment.
- Teacher: Baolai Ge
- Teacher: Paul Preney
Description: Deep learning has been getting a lot of attention in many different fields, including business and finance, image processing, and even scientific simulation. This two-day deep learning foundation course is recommended for anyone who wants to actually program deep learning models. The course combines presentations with live demonstrations using Google Colab, and will ultimately help students run their own code on the Graham cluster at Compute Canada. Day 1 will cover multivariable linear regression and MNIST image classification using linear and multi-layer perceptron (MLP) models, including some techniques to avoid overfitting. Day 2 will cover CIFAR-10 image classification with a convolutional neural network and a simple example with a recurrent neural network. All coding will be done in Google Colab using Python 3 and PyTorch 1.8.
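The course itself uses PyTorch on Google Colab; as a stdlib-only sketch of the idea behind Day 1's multivariable linear regression (the synthetic data, learning rate, and iteration count here are invented for illustration), full-batch gradient descent on squared error recovers known coefficients:

```python
import random

random.seed(0)
# Synthetic data from a known model: y = 2*x1 - 3*x2 + 1 (no noise).
data = [(x1, x2, 2 * x1 - 3 * x2 + 1)
        for x1, x2 in ((random.uniform(-1, 1), random.uniform(-1, 1))
                       for _ in range(200))]

w1 = w2 = b = 0.0   # parameters to learn
lr = 0.1            # learning rate
n = len(data)

for _ in range(500):
    # Accumulate the gradient of the mean squared error over the whole batch.
    g1 = g2 = gb = 0.0
    for x1, x2, y in data:
        err = (w1 * x1 + w2 * x2 + b) - y
        g1 += err * x1
        g2 += err * x2
        gb += err
    # Step each parameter against its gradient.
    w1 -= lr * g1 / n
    w2 -= lr * g2 / n
    b -= lr * gb / n

print(round(w1, 2), round(w2, 2), round(b, 2))
```

A framework like PyTorch automates exactly these two steps, computing gradients with autograd and applying the update with an optimizer.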
Length: 2 days
Prerequisite knowledge: Basic knowledge of Python and object-oriented programming.
- Teacher: Isaac Ye
Description: Bioinformatics is a multidisciplinary field that develops methods for turning so-called biological big data into knowledge. Although most problems in biology are not as embarrassingly parallelizable as the physics codes that HPC systems are usually designed for, this has started to change in recent years. Metagenomics is an emerging field in bioinformatics that uses sequencing technologies to investigate the microbes inhabiting oceans, soils, the human body, and other environments. In this two-day online hands-on session, a typical metagenomics pipeline will be explored to introduce common tools used in bioinformatics analysis and to show how they can be utilized in an HPC environment.
Length: 2 days
Prerequisite knowledge: Basic Unix competence and basic knowledge of bioinformatics applications are required.
- Teacher: Jose Sergio Hleap
- Teacher: Armin Sobhani
Description: This course will introduce the tools for debugging and profiling parallel programs available on SHARCNET systems. Through a series of hands-on exercises, we will learn to find and fix common bugs in MPI, OpenMP, and CUDA programs, and to improve the efficiency of parallel code.
Length: 2 days
Prerequisite knowledge: Basic knowledge of one or more parallel programming platforms (MPI, OpenMP, and/or CUDA).
- Teacher: Sergey Mashchenko
Description: We begin by loading a dataset from a CSV file into R and computing some basic statistics on it. We then look at how to group those R commands into an R function, to avoid typing the same commands repeatedly whenever the calculations need to be redone. We move on to show how to automate loading a batch of CSV files and performing the same statistical tasks, using R's built-in commands and for loops. From there, we naturally turn to the fundamental concepts of R programming, with a focus on manipulating lists, vectors, arrays, and data frames. As an important feature and strength of R in statistical computing, we give a very brief introduction to its random number generators and statistical functions. We touch upon linear regression with simple, easy-to-understand examples. At the end of the course, we revisit the data frame we began with and cover more topics on data frames.
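The course teaches this workflow in R; purely for illustration, the same load-summarize-loop pattern can be sketched with Python's standard library (the file names and the "value" column here are invented, and the toy files are written by the sketch itself so it is self-contained):

```python
import csv
import glob
import os
import statistics
import tempfile

def summarize(path):
    # Load one CSV file and compute basic statistics on a numeric column.
    with open(path, newline="") as fh:
        values = [float(row["value"]) for row in csv.DictReader(fh)]
    return {"n": len(values),
            "mean": statistics.mean(values),
            "sd": statistics.stdev(values)}

# Write two toy CSV files so the sketch runs anywhere.
tmp = tempfile.mkdtemp()
for name, vals in [("a.csv", [1, 2, 3]), ("b.csv", [10, 20, 30])]:
    with open(os.path.join(tmp, name), "w", newline="") as fh:
        writer = csv.writer(fh)
        writer.writerow(["value"])
        writer.writerows([[v] for v in vals])

# Automate the same calculation over the whole batch of files with a loop.
results = {os.path.basename(p): summarize(p)
           for p in sorted(glob.glob(os.path.join(tmp, "*.csv")))}
print(results)
```

The course covers exactly this progression in R: one-off commands, then a reusable function, then a loop over many files.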
Length: 3 days
Activities: Quizzes, homework.
Prerequisites: None.
- Teacher: Marie-Helene Burle
- Teacher: Baolai Ge
- Teacher: Tyson Whitehead