SHARCNET Summer School 2020

The SHARCNET Summer School on Advanced Research Computing (ARC) is an annual educational event for graduate and undergraduate students, postdocs and researchers who are engaged in compute-intensive research. These schools are FREE and provide beginner-to-intermediate level courses on a wide range of subjects, from traditional high performance computing to data science, machine learning, and popular languages (Python, Julia, etc.); they also include domain-specific courses.

In past years, this event was run in person (5 days, three parallel streams) on one of the SHARCNET campuses, but this year it will be run entirely online, due to the COVID-19 pandemic. It will start on May 19 as a single-stream event and will last until mid-June. The courses are between 1 and 3 days long, and consist of live Zoom lecture sessions interleaved with live Zoom hands-on sessions. Classes start at 9:00am EDT and end at 5:00pm EDT (with multiple breaks).

Preparing for Summer School

  • To take full advantage of the courses, you will need a computer (PC or laptop) and a reasonably good internet connection (ideally, 10 Mbit/s or better).
  • You will also need a headset, optionally with a microphone. Please test the headset before the school.
  • We will be using Zoom for both lectures and hands-on exercises. Zoom can be used from a browser, but it usually works better when installed as an app. Please create a Zoom account and test the Zoom app before the summer school.

Summer School Registration

Enrolling in a Course

To enroll in any of the courses listed below, do the following:

  • Click the name of the course.
  • Scroll down to the bottom of the page for that course and click the "Enroll me" button.

If the course is full, you may choose to add yourself to the waiting list. When someone drops the course and a vacancy becomes available, you will be added to the course and will receive a notification by email.

Unenrolling from a Course

To unenroll from a course you are registered in, do the following:

  • Go to the main page of the course you are registered in by clicking the course's link.
  • Click the "gear" icon at the top right-hand side of the main course screen.
  • Click the "Unenroll" link.

This is a common area for all 2020 Summer School course registrants. Everyone who is registered in a Summer School 2020 course is automatically granted access to this common area.

Access is restricted to Digital Research Alliance of Canada (formerly Compute Canada) authenticated users only: No

This course is a series of short live-demo sessions interwoven with hands-on exercises. The goal is to give you the skills required to use Canada's research High-Performance Computing (HPC) resources.

Specific topics that will be covered (time allowing) are:

  • finding information on Canadian supercomputers
  • getting an account
  • transferring files
  • starting a command-line session
  • moving around and looking at things
  • accessing software
  • scheduling programs to run
  • reading and writing files
  • using wildcards and pipes
  • automating tasks with loops
  • writing scripts (collections of commands)
  • searching for things
  • writing and using regular expressions

Unlike a graphical user interface, the command line is well suited for automation, so these skills are also useful outside of HPC.

Access is restricted to Digital Research Alliance of Canada (formerly Compute Canada) authenticated users only: No

This two-day workshop will focus on understanding and using modern C++ concurrency features to facilitate writing parallel (threaded) code. The course will give an overview of the C++ memory model, parallel algorithms, synchronization, and other concurrency constructs, with a strong focus on keeping code straightforward to design, write, and reason about.

Target audience: Researchers who are or will be using C++.

Instructor: Paul Preney, SHARCNET / University of Windsor

Pre-requisite knowledge: The attendee is expected to know how to write serial procedural code in C/C++.

Access is restricted to Digital Research Alliance of Canada (formerly Compute Canada) authenticated users only: No

Python has become one of the most popular programming languages in scientific computing. It is high level enough that learning it is easy, and coding with it is significantly faster than with other programming languages. However, the performance of pure Python programs is often suboptimal and might hinder your research. In this course, we will show you some ways to identify performance bottlenecks, improve slow blocks of code, and extend Python with compiled code. You'll learn various ways to optimise and parallelise Python programs, particularly in the context of scientific and high performance computing.
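
As a rough illustration of what "identifying a bottleneck and improving a slow block of code" can look like, here is a minimal sketch that uses only the standard library. It is not taken from the course material: the function names and the problem (a sum of squares) are hypothetical, chosen only to show the habit of measuring before and after a change.

```python
import timeit


def sum_of_squares_loop(n):
    """Sum i*i for i in range(n) with an explicit Python loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def sum_of_squares_formula(n):
    """Same result from the closed form 0^2 + ... + (n-1)^2 = (n-1)n(2n-1)/6."""
    return (n - 1) * n * (2 * n - 1) // 6


if __name__ == "__main__":
    n = 100_000  # hypothetical problem size, chosen for illustration only
    for func in (sum_of_squares_loop, sum_of_squares_formula):
        # timeit runs the callable repeatedly and returns the total time in seconds
        seconds = timeit.timeit(lambda: func(n), number=10)
        print(f"{func.__name__}: {seconds:.4f} s for 10 calls")
```

Timings depend on the machine and Python version, so none are quoted here. The point of the sketch is that both versions are checked to give the same answer and are timed the same way before one is preferred.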

Requirements: You must be comfortable with plain Python. That includes:

  • knowing what a class and a function are
  • being familiar with Jupyter notebooks and the basic console
  • being comfortable with the Python Software Carpentry material

Access is restricted to Digital Research Alliance of Canada (formerly Compute Canada) authenticated users only: No

This is an introductory course covering programming and computing on GPUs - graphics processing units - which are an increasingly common presence in massively parallel computing architectures. The basics of GPU programming will be covered, and students will work through a number of hands-on examples. How to structure data and computations to make full use of the GPU will be discussed in detail. The course covers some new features available on GPUs installed on Graham and Cedar. Students should leave the course with the knowledge necessary to begin developing their own GPU applications.

Prerequisites: C/C++ scientific programming, experience editing and compiling code in a Linux environment. Some experience with CUDA and/or OpenMP is a plus.

Access is restricted to Digital Research Alliance of Canada (formerly Compute Canada) authenticated users only: No

In this course, students will learn the high-level, high-performance language Julia over several sessions, from basic programming features to advanced topics in parallel and distributed computing.

Julia is becoming increasingly popular for scientific computing. One may use it for prototyping, as with Matlab, R and Python, for productivity, while gaining performance comparable to that of compiled languages such as C/C++ and Fortran. The language is designed for both prototyping and performance, as well as simplicity.

This is an introductory course on Julia. Students will get started quickly with the basics, drawing comparisons with similar languages such as Matlab, R, Python and Fortran, and then learn through examples how to write code that runs in parallel on multi-core and cluster systems.

This course is delivered in several sessions through live classes via Zoom and self-paced learning. Students who miss some of the live classes in the scheduled time slots may catch up at their own pace. Students may ask questions and get answers in the course forum.

Students who complete the course activities, including the quiz and exercises, will receive a certificate of completion for the course. Attendance and completion of the course activities are recorded.

Access is restricted to Digital Research Alliance of Canada (formerly Compute Canada) authenticated users only: No

This is a two-day course on Machine Learning. It consists of two major subjects: Machine Learning in Data Science and Deep Learning towards Artificial Intelligence. 

Day 1 will provide an introduction to Machine Learning in Data Analytics. Following an overview of Machine Learning in Data Science, we will work through examples of a full Machine Learning cycle with real data, including data cleansing, data exploration, and a typical Machine Learning workflow. We will use the Pandas and Scikit-Learn APIs in Python, with hands-on practice in Jupyter Notebook.

Day 2 will provide an overview of Artificial Intelligence with a focus on Deep Learning (DL) and Deep Neural Networks (DNN). We will have hands-on tutorials on two major DL frameworks, TensorFlow and PyTorch, by running a CIFAR-10 image classification example on a Compute Canada system. The hands-on labs will focus on how to run a sample case with different hardware resources, such as a single GPU, multiple GPUs, and multi-GPU nodes, while basic TensorFlow/PyTorch programming is briefly introduced.

Instructors: Jinhui Qin, SHARCNET / Western University (day 1) and Isaac Ye, SHARCNET / York University (day 2)

Prerequisites: Some programming experience in Python and a background in statistics would be helpful but are not mandatory.

Access is restricted to Digital Research Alliance of Canada (formerly Compute Canada) authenticated users only: No

MPI is a standardized and portable message-passing interface for parallel computing clusters. The standard defines the syntax and semantics of a core of library routines useful to a wide range of scientific programs in C/C++ and Fortran. MPI's goals are high performance, scalability, and portability. It is the dominant parallel programming model used in high-performance computing today.

In this two-day session, through lectures interspersed with hands-on labs, the students will learn the basics of MPI programming. Examples and exercises will be based on parallelization of common computing problems.

Instructors: Jemmy Hu, SHARCNET / University of Waterloo, and Ge Baolai, SHARCNET / Western University

Prerequisites: Basic C/C++ and/or Fortran knowledge; experience editing and compiling code in a Linux environment.


Access is restricted to Digital Research Alliance of Canada (formerly Compute Canada) authenticated users only: No

Bioinformatics is a multidisciplinary field that develops methods for turning so-called biological big data into knowledge. Although most problems in biology are not as embarrassingly parallelizable as the physics codes that HPC systems are usually designed for, this has started to change in recent years. Metagenomics is an emerging field in bioinformatics that uses sequencing technologies to investigate microbes inhabiting oceans, soils, the human body, etc.

In this two-day hands-on session, a typical metagenomics pipeline will be explored to introduce common tools used in bioinformatics analysis and how they can be utilized in an HPC environment.

Instructors:

  • DAY ONE: Jose Sergio Hleap, SHARCNET, University of Guelph
  • DAY TWO: Armin Sobhani, SHARCNET, Ontario Tech University

Prerequisites: Basic Unix competence and basic knowledge of bioinformatics applications are required.


Access is restricted to Digital Research Alliance of Canada (formerly Compute Canada) authenticated users only: No

The majority of production work on Compute Canada systems such as Graham is dispatched to the compute nodes via the Slurm scheduler. The Compute Canada General Purpose (GP) systems (Beluga, Cedar, and Graham) are heterogeneous: they have various node types with hardware options that differ in core count, memory availability, and GPUs, not to mention several model generations. The systems were also designed to accommodate highly varied workloads from the Canadian academic research community, including various job run times, CPU, GPU and memory sizes, as well as interactive workloads. The Slurm scheduler configuration is designed to maximize fairness, responsiveness and overall utilization. Beyond knowing how to request the appropriate resources for a job, understanding the system-specific configurations can significantly reduce the time to result on these systems by minimizing wait times in the queue. After covering job submission techniques, this course provides information about monitoring jobs.

Access is restricted to Digital Research Alliance of Canada (formerly Compute Canada) authenticated users only: No

The focus of this course is on Graham usage other than batch computing via the scheduler. Although the majority of production work on the Compute Canada clusters is executed as unsupervised compute jobs dispatched via the Slurm scheduler, large-scale research projects involve several other computational tasks that precede and follow the batch production work. These tasks include file management, software development and prototyping, performance profiling, data sharing and visualization. This course demonstrates common ways of accessing the cluster interactively to carry out the various pre- and post-production tasks associated with computationally demanding research projects.

Access is restricted to Digital Research Alliance of Canada (formerly Compute Canada) authenticated users only: No