The 'Linux Container' workshop at ISC 2019 is called: 5th Annual High Performance Container Workshop

It will take place as part of the International Supercomputing Conference in Frankfurt on June 20th from 9AM to 6PM at the Marriott Hotel.

Abstract

Linux Containers continue to gain momentum within data centers all over the world. They are able to benefit legacy infrastructures by leveraging the lower overhead compared to traditional, hypervisor-based virtualization. But there is more to Linux Containers, which this workshop will explore. Their portability, reproducibility and distribution capabilities outclass all prior technologies and disrupt former monolithic architectures, due to sub-second life cycles and self-service provisioning.

This workshop will outline the current state of Linux Containers in HPC/AI, which challenges are hindering adoption in HPC/BigData, and how containers can foster improvements when applied to the fields of HPC, Big Data and AI in the mid and long term. By dissecting the different layers of the container ecosystem (runtime, supervision, engine, orchestration, distribution, security, scalability), this workshop will provide a holistic, state-of-the-container overview, so that participants can make informed decisions on how to start, improve or continue their container adoption.

(Draft) Agenda

The first half of the day will be spent introducing the speakers, providing an overview and discussing topics that are not exclusively HPC-specific but are fundamentals that also matter in non-HPC use cases: Which runtime fits my use case? How do I build my container image? How do I distribute the artefacts? Depending on my use case, discipline or vertical, what should I focus on and what is less important?

Intro (09:00 - 10:00)

# Start Title Speaker Company
0 09:00 Welcome Christian Kniep QNIB Solutions
1 09:05 Intro UberCloud Burak Yenier UberCloud
2 09:10 Intro NVIDIA CJ Newburn NVIDIA
3 09:15 Intro Sylabs Eduardo/Michael Sylabs
4 09:20 Intro AWS Arthur Petitpierre AWS
5 09:25 Intro Mellanox Dror Goldenberg Mellanox
6 09:30 Intro RedHat Valentin Rothberg RedHat
7 09:35
Workshop Overview, Segments and Personas
Besides describing the workshop, this slot introduces the 'Personas' who will take part in the panel discussions, each with a narrow view of a particular use case in mind (SME, Large/Small Academia & Research Sites, Ops, Infrastructure).
Christian Kniep QNIB Solutions

Runtime (10:00 - 11:00)

# Start Title Speaker Company
0 10:00 Introduction and Scope Christian Kniep
1 10:05 Current State of root-less dockerd Akihiro Suda
2 10:10 The podman runtime Valentin Rothberg
3 10:15 The Singularity runtime Eduardo/Michael
4 10:20 The SARUS runtime Lucas Benedicic
10:25 PANEL: How many namespaces do we need?
PANEL: Trust - How to validate what is started?
PANEL: Runtime hooks: Cure or curse?
PANEL: CVE-2019-5736: Container runtime breakout
PANEL: Q&A
11:00 Coffee Break

Build (11:30 - 12:20)

# Start Title Speaker Company
0 11:30
Introduction and Scope
How we build (CI/CD vs interactive) and why we do so. Portability, reproducibility (CI/CD) vs optimization (interactive).
Christian Kniep
1 11:35 Rootless build with BuildKit Akihiro Suda
2 11:40 Buildah, a tool that facilitates building OCI images Valentin Rothberg
3 11:45 Singularity build Eduardo/Michael
4 11:50
Optimize for hardware again!
Because containers use the kernel as the abstraction layer, images need to be compatible with all target systems. Hardware optimizations, which are key to performance, are therefore hard to come by. This talk will explain how to craft Dockerfiles and build processes to allow for them again.
Christian Kniep
5 11:55 Tools: NVIDIA HPC Container Maker CJ Newburn
6 12:00 Build Tools like SPACK/EasyBuild Massimiliano Culpo
12:05 Panel: Q&A

Distribute (12:20 - 13:00)

# Start Title Speaker Company
0 12:20
Introduction and Scope
The audience should get the gist that distribution is meant to provide a scalable, reliable transport for shipping the application. A challenge for the runtime is how to reuse images and container file systems within a clustered setting.
Christian Kniep
1 12:25
OCI Image Spec
Principles behind the OCI Image Spec and how it is leveraged.
Akihiro Suda
2 12:30 Singularity Image Format Eduardo/Michael
3 12:35 Skopeo Distribution Tool Valentin Rothberg
4 12:40 Hardware Optimized Images via MetaHub Registry Proxy Christian Kniep
12:45
PANEL: How to optimise storage for large-scale clusters?
When running an (OCI) image on a large number of nodes, each node downloads the image and creates a snapshot in which to start the container.
HPC runtimes tend to create a snapshot that resides on a shared file system. This slot will discuss the benefits and drawbacks.
PANEL: Q&A
13:00 Lunch Break

Orchestration/Scheduling (14:00 - 15:15)

# Start Title Speaker Company
0 14:00 Introduction and Scope Christian Kniep
1 14:05
Simple Orchestration with SWARM
The simplest orchestration out there is most likely SWARM. It has a simple model that explains what needs to be done to run containers in a clustered environment. SWARM can be seen as a simple example of scheduling with the developer in mind.
Abdulrahman Azab
2 14:10
Recap on Kubernetes
After the brief intro to orchestration via SWARM, this slot will explain how Kubernetes extends this model to provide a more resilient and extensible system.
Daniel Gruber
3 14:15
Lustre within Kubernetes
Extending the Kubernetes intro even further, Arthur will explain how AWS puts Lustre within Kubernetes and makes it scale.
Arthur Petitpierre
4 14:20
Using K8s operators for containerized RDMA workloads
RDMA is a well-known high-performance networking interface for low-latency, low-overhead communication. RDMA-accelerated Kubernetes clusters are set up using the standard device plugin and CNI interfaces for InfiniBand or Ethernet. Compute nodes join the Kubernetes cluster dynamically. It is desirable to improve the user experience through automated configuration and deployment. In this talk we will discuss how Kubernetes operators help to automate, deploy and upgrade infrastructure software components for faster node availability.
Dror Goldenberg
5 14:30 Slurm Operator for Kubernetes Eduardo/Michael
6 14:35 Nextflow to model (bioinformatic) workloads Paolo Di Tommaso
7 14:40 AWS Batch Arthur Petitpierre
14:45 PANEL: Q&A

Infrastructure (15:15 - 15:30)

# Start Title Speaker Company
0 15:15 Introduction and Scope Christian Kniep
1 15:20 OpenStack Update and Direction Martial Michel
2 15:25 Dynamic HPC in a cloud environment Arthur Petitpierre

HPC Specific / Distributed Workloads (15:30 - 16:00)

# Start Title Speaker Company
0 15:35 Introduction and Scope Christian Kniep
1 15:40 How AWS blends fast POSIX (Lustre) and object stores (S3) Arthur Petitpierre
2 15:45 RDMA Device Isolation Dror Goldenberg
3 15:50 PANEL: MPI Trends (init, orchestration)
16:00 Coffee Break

Use-Cases/Conclusions/Discussion (16:30 - 18:00)

# Start Title Speaker Company
1 17:00
RDMA-GPU use-case
Heterogeneous cluster architectures are being used for HPC, data science, scientific, ML/DL/AI and other applications. Such platforms rely on high-speed, low-latency, smart interconnects to work optimally. RDMA, along with GPUDirect, has become a de facto networking technology for accelerating CPU-to-CPU, CPU-to-GPU and GPU-to-GPU communication. Containerizing such applications poses challenges in configuring, deploying and orchestrating the system devices. In this session, we will discuss these challenges and how to enable containerized applications using GPUDirect and RDMA in a Kubernetes cluster.
Dror Goldenberg
2 17:05 Mellanox Containerization Journey Dror Goldenberg
3 17:10 Looking back on 5y of containerization Burak Yenier
4 17:20 NERSC: Looking back Shane Canon
5 17:30 NVIDIA's journey with containers CJ Newburn
17:40 PANEL: What did we miss?
PANEL: Community to tap into!
PANEL: The good, the bad, the missing
PANEL: Bring the Glaskugel (crystal ball) - what does a system look like in 2021?
PANEL: Q&A
18:00 Workshop Ending

Previous ISC Workshops

General Chair

Christian Kniep, Docker Inc.

Program and Publications Chairs:

  • Abdulrahman Azab, University of Oslo & PRACE
  • Shane Canon, NERSC

Participation

We encourage everyone to reach out and suggest content.

The Call for Papers can be found here