
Cotainr tool should make it easier to use LUMI's superpowers for research

DeiC's HPC team has developed a container software tool that initially aims to make it more accessible for those running AI/ML in Python to use the LUMI supercomputer.
20/09/2023 09:09
Image: Containers on green background. Source: Colourbox

If you've ever tried to work on an HPC facility as a researcher, you know that it can be quite a task just to get started. Suddenly, you need to be familiar with a Linux setup, command lines, package managers, etc., to a degree you may not have experienced before. For many, this can entail a lot of extra work because it's not something most people are already well-versed in, even if they are experienced in the software development process.

At DeiC, we want to address this issue: it is important that as many people as possible can access our HPC facilities, and IT challenges should not determine who gets to use supercomputer resources. That is why we have developed a piece of software called "Cotainr." The Cotainr tool lets you run your software directly on the HPC facility from a single file, eliminating the need to download and install software on your laptop.

"At its core, it's about the challenges of making your software available on HPC facilities. You have to download and install software on your own computer. But there's an alternative, which involves using these containers, where you don't install the software, but instead, the software you need is in a file called a 'container.' You then use a special program on the HPC facility to execute the software directly from that file, without having to install all the software locally," explains Christian Schou Oxvig, Special Consultant at DeiC.

We want to help researchers get started more easily in using our HPC facilities, especially the LUMI supercomputer. In simple terms, this means we want to move away from the idea that you have to acquire a lot of detailed knowledge about Linux, supercomputers, etc., just to use our facilities. The goal is that you can do something that closely resembles what you already know as a researcher.

The first use case is Python environments with AI and ML on LUMI

"Initially, we have developed a Cotainr solution for LUMI, so you can make your Python/AI/ML software easily accessible. That's where we start because we know there can be challenges here," says Christian Schou Oxvig, Special Consultant at DeiC. "If you work in Python, for example, with AI/ML, data science, or SA analysis, and you already use the tools called Conda and Pip to manage your Python environments, you will be able to quickly transition to LUMI and run from our Cotainr," Christian continues.

Kaare Mikkelsen from AU will use Cotainr with PyTorch for training deep learning models on LUMI

One of the first people to start using the Cotainr solution on LUMI is Assistant Professor Kaare Mikkelsen from Aarhus University (AU), who will use Cotainr in his teaching and in connection with some bachelor's and master's projects. When we showed him what it takes to access LUMI's superpowers without Cotainr, and then how he can access LUMI with Cotainr, he was very excited. Kaare works with AI/ML in a Python framework called PyTorch, and he wants to work in this environment on LUMI's GPUs. The key is to make the PyTorch software compatible with LUMI, and that is precisely what Cotainr helps him do. He can now easily set up the specific software environment he needs: he takes the existing setup of Python packages and libraries he uses on his laptop, describes it in a file, and then goes to LUMI and says, "Create this environment for me." Kaare Mikkelsen can then use that environment to train his deep learning neural networks on LUMI's GPUs. Additionally, he can quickly create various environments tailored to individual projects.
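To give a rough idea of what "describing the environment in a file" can look like, here is a minimal sketch in Python. It only assembles and prints the commands; it does not run them. The environment name, package list, file names, and the two helper functions are hypothetical placeholders chosen for illustration. The `cotainr build` options follow the pattern shown in the Cotainr documentation, and running the container with `singularity exec` is an assumption about the container runtime available on the system, so check the documentation for your setup before relying on either.

```python
import shlex

# Hypothetical Conda environment description. The name and packages are
# placeholders, not a tested configuration for LUMI.
ENV_YAML = """\
name: my_project_env
channels:
  - conda-forge
dependencies:
  - python=3.11
  - pip
  - pip:
      - numpy
"""


def cotainr_build_command(image: str, env_file: str, system: str = "lumi-g") -> list[str]:
    """Assemble a `cotainr build` call that turns an environment file into a container.

    The option names follow the Cotainr documentation; verify them against
    the version installed on your system.
    """
    return ["cotainr", "build", image, f"--system={system}", f"--conda-env={env_file}"]


def container_run_command(image: str, script: str) -> list[str]:
    """Assemble a command that runs a Python script inside the container.

    Assumes a Singularity-compatible runtime is available on the HPC system.
    """
    return ["singularity", "exec", image, "python3", script]


if __name__ == "__main__":
    # Save ENV_YAML as my_project_env.yml, then run the printed commands
    # on the HPC system.
    print(ENV_YAML)
    print(shlex.join(cotainr_build_command("my_project.sif", "my_project_env.yml")))
    print(shlex.join(container_run_command("my_project.sif", "train.py")))
```

The point of the sketch is the workflow rather than the exact package list: one small text file describes the environment, one command builds the container from it, and one command runs the research code inside that container.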

Cotainr tool is part of a larger vision in the HPC team

We really want to promote the use of HPC facilities, both in Denmark and internationally. This is part of a larger vision we have in the HPC team. Under our model, users are allocated computing time on many different facilities, so it is crucial that they can easily move the code they are working on and the simulations they are running from one facility to another.

"For now, we're focusing on getting Cotainr out on LUMI, and as we see the need, more use cases will emerge. Our clear goal is to spread the idea to as many systems as possible to simplify access to our HPC facilities," says Eske Christiansen, Chief of HPC at DeiC.

We understand that it's a significant task, but we believe it's essential. That's why we at DeiC are taking the lead in this effort, but anyone can contribute because the solution is free, open source, and available on GitHub.

More information

• Contact: Eske Christiansen, eske.christiansen@deic.dk

• Code on GitHub: https://github.com/DeiC-HPC/cotainr

• Documentation on Read the Docs: https://cotainr.readthedocs.io