This page is no longer maintained.

All information and guides related to AI-LAB have been moved to https://hpc.aau.dk/ai-lab/. Please visit the new site for the most up-to-date resources on AI-LAB.


CPU, GPU, and memory allocation

To run jobs effectively, it's important to understand the platform's hardware configuration and to set appropriate parameters for resource allocation. Here is a guide to setting Slurm parameters for the hardware available on the platform:

Memory per job

Use --mem to set the amount of memory allocated to the whole job, e.g. --mem=24G. Request enough for your data and framework overhead, but no more than the job actually needs.

CPUs per task

Use --cpus-per-task to set the number of CPU cores allocated to each task, e.g. --cpus-per-task=4. Match it to the number of threads or worker processes your program can actually use. A minimal job script combining these two parameters is sketched below.
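As a minimal sketch, a job script might set these two parameters as follows (the values 24G and 4, the job name, and train.py are placeholders, not platform recommendations):

```bash
#!/bin/bash
#SBATCH --job-name=my-job        # hypothetical job name
#SBATCH --mem=24G                # memory for the whole job (placeholder value)
#SBATCH --cpus-per-task=4        # CPU cores per task (placeholder value)

python train.py                  # placeholder for your own program
```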

GPUs per job

Request only the number of GPUs your job can effectively utilize. Over-requesting leads to underutilized resources and longer queue times for everyone. Some applications also need adjustments to scale effectively across multiple GPUs; a sketch of a multi-GPU PyTorch launch follows below.
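As a hedged sketch, a job script that requests two GPUs and launches a PyTorch program across them with torchrun (which ships with PyTorch and spawns one worker process per GPU) might look like this; the GPU count, core count, and train.py are assumptions:

```bash
#!/bin/bash
#SBATCH --job-name=multi-gpu     # hypothetical job name
#SBATCH --gres=gpu:2             # request 2 GPUs (placeholder count)
#SBATCH --ntasks=1               # torchrun spawns the per-GPU processes itself
#SBATCH --cpus-per-task=8        # placeholder; enough cores to feed both GPUs

# train.py is a placeholder for a script that uses
# torch.distributed / DistributedDataParallel.
torchrun --nproc_per_node=2 train.py
```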

Monitor GPU usage

You can use the NVIDIA GPU monitoring command nvidia-smi to output GPU usage information, as sketched below.
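For example, once you have a shell on the GPU node your job runs on:

```bash
# One-off snapshot of the GPUs on the current node
nvidia-smi

# Compact utilization/memory query, repeated every 5 seconds
nvidia-smi --query-gpu=index,utilization.gpu,memory.used,memory.total \
           --format=csv -l 5
```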

Number of tasks to be run

--ntasks specifies the number of tasks to run. Each task typically corresponds to an independent execution of your program or script. If your job can be parallelized across multiple tasks, set this accordingly, e.g. --ntasks=4 to run 4 tasks in parallel. Each task receives its own allocation of resources (CPU, memory, etc.) according to parameters such as --cpus-per-task, --mem, and --gres=gpu. A minimal sketch follows.
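A minimal sketch of a job script that runs 4 parallel tasks (the resource values and worker.py are placeholders):

```bash
#!/bin/bash
#SBATCH --ntasks=4               # run 4 parallel tasks
#SBATCH --cpus-per-task=2        # 2 cores per task (placeholder value)
#SBATCH --mem=16G                # memory for the job (placeholder value)

# srun launches one instance of the program per task
srun python worker.py            # worker.py is a placeholder script
```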