ÚFAL Grid Engine (LRC)
LRC (Linguistic Research Cluster) is the name of ÚFAL's computational grid/cluster. The cluster is built on top of SLURM and uses Lustre for data storage.
The following partitions (queues) are currently available for computing:
Node list by partitions
The naming convention is straightforward for CPU nodes: nodes in each group are numbered. For GPU nodes the format is [t]dll-XgpuN, where X is the total number of GPUs in the node and N enumerates the nodes with the given configuration.
The prefix t is for nodes at Troja and dll stands for Deep Learning Laboratory.
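The GPU naming scheme can be taken apart mechanically. The following is a minimal sketch, not an official tool: the function name parse_gpu_node is hypothetical, and labelling names without the t prefix as site "ms" is an assumption based on the gpu-ms partition name.

```shell
#!/bin/bash
# Split a GPU node name of the form [t]dll-XgpuN into its parts.
# Assumption: names without the "t" prefix belong to the "ms" site,
# as suggested by the gpu-ms partition name.
parse_gpu_node() {
    local name=$1
    if [[ $name =~ ^(t?)dll-([0-9]+)gpu([0-9]+)$ ]]; then
        local site=ms
        [[ -n ${BASH_REMATCH[1]} ]] && site=troja
        echo "site=$site gpus=${BASH_REMATCH[2]} index=${BASH_REMATCH[3]}"
    else
        echo "not a GPU node name: $name" >&2
        return 1
    fi
}

parse_gpu_node tdll-8gpu3   # a Troja node with 8 GPUs
parse_gpu_node dll-10gpu1   # a node with 10 GPUs
```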
cpu-troja
Node name | Thread count | Socket:Core:Thread | RAM (MB) |
achilles[1-8] | 32 | 2:8:2 | 128810 |
hector[1-8] | 32 | 2:8:2 | 128810 |
helena[1-8] | 32 | 2:8:2 | 128811 |
paris[1-8] | 32 | 2:8:2 | 128810 |
hyperion[2-8] | 64 | 2:16:2 | 257667 |
cpu-ms
Node name | Thread count | Socket:Core:Thread | RAM (MB) |
iridium | 16 | 2:4:2 | 515977 |
orion[1-8] | 40 | 2:10:2 | 128799 |
gpu-troja
Node name | Thread count | Socket:Core:Thread | RAM (MB) | Features | GPU type |
tdll-3gpu[1-4] | 64 | 2:16:2 | 128642 | gpuram48G gpu_cc8.6 | NVIDIA A40 |
tdll-8gpu[1,2] | 64 | 2:16:2 | 257666 | gpuram40G gpu_cc8.0 | NVIDIA A100 |
tdll-8gpu[3-7] | 32 | 2:8:2 | 253725 | gpuram16G gpu_cc7.5 | NVIDIA Quadro P5000 |
gpu-ms
Node name | Thread count | Socket:Core:Thread | RAM (MB) | Features | GPU type |
dll-3gpu[1-5] | 64 | 2:16:2 | 128642 | gpuram48G gpu_cc8.6 | NVIDIA A40 |
dll-4gpu[1,2] | 40 | 2:10:2 | 187978 | gpuram24G gpu_cc8.6 | NVIDIA RTX 3090 |
dll-8gpu[1,2] | 64 | 2:16:2 | 515838 | gpuram24G gpu_cc8.0 | NVIDIA A30 |
dll-8gpu[3,4] | 32 | 2:8:2 | 257830 | gpuram16G gpu_cc8.6 | NVIDIA RTX A4000 |
dll-8gpu[5,6] | 40 | 2:10:2 | 385595 | gpuram16G gpu_cc7.5 | NVIDIA Quadro RTX 5000 |
dll-10gpu1 | 32 | 2:8:2 | 257830 | gpuram16G gpu_cc8.6 | NVIDIA RTX A4000 |
dll-10gpu[2,3] | 32 | 2:8:2 | 257830 | gpuram11G gpu_cc6.1 | NVIDIA GeForce GTX 1080 Ti |
Submit nodes
In order to submit a job you need to log in to one of the head nodes:
lrc1.ufal.hide.ms.mff.cuni.cz
lrc2.ufal.hide.ms.mff.cuni.cz
sol1.ufal.hide.ms.mff.cuni.cz
sol2.ufal.hide.ms.mff.cuni.cz
sol3.ufal.hide.ms.mff.cuni.cz
sol4.ufal.hide.ms.mff.cuni.cz
Basic usage
Batch mode
The core idea is that you write a batch script containing the commands you wish to run as well as a list of SBATCH directives specifying the resources or parameters that you need for your job.
Then the script is submitted to the cluster with:
sbatch myJobScript.sh
Here is a simple working example:
#!/bin/bash
#SBATCH -J helloWorld        # name of job
#SBATCH -p cpu-troja         # name of partition or queue (default=cpu-troja)
#SBATCH -o helloWorld.out    # name of output file for this submission script
#SBATCH -e helloWorld.err    # name of error file for this submission script
# run my job (some executable)
sleep 5
echo "Hello I am running on cluster!"
After submitting this simple script you should end up with two files (helloWorld.out and helloWorld.err) in the directory where you called the sbatch command.
Here is a list of other useful SBATCH directives:
#SBATCH -D /some/path/               # change directory before executing the job
#SBATCH -N 2                         # number of nodes (default 1)
#SBATCH --nodelist=node1,node2...    # execute on *all* the specified nodes (and possibly more)
#SBATCH --cpus-per-task=4            # number of cores/threads per task (default 1)
#SBATCH --gres=gpu:1                 # number of GPUs to request (default 0)
#SBATCH --mem=10G                    # request 10 gigabytes memory (per node, default depends on node)
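Several of these directives can be combined in one script. The sketch below is only an illustration: the job name multiOpt and the helper report_allocation are hypothetical. It relies on the environment variables SLURM_JOB_ID, SLURM_CPUS_PER_TASK and SLURM_JOB_NODELIST that SLURM sets inside a job, and the fallback values only serve to keep the script runnable outside of SLURM.

```shell
#!/bin/bash
#SBATCH -J multiOpt            # hypothetical job name
#SBATCH -p cpu-troja
#SBATCH -N 1
#SBATCH --cpus-per-task=4
#SBATCH --mem=10G
#SBATCH -o multiOpt.out
#SBATCH -e multiOpt.err

# Report what was allocated. SLURM sets these variables inside a job;
# the fallbacks keep the script runnable outside of SLURM for testing.
report_allocation() {
    echo "job=${SLURM_JOB_ID:-none} cpus=${SLURM_CPUS_PER_TASK:-1} nodes=${SLURM_JOB_NODELIST:-local}"
}

report_allocation
```

Printing the allocation at the start of a job makes it easier to check afterwards, from the output file alone, that the job got the resources it asked for.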
If needed, you can have SLURM notify you by email:
#SBATCH --mail-type=begin        # send email when job begins
#SBATCH --mail-type=end          # send email when job ends
#SBATCH --mail-type=fail         # send email if job fails
#SBATCH --mail-user=<YourUFALEmailAccount>
As usual, the complete set of options can be found by typing:
man sbatch
Rudolf's template
The main point is for log files to have the job name and job id in them automatically.
#!/bin/bash
#SBATCH -J RuRjob
#SBATCH -o %x.%j.out
#SBATCH -e %x.%j.err
#SBATCH -p gpu-troja
#SBATCH --gres=gpu:1
#SBATCH --mem=16G
#SBATCH --constraint="gpuram16G|gpuram24G"

# Print each command to STDERR before executing (expanded), prefixed by "+ "
set -o xtrace
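In the template, %x stands for the job name and %j for the job id. The helper below only mimics that substitution to show what file names result. It is a hypothetical illustration (expand_log_name is not part of SLURM, which supports more patterns than these two), and the job id 12345 is made up.

```shell
#!/bin/bash
# Mimic how sbatch expands %x (job name) and %j (job id) in -o/-e patterns.
# Illustration only: sbatch itself performs this expansion.
expand_log_name() {
    local pattern=$1 job_name=$2 job_id=$3
    printf '%s\n' "$pattern" | sed -e "s/%x/$job_name/g" -e "s/%j/$job_id/g"
}

expand_log_name "%x.%j.out" RuRjob 12345
expand_log_name "%x.%j.err" RuRjob 12345
```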
Running jobs
In order to inspect all running jobs on the cluster use:
squeue
Filter only jobs of user linguist:
squeue -u linguist
Filter only jobs on partition gpu-ms:
squeue -p gpu-ms
Filter jobs in a specific state (see man squeue for the list of valid job states):
squeue -t RUNNING
Filter jobs running on a specific node:
squeue -w dll-3gpu1
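The output of squeue can also be summarised with standard tools. The sketch below counts jobs per state. The helper count_states is hypothetical, and it assumes the input holds one state name per line, which is what squeue -h -o "%T" prints (-h drops the header, %T is the job state in extended form). Sample input is used here so the snippet runs without a cluster.

```shell
#!/bin/bash
# Read one job state per line on stdin and print "STATE=count" pairs.
count_states() {
    sort | uniq -c | awk '{ print $2 "=" $1 }'
}

# Sample input standing in for: squeue -h -u "$USER" -o "%T"
printf 'RUNNING\nPENDING\nRUNNING\n' | count_states
```

On a head node the same function could be fed directly, e.g. squeue -h -u linguist -o "%T" | count_states.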
Cluster info
The command sinfo can give you useful information about nodes available in the cluster. Here is a short list of some examples:
List available partitions (queues). The default partition is marked with *:
sinfo
List detailed info about nodes:
sinfo -l -N
List nodes with some custom format info:
sinfo -N -o "%N %P %.11T %.15f"
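A custom sinfo format is convenient for scripting, for example to find nodes that advertise a given feature. This is a sketch under assumptions: nodes_with_feature is a hypothetical helper, and it expects "node features" lines with comma-separated features, as sinfo -N -h -o "%N %f" is assumed to print them. The sample lines are taken from the node tables on this page.

```shell
#!/bin/bash
# Print the names of nodes whose comma-separated feature list contains $1.
# Input: "<node> <features>" lines, e.g. from: sinfo -N -h -o "%N %f"
nodes_with_feature() {
    awk -v want="$1" '{
        n = split($2, feats, ",")
        for (i = 1; i <= n; i++)
            if (feats[i] == want) { print $1; break }
    }'
}

# Sample input standing in for real sinfo output
printf '%s\n' \
    'dll-3gpu1 gpuram48G,gpu_cc8.6' \
    'dll-4gpu1 gpuram24G,gpu_cc8.6' \
    'tdll-3gpu1 gpuram48G,gpu_cc8.6' | nodes_with_feature gpuram48G
```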
CPU core allocation
The minimal computing resource in SLURM is one CPU core. However, the CPU count advertised by SLURM corresponds to the number of CPU threads.
If you ask for 1 CPU core with --cpus-per-task=1, SLURM will allocate all threads of 1 CPU core.
For example, dll-8gpu1 will allocate 2 threads since its ThreadsPerCore=2:
$ scontrol show node dll-8gpu1
NodeName=dll-8gpu1 Arch=x86_64 CoresPerSocket=16
   CPUAlloc=0 CPUTot=64 CPULoad=0.05            // CPUAlloc - allocated threads, CPUTot - total threads
   AvailableFeatures=gpuram24G
   ActiveFeatures=gpuram24G
   Gres=gpu:nvidia_a30:8(S:0-1)
   NodeAddr=10.10.24.63 NodeHostName=dll-8gpu1 Version=21.08.8-2
   OS=Linux 5.15.35-1-pve #1 SMP PVE 5.15.35-3 (Wed, 11 May 2022 07:57:51 +0200)
   RealMemory=515838 AllocMem=0 FreeMem=507650 Sockets=2 Boards=1
   CoreSpecCount=1 CPUSpecList=62-63            // CoreSpecCount - cores reserved for OS, CPUSpecList - list of threads reserved for system
   State=IDLE ThreadsPerCore=2 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A   // ThreadsPerCore - count of threads for 1 CPU core
   Partitions=gpu-ms
   BootTime=2022-09-01T14:07:50 SlurmdStartTime=2022-09-02T13:54:05
   LastBusyTime=2022-10-02T20:17:09
   CfgTRES=cpu=64,mem=515838M,billing=64
   AllocTRES=
   CapWatts=n/a
   CurrentWatts=0 AveWatts=0
   ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s
In the example above, comments mark all the lines relevant to CPU allocation.
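The effect of whole-core allocation on the thread count can be worked out with a little arithmetic. The sketch below assumes that the requested CPU count is rounded up to a multiple of ThreadsPerCore. The page states this only for the case of one requested CPU, so the general rounding rule is an assumption, and allocated_threads is a hypothetical helper.

```shell
#!/bin/bash
# Threads allocated when whole cores are handed out: round the requested
# CPU (thread) count up to a multiple of ThreadsPerCore.
# Assumption: this rounding holds for any request, not only for 1 CPU.
allocated_threads() {
    local requested=$1 threads_per_core=$2
    echo $(( (requested + threads_per_core - 1) / threads_per_core * threads_per_core ))
}

allocated_threads 1 2   # --cpus-per-task=1 on dll-8gpu1 (ThreadsPerCore=2)
allocated_threads 3 2   # an odd request is rounded up to whole cores
```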
Interactive mode
This mode can be useful for testing. You should be using batch mode for any serious computation.
You can use the srun command to get an interactive shell on an arbitrary node from the default partition (queue):
srun --pty bash
There are many more parameters available to use. For example:
To get an interactive CPU job with 64GB of reserved memory:
srun -p cpu-troja,cpu-ms --mem=64G --pty bash
-p cpu-troja,cpu-ms explicitly requires a node from one of these two partitions. If not specified, SLURM will use the default partition.
--mem=64G requires 64G of memory for the job
To get an interactive job with a single GPU of any kind:
srun -p gpu-troja,gpu-ms --gres=gpu:1 --pty bash
-p gpu-troja,gpu-ms requires only nodes from these two partitions
--gres=gpu:1 requires 1 GPU
srun -p gpu-troja,gpu-ms --nodelist=tdll-3gpu1 --mem=64G --gres=gpu:2 --pty bash
-p gpu-troja,gpu-ms requires only nodes from these two partitions
--nodelist=tdll-3gpu1 explicitly requires one specific node
Note that e.g. --nodelist=tdll-3gpu[1-4] would run the job on all four machines tdll-3gpu[1-4]. The documentation says “The job will contain all of these hosts and possibly additional hosts as needed to satisfy resource requirements.” I am not aware of any simple way to specify that any of the listed nodes can be used, i.e. an equivalent of SGE -q '*@hector[14]'.
--gres=gpu:2 requires 2 GPUs
srun -p gpu-troja --constraint="gpuram48G|gpuram40G" --mem=64G --gres=gpu:2 --pty bash
--constraint="gpuram48G|gpuram40G" only considers nodes that have either the gpuram48G or the gpuram40G feature defined
Delete Job
scancel <job_id>
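To cancel a job from a script, its id has to be captured at submission time. A minimal sketch, assuming sbatch confirms a submission with a line of the form "Submitted batch job <id>". The helper job_id_from_sbatch is hypothetical and the job id 12345 is made up.

```shell
#!/bin/bash
# Extract the numeric job id from sbatch's confirmation line,
# e.g. "Submitted batch job 12345".
job_id_from_sbatch() {
    awk '/^Submitted batch job/ { print $4 }'
}

# Sample line standing in for real sbatch output
echo "Submitted batch job 12345" | job_id_from_sbatch

# On a head node this could be combined as (not run here):
#   jid=$(sbatch myJobScript.sh | job_id_from_sbatch)
#   scancel "$jid"
```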
To see all the available options type:
man srun
Basic commands on cluster machines
lspci
  # is any such hardware there?
nvidia-smi
  # more details, incl. running processes on the GPU
  # nvidia-* are typically located in /usr/bin
watch nvidia-smi
  # For monitoring GPU activity in a separate terminal (thanks to Jindrich Libovicky for this!)
  # You can also use nvidia-smi -l TIME
nvcc --version
  # this should tell the CUDA version
  # nvcc is typically installed in /usr/local/cuda/bin/
theano-test
  # does it actually do anything useful? :-)
  # theano-* are typically located in /usr/local/bin/
/usr/local/cuda/samples/1_Utilities/deviceQuery/deviceQuery
  # shows CUDA capability etc.
ssh dll1; ~popel/bin/gpu_allocations
  # who occupies which card on a given machine
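For scripted checks, nvidia-smi can print selected fields as CSV. The sketch below lists GPUs whose used memory is below a threshold. idle_gpus is a hypothetical helper and the 100 MiB default threshold is arbitrary. It expects "index, used MiB" lines as printed by nvidia-smi --query-gpu=index,memory.used --format=csv,noheader,nounits. Low memory use does not prove that SLURM has not allocated the card to someone, so treat the result as a hint only.

```shell
#!/bin/bash
# Print indices of GPUs whose used memory (MiB) is below a threshold.
# Input: "index, memory.used" lines, e.g. from
#   nvidia-smi --query-gpu=index,memory.used --format=csv,noheader,nounits
idle_gpus() {
    local threshold=${1:-100}
    awk -F', *' -v max="$threshold" '$2 + 0 < max { print $1 }'
}

# Sample input standing in for real nvidia-smi output
printf '%s\n' '0, 15200' '1, 3' '2, 0' | idle_gpus 100
```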