ÚFAL Grid Engine (LRC)

LRC (Linguistic Research Cluster) is the name of ÚFAL's computational grid/cluster.

Basic usage

Batch mode

The core idea is that you write a batch script containing the commands you wish to run as well as a list of SBATCH directives specifying the resources or parameters that you need for your job.
Then the script is submitted to the cluster with:

sbatch myJobScript.sh

Here is a simple working example:

#!/bin/bash
#SBATCH -J helloWorld					  # name of job
#SBATCH -p cpu-troja					  # name of partition or queue
#SBATCH -o helloWorld.out				  # name of output file for this submission script
#SBATCH -e helloWorld.err				  # name of error file for this submission script

# run my job (some executable)
sleep 5
echo "Hello I am running on cluster!"

After submitting this script, you should end up with two files (helloWorld.out and helloWorld.err) in the directory from which you called the sbatch command.
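
When submission is scripted, it helps to capture the ID of the new job so that it can be queried later. The following is a minimal sketch, assuming the script name myJobScript.sh used above; submit_job is a hypothetical helper, not part of Slurm. It relies on the --parsable option of sbatch, which prints the job ID instead of the usual confirmation message.

```shell
# Hypothetical helper: submit a batch script and print only the job ID.
submit_job() {
    local script="$1"
    local out
    # --parsable prints the job ID, optionally followed by ";<cluster name>"
    out="$(sbatch --parsable "$script")" || return 1
    # Keep only the part before the first ";"
    echo "${out%%;*}"
}

# Only try to submit when Slurm is actually available on this machine
if command -v sbatch >/dev/null 2>&1; then
    job_id="$(submit_job myJobScript.sh)" && echo "Submitted job ${job_id}"
fi
```

The returned ID can then be passed to squeue -j to check on that one job.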

Here is the list of other useful SBATCH directives:

#SBATCH -D /some/path/                        # change directory before executing the job   
#SBATCH -N 2                                  # number of nodes (default 1)
#SBATCH --nodelist=node1,node2...             # required node, or comma separated list of required nodes
#SBATCH -c 4                                  # number of cores/threads per task (default 1)
#SBATCH --gres=gpu:1                          # number of GPUs to request (default 0)
#SBATCH --mem=10G                             # request 10 gigabytes memory (per node, default depends on node)
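
Several of these directives are typically combined in one script. Here is a minimal sketch, assuming the partition cpu-troja from the earlier example; the job name multiCore and the output file names are made up for illustration. Inside a job, Slurm exports environment variables such as SLURM_JOB_ID and SLURM_CPUS_PER_TASK (the latter only when -c is given), so the script can adapt to what it was granted.

```shell
#!/bin/bash
#SBATCH -J multiCore          # job name (hypothetical)
#SBATCH -p cpu-troja          # partition, as in the example above
#SBATCH -c 4                  # four cores for the single task
#SBATCH --mem=10G             # 10 gigabytes of memory on the node
#SBATCH -o multiCore.out      # standard output file (hypothetical name)
#SBATCH -e multiCore.err      # standard error file (hypothetical name)

# Fall back to defaults so that the script also runs outside Slurm
threads="${SLURM_CPUS_PER_TASK:-1}"
job_id="${SLURM_JOB_ID:-local}"

echo "Job ${job_id} may use ${threads} thread(s)"
```

The value of threads can then be handed to the actual program, for example through its own option for the number of worker threads.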

If needed, you can have Slurm notify you by email:

#SBATCH --mail-type=begin        # send email when job begins
#SBATCH --mail-type=end          # send email when job ends
#SBATCH --mail-type=fail         # send email if job fails
#SBATCH --mail-user=<YourUFALEmailAccount>

As usual, the complete set of options can be found by typing:

man sbatch

Running jobs

To inspect all jobs currently queued or running on the cluster, use:

squeue

Show only the jobs of user linguist:

squeue -u linguist

Show only the jobs on partition gpu-ms:

squeue -p gpu-ms

Show only the jobs in a specific state (see man squeue for the list of valid job states):

squeue -t RUNNING

Show only the jobs running on a specific node:

squeue -w dll-3gpu1
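
These filters can be combined, and the columns can be chosen with -o. Here is a minimal sketch, assuming the user and partition names from the examples above; count_running_jobs and show_jobs are hypothetical helper names, not Slurm commands.

```shell
# Hypothetical helper: count the running jobs of one user (defaults to the caller).
count_running_jobs() {
    local user="${1:-$USER}"
    # -h drops the header line, so each remaining line is exactly one job
    squeue -h -u "$user" -t RUNNING | wc -l | tr -d ' '
}

# Hypothetical helper: compact listing of one user's jobs on one partition.
# Columns: job ID, job name, state, elapsed time, node list or pending reason.
show_jobs() {
    local user="$1" partition="$2"
    squeue -u "$user" -p "$partition" -o "%.10i %.20j %.10T %.10M %R"
}

# Only query the scheduler when Slurm is actually available on this machine
if command -v squeue >/dev/null 2>&1; then
    echo "Running jobs: $(count_running_jobs)"
    show_jobs "$USER" gpu-ms
fi
```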

Cluster info

The sinfo command gives you useful information about the nodes available in the cluster. Here are a few examples:

List available partitions (queues). The default partition is marked with *:

sinfo

List types of available GPUs:

sinfo -o %G
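
The plain %G listing repeats the same entry for every group of nodes. Here is a minimal sketch of a de-duplicated view, assuming that nodes without generic resources are printed as (null); list_gres_types is a hypothetical helper name, and gpu-ms is the partition mentioned above.

```shell
# Hypothetical helper: print each distinct GRES string once.
list_gres_types() {
    # -h drops the header; entries equal to "(null)" are assumed to mean "no GRES"
    sinfo -h -o %G | sort -u | grep -v '^(null)$'
}

# Only query the scheduler when Slurm is actually available on this machine
if command -v sinfo >/dev/null 2>&1; then
    list_gres_types
    # Node-oriented view (-N): one line per node with its name and GRES
    sinfo -N -p gpu-ms -o "%N %G"
fi
```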

Interactive mode

This mode can be useful for testing. For any serious computation you should use batch mode.
You can use the srun command to get an interactive shell on an arbitrary node from the default partition (queue):

srun --pty bash

Many more parameters are available. For example:

srun -p cpu-troja --mem=64G --pty bash

-p cpu-troja explicitly requires partition cpu-troja
--mem=64G requires 64G of memory for the job

srun -p gpu-troja --nodelist=tdll-3gpu1 --mem=64G --gres=gpu:2 --pty bash

--nodelist=tdll-3gpu1 explicitly requires one specific node
--gres=gpu:2 requires 2 GPUs
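
srun is not limited to interactive shells: given a command instead of --pty bash, it runs that command on the allocated node and returns. Here is a minimal sketch, assuming the partition names used above; run_on_partition is a hypothetical wrapper, and the CUDA_VISIBLE_DEVICES check assumes the usual Slurm GPU setup, which may differ on this cluster.

```shell
# Hypothetical wrapper: run one command on a given partition and wait for it.
run_on_partition() {
    local partition="$1"
    shift
    # Everything after the partition name is passed to srun unchanged
    srun -p "$partition" "$@"
}

# Only contact the scheduler when Slurm is actually available on this machine
if command -v srun >/dev/null 2>&1; then
    # Print the name of whichever node Slurm picked
    run_on_partition cpu-troja hostname
    # With a GPU request, Slurm normally sets CUDA_VISIBLE_DEVICES for the job
    run_on_partition gpu-troja --gres=gpu:1 \
        bash -c 'echo "GPUs: ${CUDA_VISIBLE_DEVICES:-none}"'
fi
```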

To see all the available options type:

man srun

