Institute of Formal and Applied Linguistics Wiki


Differences

This shows you the differences between two versions of the page.

slurm [2023/04/13 17:01] dusek Priority
slurm [2023/09/26 17:09] straka
Line 125:
 </code> </code>
  
-==== Running jobs ====
+==== Inspecting jobs ====
  
 In order to inspect all running jobs on the cluster use:
Line 208:
 === Priority ====
  
-When running srun or sbatch, you can pass `-q high/normal/low/preempt-low`. These represent priorities 300/200/100/100, with `normal(200) being the default. Furthermore, the `preempt-lowQOS is actually preemptible -- if there is a job with normal or high QOS, they can interrupt your `preempt-lowjob.
+When running srun or sbatch, you can pass ''-q high/normal/low/preempt-low''. These represent priorities 300/200/100/100, with ''normal'' (200) being the default. Furthermore, the ''preempt-low'' QOS is actually preemptible -- if there is a job with normal or high QOS, they can interrupt your ''preempt-low'' job.
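The QOS-to-priority mapping described above can be sketched as a small shell helper. This is only an illustration: the function name ''qos_priority'' is hypothetical, and the partition and GPU request in the printed command are placeholders borrowed from other examples on this page. The script prints the ''srun'' command line rather than running it.

```shell
#!/bin/sh
# Map a QOS name to the priority value listed on this page.
# Unknown names produce an error message and a non-zero status.
qos_priority() {
    case "$1" in
        high)            echo 300 ;;
        normal)          echo 200 ;;
        low|preempt-low) echo 100 ;;
        *)               echo "unknown QOS: $1" >&2; return 1 ;;
    esac
}

# Take the QOS from the first argument; "normal" is the default QOS.
qos="${1:-normal}"

# Print (do not run) an srun command line for the chosen QOS.
# The partition and resources are placeholders, not recommendations.
if qos_priority "$qos" >/dev/null; then
    echo "srun -q $qos -p gpu-troja --gres=gpu:1 --pty bash"
fi
```

Note that ''low'' and ''preempt-low'' share the same priority value; according to the text above, the difference is that only ''preempt-low'' jobs can be interrupted.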
  
 The preemption has probably not been used by anyone yet; some documentation about it is on https://slurm.schedmd.com/preempt.html, we use the REQUEUE regime (so your job is killed, very likely with some signal, so you could monitor it and for example save a checkpoint; but currently I do not know any details), and then started again when there are resources.
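The idea of watching for a signal and saving a checkpoint could look roughly like the sketch below. It rests on an assumption the text above does not confirm: that a preempted job receives SIGTERM before it is killed. The file name, the ''step'' counter, and the function names are all hypothetical; the real state to save depends on your job.

```shell
#!/bin/bash
# Sketch only: the checkpoint file name and the step counter are hypothetical.
CHECKPOINT_FILE="${CHECKPOINT_FILE:-checkpoint.state}"
step=0

save_checkpoint() {
    # Write to a temporary file first, then rename, so an interrupted
    # write never leaves a half-written checkpoint behind.
    echo "$step" > "${CHECKPOINT_FILE}.tmp" && mv "${CHECKPOINT_FILE}.tmp" "$CHECKPOINT_FILE"
}

load_checkpoint() {
    # Resume from the saved step if a checkpoint exists.
    if [ -f "$CHECKPOINT_FILE" ]; then
        step=$(cat "$CHECKPOINT_FILE")
    fi
}

on_preempt() {
    save_checkpoint
    exit 143  # conventional exit status for SIGTERM (128 + 15)
}

# Assumption: the job is sent SIGTERM before it is killed and requeued.
trap on_preempt TERM

# When the requeued job starts again, continue from the last saved step.
load_checkpoint
```

In a real batch script, long-running work would typically be started in the background and followed by ''wait'', because bash runs a trap handler only after the current foreground command returns. Whether the signal reaches the batch shell at all under the REQUEUE regime would need to be verified on the cluster.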
Line 240:
 <code>srun -p gpu-troja --constraint="gpuram48G|gpuram40G" --mem=64G --gres=gpu:2 --pty bash</code>
   * ''-''''-constraint="gpuram48G|gpuram40G"'' only consider nodes that have either ''gpuram48G'' or ''gpuram40G'' feature defined
 +
 +==== ====
  
 ==== Delete Job ====
 <code>scancel <job_id> </code>
 +
 +<code>scancel -n <job_name> </code>
 +
  
 To see all the available options type:
  
-<code>man srun</code>
+<code>man scancel</code>
  
 ==== Basic commands on cluster machines ====