Institute of Formal and Applied Linguistics Wiki
Currently the following partitions (queues) are available for computing:

===== Node list by partitions =====

The naming convention is straightforward for CPU nodes: nodes in each group are simply numbered. For GPU nodes the format is [t]dll-**X**gpu**N**, where **X** is the total number of GPUs the node is equipped with and **N** enumerates the nodes with the given configuration. The prefix **t** marks nodes located at Troja, and **dll** stands for Deep Learning Laboratory.
==== cpu-troja ====

| Node name | Thread count | Socket:Core:Thread | RAM (MB) |
| achilles1 | 32 | 2:8:2 | 128810 |
| achilles2 | 32 | 2:8:2 | 128810 |
| achilles3 | 32 | 2:8:2 | 128810 |
| achilles4 | 32 | 2:8:2 | 128810 |
| achilles5 | 32 | 2:8:2 | 128810 |
| achilles6 | 32 | 2:8:2 | 128810 |
| achilles7 | 32 | 2:8:2 | 128810 |
| achilles8 | 32 | 2:8:2 | 128810 |
| hector1 | 32 | 2:8:2 | 128810 |
| hector2 | 32 | 2:8:2 | 128810 |
| hector3 | 32 | 2:8:2 | 128810 |
| hector4 | 32 | 2:8:2 | 128810 |
| hector5 | 32 | 2:8:2 | 128810 |
| hector6 | 32 | 2:8:2 | 128810 |
| hector7 | 32 | 2:8:2 | 128810 |
| hector8 | 32 | 2:8:2 | 128810 |
| helena1 | 32 | 2:8:2 | 128811 |
| helena2 | 32 | 2:8:2 | 128811 |
| helena3 | 32 | 2:8:2 | 128811 |
| helena4 | 32 | 2:8:2 | 128811 |
| helena5 | 32 | 2:8:2 | 128810 |
| helena6 | 32 | 2:8:2 | 128811 |
| helena7 | 32 | 2:8:2 | 128810 |
| helena8 | 32 | 2:8:2 | 128811 |
| paris1 | 32 | 2:8:2 | 128810 |
| paris2 | 32 | 2:8:2 | 128810 |
| paris3 | 32 | 2:8:2 | 128810 |
| paris4 | 32 | 2:8:2 | 128810 |
| paris5 | 32 | 2:8:2 | 128810 |
| paris6 | 32 | 2:8:2 | 128810 |
| paris7 | 32 | 2:8:2 | 128810 |
| paris8 | 32 | 2:8:2 | 128810 |
| hyperion2 | 64 | 2:16:2 | 257667 |
| hyperion3 | 64 | 2:16:2 | 257667 |
| hyperion4 | 64 | 2:16:2 | 257667 |
| hyperion5 | 64 | 2:16:2 | 257667 |
| hyperion6 | 64 | 2:16:2 | 257667 |
| hyperion7 | 64 | 2:16:2 | 257667 |
| hyperion8 | 64 | 2:16:2 | 257667 |
==== cpu-ms ====

| Node name | Thread count | Socket:Core:Thread | RAM (MB) |
| iridium | 16 | 2:4:2 | 515977 |
| orion1 | 40 | 2:10:2 | 128799 |
| orion2 | 40 | 2:10:2 | 128799 |
| orion3 | 40 | 2:10:2 | 128799 |
| orion4 | 40 | 2:10:2 | 128799 |
| orion5 | 40 | 2:10:2 | 128799 |
| orion6 | 40 | 2:10:2 | 128799 |
| orion7 | 40 | 2:10:2 | 128799 |
| orion8 | 40 | 2:10:2 | 128799 |
==== gpu-troja ====

| Node name | Thread count | Socket:Core:Thread | RAM (MB) | Features | GPU type |
| tdll-3gpu1 | 64 | 2:16:2 | 128642 | gpuram48G gpu_cc8.6 | NVIDIA A40 |
| tdll-3gpu2 | 64 | 2:16:2 | 128642 | gpuram48G gpu_cc8.6 | NVIDIA A40 |
| tdll-3gpu3 | 64 | 2:16:2 | 128642 | gpuram48G gpu_cc8.6 | NVIDIA A40 |
| tdll-3gpu4 | 64 | 2:16:2 | 128642 | gpuram48G gpu_cc8.6 | NVIDIA A40 |
| tdll-8gpu1 | 64 | 2:16:2 | 257666 | gpuram40G gpu_cc8.0 | NVIDIA A100 |
| tdll-8gpu2 | 64 | 2:16:2 | 257666 | gpuram40G gpu_cc8.0 | NVIDIA A100 |
| tdll-8gpu3 | 32 | 2:8:2 | 253725 | gpuram16G gpu_cc7.5 | NVIDIA Quadro P5000 |
| tdll-8gpu4 | 32 | 2:8:2 | 253725 | gpuram16G gpu_cc7.5 | NVIDIA Quadro P5000 |
| tdll-8gpu5 | 32 | 2:8:2 | 253725 | gpuram16G gpu_cc7.5 | NVIDIA Quadro P5000 |
| tdll-8gpu6 | 32 | 2:8:2 | 253725 | gpuram16G gpu_cc7.5 | NVIDIA Quadro P5000 |
| tdll-8gpu7 | 32 | 2:8:2 | 253725 | gpuram16G gpu_cc7.5 | NVIDIA Quadro P5000 |
==== gpu-ms ====

| Node name | Thread count | Socket:Core:Thread | RAM (MB) | Features |
| dll-3gpu1 | 64 | 2:16:2 | 128642 | gpuram48G gpu_cc8.6 |
| dll-3gpu2 | 64 | 2:16:2 | 128642 | gpuram48G gpu_cc8.6 |
| dll-3gpu3 | 64 | 2:16:2 | 128642 | gpuram48G gpu_cc8.6 |
| dll-3gpu4 | 64 | 2:16:2 | 128642 | gpuram48G gpu_cc8.6 |
| dll-3gpu5 | 64 | 2:16:2 | 128642 | gpuram48G gpu_cc8.6 |
| dll-4gpu1 | 40 | 2:10:2 | 187978 | gpuram24G gpu_cc8.6 |
| dll-4gpu2 | 40 | 2:10:2 | 187978 | gpuram24G gpu_cc8.6 |
| dll-8gpu1 | 64 | 2:16:2 | 515838 | gpuram24G gpu_cc8.0 |
| dll-8gpu2 | 64 | 2:16:2 | 515838 | gpuram24G gpu_cc8.0 |
| dll-8gpu3 | 32 | 2:8:2 | 257830 | gpuram16G gpu_cc8.6 |
| dll-8gpu4 | 32 | 2:8:2 | 253721 | gpuram16G gpu_cc8.6 |
| dll-8gpu5 | 40 | 2:10:2 | 385595 | gpuram16G gpu_cc7.5 |
| dll-8gpu6 | 40 | 2:10:2 | 385595 | gpuram16G gpu_cc7.5 |
| dll-10gpu1 | 32 | 2:8:2 | 257830 | gpuram16G gpu_cc8.6 |
| dll-10gpu2 | 32 | 2:8:2 | 257830 | gpuram11G gpu_cc6.1 |
| dll-10gpu3 | 32 | 2:8:2 | 257830 | gpuram11G gpu_cc6.1 |
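The tables above are a snapshot; the live node configuration can be queried with the standard Slurm ''sinfo'' command (a sketch using generic Slurm format specifiers, nothing cluster-specific):

```shell
# List nodes of the GPU partitions together with their CPU count,
# memory, feature tags, and generic resources (GRES):
#   %N = node name, %c = CPUs, %m = memory (MB), %f = features, %G = GRES
sinfo -p gpu-troja,gpu-ms -N -o "%N %c %m %f %G"
```

Dropping ''-p'' prints all partitions; ''sinfo -s'' gives a one-line summary per partition.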

==== Submit nodes ====
  
 In order to submit a job you need to log in to one of the head nodes:
   lrc1.ufal.hide.ms.mff.cuni.cz
   lrc2.ufal.hide.ms.mff.cuni.cz
   sol1.ufal.hide.ms.mff.cuni.cz
   sol2.ufal.hide.ms.mff.cuni.cz
   sol3.ufal.hide.ms.mff.cuni.cz
   sol4.ufal.hide.ms.mff.cuni.cz
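A typical first session might look like this (''<username>'' is a placeholder for your cluster login; ''sinfo'' and ''squeue'' are standard Slurm commands):

```shell
# Log in to one of the head nodes (replace <username> with your login)
ssh <username>@lrc1.ufal.hide.ms.mff.cuni.cz

# Once logged in, inspect the partitions and your own queued jobs
sinfo
squeue -u "$USER"
```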
 ===== Basic usage =====
  
 There are many more parameters available to use. For example:
  
**To get an interactive CPU job with 64 GB of reserved memory:**
<code>srun -p cpu-troja,cpu-ms --mem=64G --pty bash</code>
  
  * ''-p cpu-troja,cpu-ms'' explicitly requests the partitions ''cpu-troja'' and ''cpu-ms''. If no partition is specified, Slurm uses the default partition.
  * ''--mem=64G'' requests 64 GB of memory for the job.
  
**To get an interactive job with a single GPU of any kind:**
 <code>srun -p gpu-troja,gpu-ms --gres=gpu:1 --pty bash</code>
   * ''-p gpu-troja,gpu-ms'' restricts the job to nodes from these two partitions
 <code>srun -p gpu-troja --constraint="gpuram48G|gpuram40G" --mem=64G --gres=gpu:2 --pty bash</code>
   * ''--constraint="gpuram48G|gpuram40G"'' only considers nodes that have either the ''gpuram48G'' or the ''gpuram40G'' feature defined
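The interactive ''srun'' examples above can also be submitted non-interactively with ''sbatch''. A minimal batch-script sketch with the same resource requests (the job name, output file, and script body are placeholders, not a prescribed convention):

```shell
#!/bin/bash
#SBATCH -p cpu-troja,cpu-ms     # same partition list as the srun example
#SBATCH --mem=64G               # 64 GB of memory for the job
#SBATCH -J my_job               # hypothetical job name
#SBATCH -o my_job.%j.out        # stdout file; %j expands to the job id

# your commands go here
hostname
```

Saved as e.g. ''my_job.sh'' and submitted with ''sbatch my_job.sh''; ''sbatch'' accepts the same resource options as ''srun'' (including ''--gres'' and ''--constraint'').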

==== Delete Job ====
<code>scancel <job_id></code>
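To find the job id to cancel, list your own jobs first (both commands are standard Slurm; the job id below is a hypothetical example):

```shell
# List your running and pending jobs; the first column is the job id
squeue -u "$USER"

# Cancel one specific job, or all of your jobs at once
scancel 123456
scancel -u "$USER"
```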
  
 To see all the available options, type:
  
 <code>man srun</code>

===== See also =====

https://www.msi.umn.edu/slurm/pbs-conversion
