
Institute of Formal and Applied Linguistics Wiki



Differences

This shows you the differences between two versions of the page.

grid [2017/09/27 12:01]
popel
grid [2017/09/27 14:01]
popel [Rules]
Line 119: Line 119:
   * **Specify the memory and CPU requirements** (if higher than the defaults) and **don't exceed them**.
     * If your job needs more than one CPU (on a single machine) for most of the time, reserve the given number of CPU cores (and SGE slots) with <code>qsub -pe smp <number-of-CPU-cores></code> (as you can see in [[#List of Machines]], the maximum is 32 cores). If your job needs e.g. up to 110% CPU most of the time and just occasionally 200%, it is OK to reserve just one core (so you don't waste resources).
-    * <code>qsub -hard -l mem_free=8G -l act_mem_free=8G -l h_vmem=8G</code>
+    * If you are sure your job needs less than 1 GB of RAM, you can skip this. Otherwise, if you need e.g. 8 GiB, you must always use ''qsub'' (or ''qrsh'') with ''-l mem_free=8G''. You should also specify ''act_mem_free'' with the same value and ''h_vmem'' with a possibly slightly bigger value. See [[#memory]] for details. TL;DR: <code>qsub -hard -l mem_free=8G,act_mem_free=8G,h_vmem=12G</code>
 +  * Be kind to your colleagues. If you are going to submit jobs that effectively take more than one fifth of our cluster for more than several hours, first check whether the cluster is free (with ''qstat -g c'' or ''qstat -u \*'') and ask your colleagues. Note that if you allocate one slot (CPU core) on a machine but (almost) all of its RAM, you have effectively occupied the whole machine and all its cores.
  
 +  
 Further recommendations:
   * Clean up your local data after yourself; otherwise nobody else will be able to run anything useful there.
   * Avoid heavy parallel access to the shared disks. It slows the NFS server down badly for everyone. So distribute the data as well.
   * If you want to run jobs that will take a long time (hours, days), do not submit them all at once, so that others can use the cluster too.
 +
 +=== Memory ===
 +
 +mem_free (or mf): this is a 'consumable resource' tracked by SGE.
 +  It affects job scheduling. Every machine has an initial value assigned.
 +  When you specify
 +    qsub -l mem_free=4G
 +  SGE finds a machine with mem_free >= 4GB, and subtracts 4GB from it.
 +
 +  This limit is not enforced, so if a job exceeds this limit, the
 +  SGE value of mem_free may not represent the real free memory.
 +
 +  Default value is 1GB.
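
The memory request described in the rules can be sketched as a small shell helper. This is only an illustration under stated assumptions: ''sge_mem_opts'' and ''job.sh'' are hypothetical names that do not come from this wiki or from SGE, and the command is printed rather than submitted, so it can be tried on a machine without ''qsub''.

```shell
#!/bin/sh
# Hypothetical helper (not part of SGE): prints the resource options for a
# job that needs $1 GiB of RAM. h_vmem is set to $2 GiB, or to $1 if no
# second argument is given.
sge_mem_opts() {
    mem="$1"
    vmem="${2:-$1}"
    printf '%s\n' "-hard -l mem_free=${mem}G,act_mem_free=${mem}G,h_vmem=${vmem}G"
}

# Print the command instead of running it; job.sh is a placeholder name.
echo "qsub $(sge_mem_opts 8 12) job.sh"
```

With the arguments ''8 12'' the printed options match the TL;DR command in the rules. Since ''mem_free'' is not enforced, the value passed here should be a realistic upper bound for the job, otherwise the SGE bookkeeping drifts away from the real free memory.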
  
 ===== Advanced usage =====
