Both sides previous revision
Previous revision
Next revision
|
Previous revision
Next revision
Both sides next revision
|
grid [2017/10/16 18:47] popel [Installation] |
grid [2018/01/16 11:42] popel [Rules] |
* If your job needs more than one CPU (on a single machine) for most of the time, reserve the given number of CPU cores (and SGE slots) with <code>qsub -pe smp <number-of-CPU-cores></code> As you can see in [[#List of Machines]], the maximum is 32 cores. If your job needs e.g. up to 110% CPU most of the time and just occasionally 200%, it is OK to reserve just one core (so you don't waste). TODO: when using ''-pe smp -l mf=8G,amf=8G,h_vmem=12G'', which memory limits are per machine and which are per core? | * If your job needs more than one CPU (on a single machine) for most of the time, reserve the given number of CPU cores (and SGE slots) with <code>qsub -pe smp <number-of-CPU-cores></code> As you can see in [[#List of Machines]], the maximum is 32 cores. If your job needs e.g. up to 110% CPU most of the time and just occasionally 200%, it is OK to reserve just one core (so you don't waste). TODO: when using ''-pe smp -l mf=8G,amf=8G,h_vmem=12G'', which memory limits are per machine and which are per core? |
* If you are sure your job needs less than 1GB RAM, then you can skip this. Otherwise, if you need e.g. 8 GiB, you must always use ''qsub'' (or ''qrsh'') with ''-l mem_free=8G''. You should specify also ''act_mem_free'' with the same value and ''h_vmem'' with possibly a slightly bigger value. See [[#memory]] for details. TL;DR: <code>qsub -l mem_free=8G,act_mem_free=8G,h_vmem=12G</code> | * If you are sure your job needs less than 1GB RAM, then you can skip this. Otherwise, if you need e.g. 8 GiB, you must always use ''qsub'' (or ''qrsh'') with ''-l mem_free=8G''. You should specify also ''act_mem_free'' with the same value and ''h_vmem'' with possibly a slightly bigger value. See [[#memory]] for details. TL;DR: <code>qsub -l mem_free=8G,act_mem_free=8G,h_vmem=12G</code> |
* Be kind to your colleagues. If you are going to submit jobs that effectively occupy **more than one fifth of our cluster for more than several hours**, check if the cluster is free (with ''qstat -g c'' or ''qstat -u \*'') and/or ask your colleagues if they don't plan to use the cluster intensively in the near future. Note that if you allocate one slot (CPU core) on a machine, but (almost) all its RAM, you have effectively occupied the whole machine and all its cores. If you are submitting **more than 100 jobs**, consider using setting them a low priority (e.g. ''-p -1024'', see below) or use [[#qunhold]]. Don't submit more than ca 2000 jobs at once (this can overload the SGE). | * Be kind to your colleagues. If you are going to submit jobs that effectively occupy **more than one fifth of our cluster for more than several hours**, check if the cluster is free (with ''qstat -g c'' or ''qstat -u \*'') and/or ask your colleagues if they don't plan to use the cluster intensively in the near future. Note that if you allocate one slot (CPU core) on a machine, but (almost) all its RAM, you have effectively occupied the whole machine and all its cores. If you are submitting **more than 100 jobs**, consider using setting them a low priority (e.g. ''-p -1024'', see below) or use [[#qunhold]]. |
| * **Don't submit more than ca 5000 jobs at once**, even if you make sure that at most 100 are running/waiting and the rest is in the //hold// state (e.g. using ''qunhold''). More than 5000 jobs in the queue can overload the SGE, so then no one can execute ''qstat'' (or it takes too long). |
| |
| |
# immediately on the remote machine | # immediately on the remote machine |
history -a; | history -a; |
# setup the working directory by setting WD | # setup the working directory by setting WD, delete possible ".nfs/" |
ssh -X -Y -C -t $@ "WD='$PWD' /bin/bash --login -i"; | ssh -X -Y -C -t $@ "WD='${PWD/.nfs\//}' /bin/bash --login -i"; |
} | } |
| |