Both sides previous revision
Previous revision
Next revision
|
Previous revision
Next revision
Both sides next revision
|
grid [2019/02/14 17:49] dusek [Advanced usage] |
grid [2020/05/06 14:44] ufal [Other] |
''-q '*@hector[14]''' to submit on hector1 or hector4, | ''-q '*@hector[14]''' to submit on hector1 or hector4, |
''-q '[tm]*@!(hector*|iridium)''' to submit on any troja/ms machine except hectors and iridium. | ''-q '[tm]*@!(hector*|iridium)''' to submit on any troja/ms machine except hectors and iridium. |
However, usually you should specify just the queue (troja-all.q vs. ms-all.q), not a particular machine, and instead use ''-l'' to specify the needed resources in a general way. | However, usually you should specify just the queue (cpu-troja.q vs. cpu-ms.q), not a particular machine, and instead use ''-l'' to specify the needed resources in a general way. |
| |
''qsub **-l** ...'' | ''qsub **-l** ...'' |
| |
''qsub **-b** y'' | ''qsub **-b** y'' |
Treat ''script.sh'' (or whatever is the name of the command you execute) as a binary, i.e. don't search for [[#in-script options]] within the file, don't transfer it to the qmaster and then to the execution node. This makes the execution a bit faster and it may prevent some rare but hard-to-detect errors caused SGE interpreting the script. The script must be available on the execution node via NFS, Lustre (which is our case), etc. With ''-b y'' (shortcut for ''-b yes''), ''script.sh'' can be a script or a binary. With ''-b n'' (which is the default for ''qsub''), ''script.sh'' must be a script (text file). | Treat ''script.sh'' (or whatever is the name of the command you execute) as a binary, i.e. don't search for [[#in-script options]] within the file, don't transfer it to the qmaster and then to the execution node. This makes the execution a bit faster and it may prevent some rare but hard-to-detect errors caused SGE interpreting the script. The script must be available on the execution node via NFS, Lustre (which is our case), etc. With ''-b y'' (shortcut for ''-b yes''), ''script.sh'' can be an executable script or a binary (and you must provide full path, e.g. ''./script.sh''). With ''-b n'' (which is the default for ''qsub''), ''script.sh'' must be a script (text file). |
| |
''qsub **-M** popel@ufal.mff.cuni.cz,rosa@ufal.mff.cuni.cz **-m** beas'' | ''qsub **-M** popel@ufal.mff.cuni.cz,rosa@ufal.mff.cuni.cz **-m** beas'' |
* ''cat /opt/LRC/REPORTER/LRC-UFAL/stats/userlist.weight'' -- all users sorted according to their activity (number of submitted jobs × their average duration), updated each night | * ''cat /opt/LRC/REPORTER/LRC-UFAL/stats/userlist.weight'' -- all users sorted according to their activity (number of submitted jobs × their average duration), updated each night |
| |
* [[http://ufaladm2/munin/ufal.hide.ms.mff.cuni.cz/lrc-headnode.ufal.hide.ms.mff.cuni.cz/index.html|Munin: graph of cluster usage by day and user]] and [[http://ufaladm2/munin/ufal.hide.ms.mff.cuni.cz/apophis.ufal.hide.ms.mff.cuni.cz/index.html|Munin monitoring of Apophis disk server]] (both accessible only from ÚFAL network) | * [[https://ufaladm2.ufal.hide.ms.mff.cuni.cz/munin/ufal.hide.ms.mff.cuni.cz/lrc-master.ufal.hide.ms.mff.cuni.cz/index.html|Munin: graph of cluster usage by day and user]] and [[https://ufaladm2.ufal.hide.ms.mff.cuni.cz/munin/ufal.hide.ms.mff.cuni.cz/nfs-core.ufal.hide.ms.mff.cuni.cz/index.html|Munin monitoring of disk storage]] (both accessible only from ÚFAL network) |
| |
===== Profiling ===== | ===== Profiling ===== |
===== Other ===== | ===== Other ===== |
* There is a **great course [[http://ufal.mff.cuni.cz/courses/npfl102|Data intensive computing]]**, see the 2016 handouts if you missed the course. It covers the usage of [[http://spark.apache.org/|Spark]] (MapReduce/Hadoop alternative, but better) and HDFS (Hadoop filesystem). | * There is a **great course [[http://ufal.mff.cuni.cz/courses/npfl102|Data intensive computing]]**, see the 2016 handouts if you missed the course. It covers the usage of [[http://spark.apache.org/|Spark]] (MapReduce/Hadoop alternative, but better) and HDFS (Hadoop filesystem). |
* This course had used a special **DLRC (Demo LRC) cluster** (students had to login with ''ssh -p 11422 ufallab.ms.mff.cuni.cz'' and special NPFL102-only LDAP logins) with six virtual machines on one physical. During the years when NPFL102 is not taught (e.g. 2017), the DLRC cluster has just one virtual machine. | * **Note:** some hadoop basics and a lot of NoSQL technologies are covered by [[https://is.cuni.cz/studium/predmety/index.php?do=predmet&kod=NDBI040|Big Data Management and NoSQL Databases]] |
* **Note:** soma hadoop basics and a lot of NoSQL technologies are covered by [[https://is.cuni.cz/studium/predmety/index.php?do=predmet&kod=NDBI040|Big Data Management and NoSQL Databases]] | * There is a special cluster for Mgr (and Bc) students (but not for PhD and UFAL members): http://aic.ufal.mff.cuni.cz/ |
* You can use environment variables ''$JOB_ID'', ''$JOB_NAME''. | * You can use environment variables ''$JOB_ID'', ''$JOB_NAME''. |
* One job can submit other jobs (but be careful with recursive:-)). A job submitted to the CPU cluster may submit GPU jobs (to the ''qpu.q'' queue). | * One job can submit other jobs (but be careful with recursive:-)). A job submitted to the CPU cluster may submit GPU jobs (to the ''qpu.q'' queue). |