Institute of Formal and Applied Linguistics Wiki


Differences

This shows you the differences between two versions of the page.

grid [2017/09/27 11:52]
popel
grid [2017/09/27 12:01]
popel
Line 112: Line 112:
 The purpose of these rules is to prevent your jobs to damage the work of your colleagues and to divide the resources among users in a fair way.
 
-  * Read about our [[internal:linux-network|network]] first (so you know that e.g. reading big data from you home in 200 parallel jobs is not a good idea). Ask your colleagues (possibly via [[internal:mailing-lists|devel]]) if you are not sure (esp. if you plan to submit jobs with unusual/extreme disk/mem/CPU requirements)
+  * Read about our [[internal:linux-network|network]] first (so you know that e.g. reading big data from your home in 200 parallel jobs is not a good idea). Ask your colleagues (possibly via [[internal:mailing-lists|devel]]) if you are not sure, esp. if you plan to submit jobs with unusual/extreme disk/mem/CPU requirements.
-  * While your jobs are running (or queued), check your jobs (esp. previously untested setups) and email (including [[internal:mailing-lists|devel]]) regularly. If you really need to leave e.g. for two-week vacation offline, consult it first with it@ufal (whether they can kill your jobs if needed).
+  * While your jobs are running (or queued), check your jobs (esp. previously untested setups) and your email (esp. [[internal:mailing-lists|devel]]) regularly. If you really need to leave e.g. for two-week vacation offline, consult it first with it@ufal (whether they can kill your jobs if needed).
   * You can ssh to any cluster machine, which can be useful e.g. to diagnose what's happening there (using ''htop'' etc.).
   * However, **never execute any computing manually** on a cluster machine where you are sshed (i.e. not via ''qsub'' or ''qrsh''). If you break this rule, your task will take CPU and memory, but the SGE will not know, so it may schedule other users' jobs on the same machine and **their jobs may fail** or run slowly. The sol machines are an exception from this rule.
