
Institute of Formal and Applied Linguistics Wiki



Differences

This shows you the differences between two versions of the page.

Old revision: grid [2018/06/26 16:21] vodrazka [MS = Malá Strana (cpu-ms.q)]
New revision: grid [2018/07/15 21:18] (current) popel [Job monitoring]
Line 288:
 === Ssh to random sol ===
 Ondřej Bojar suggests to add the following alias to your .bashrc (cf. [[#sshcwd]]):
-<code>alias cluster='comp=$(($RANDOM /4095 +1)); ssh -o "StrictHostKeyChecking no" sol$comp'</code>
+<code>alias cluster='comp=$(( (RANDOM % 10) +1)); ssh -o "StrictHostKeyChecking no" sol$comp'</code>
 
 ===== Job monitoring =====
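The alias change in the hunk above alters which host numbers can be drawn. The following is a minimal sketch of that arithmetic, assuming only that bash's ''RANDOM'' is an integer in 0..32767; the helper names ''old_host'' and ''new_host'' are hypothetical and exist only for this illustration. Whether the cluster actually has hosts sol1 to sol10 is inferred from the new expression and is not stated on this page.

```bash
#!/usr/bin/env bash
# Compare the host numbers produced by the old and the new alias expression.
# Both helpers take a value standing in for RANDOM (0..32767 in bash).

old_host() { echo $(( $1 / 4095 + 1 )); }    # old: $(($RANDOM /4095 +1))
new_host() { echo $(( ($1 % 10) + 1 )); }    # new: $(( (RANDOM % 10) +1))

# The old expression is monotonic, so its extremes give the full range.
echo "old: $(old_host 0)..$(old_host 32767)"   # prints "old: 1..9"

# The new expression depends only on the last decimal digit (0..9).
echo "new: $(new_host 0)..$(new_host 9)"       # prints "new: 1..10"
```

Under these assumptions the old expression never yields 10, and yields 9 only for the eight largest values of ''RANDOM'' (32760..32767), while the new one spreads almost evenly over 1..10.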
Line 294:
   * ''qstat [-u user]'' -- print a list of running/waiting jobs of a given user
   * ''qhost'' -- print available/total resources
-  * ''/SGE/REPORTER/LRC-UFAL/bin/lrc_users_real_mem_usage -u user -w'' -- current memory usage of a given user
-  * ''/SGE/REPORTER/LRC-UFAL/bin/lrc_users_limits_requested -w'' -- required resources of all users
-  * ''/SGE/REPORTER/LRC-UFAL/bin/lrc_nodes_meminfo'' -- memory usage of all nodes
+  * ''qacct -j job_id'' -- print info even for ended job (for which ''qstat -j job_id'' does not work). See ''man qacct'' for more.
+
+  * ''/opt/LRC/REPORTER/LRC-UFAL/bin/lrc_users_real_mem_usage -u user -w'' -- current memory usage of a given user
+  * ''/opt/LRC/REPORTER/LRC-UFAL/bin/lrc_users_limits_requested -w'' -- required resources of all users
+  * ''/opt/LRC/REPORTER/LRC-UFAL/bin/lrc_nodes_meminfo'' -- memory usage of all nodes
     * mem_total:
     * mem_free: total memory minus reserved memory (using ''qsub -l mem_free'') for each node
     * act_mem_free: really free memory
     * mem_used: really used memory
-  * ''/SGE/REPORTER/LRC-UFAL/bin/lrc_state_overview'' -- overall summary (with per-user stats for users with running jobs)
-  * ''cat /SGE/REPORTER/LRC-UFAL/stats/userlist.weight'' -- all users sorted according to their activity (number of submitted jobs × their average duration), updated each night
+  * ''/opt/LRC/REPORTER/LRC-UFAL/bin/lrc_state_overview'' -- overall summary (with per-user stats for users with running jobs)
+  * ''cat /opt/LRC/REPORTER/LRC-UFAL/stats/userlist.weight'' -- all users sorted according to their activity (number of submitted jobs × their average duration), updated each night
   * [[http://ufaladm2/munin/ufal.hide.ms.mff.cuni.cz/lrc-headnode.ufal.hide.ms.mff.cuni.cz/index.html|Munin: graph of cluster usage by day and user]] and [[http://ufaladm2/munin/ufal.hide.ms.mff.cuni.cz/apophis.ufal.hide.ms.mff.cuni.cz/index.html|Munin monitoring of Apophis disk server]] (both accessible only from ÚFAL network)
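The list above notes that ''qstat -j job_id'' does not work for ended jobs while ''qacct -j job_id'' does. One way to combine the two is sketched below. The function name ''job_info'' is hypothetical, and the sketch assumes that ''qstat -j'' exits with a non-zero status when it no longer knows the job; check this on your installation before relying on it.

```bash
# Print details for a job, whether it is still queued/running or already ended.
# Assumption: qstat -j returns a non-zero exit status for a job it does not know.
job_info() {
    local job_id=$1
    # Try the live scheduler view first; hide its error message on failure.
    if ! qstat -j "$job_id" 2>/dev/null; then
        # Fall back to the accounting records, which also cover ended jobs.
        qacct -j "$job_id"
    fi
}
```

Placed in .bashrc, it would be called as ''job_info job_id''.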
  
