Both sides previous revision
Previous revision
|
Next revision
Both sides next revision
|
grid [2018/07/06 11:40] lauschmannova [Advanced usage] there are now 10 different sols rather than 8 |
grid [2018/07/15 20:56] lauschmannova [Job monitoring] /SGE/ seems to have been replaced with /opt/LRC |
* ''qstat [-u user]'' -- print a list of running/waiting jobs of a given user | * ''qstat [-u user]'' -- print a list of running/waiting jobs of a given user |
* ''qhost'' -- print available/total resources | * ''qhost'' -- print available/total resources |
* ''/SGE/REPORTER/LRC-UFAL/bin/lrc_users_real_mem_usage -u user -w'' -- current memory usage of a given user | |
* ''/SGE/REPORTER/LRC-UFAL/bin/lrc_users_limits_requested -w'' -- required resources of all users | * ''/opt/LRC/REPORTER/LRC-UFAL/bin/lrc_users_real_mem_usage -u user -w'' -- current memory usage of a given user |
* ''/SGE/REPORTER/LRC-UFAL/bin/lrc_nodes_meminfo'' -- memory usage of all nodes | * ''/opt/LRC/REPORTER/LRC-UFAL/bin/lrc_users_limits_requested -w'' -- required resources of all users |
| * ''/opt/LRC/REPORTER/LRC-UFAL/bin/lrc_nodes_meminfo'' -- memory usage of all nodes |
* mem_total: | * mem_total: |
* mem_free: total memory minus reserved memory (using ''qsub -l mem_free'') for each node | * mem_free: total memory minus reserved memory (using ''qsub -l mem_free'') for each node |
* act_mem_free: really free memory | * act_mem_free: really free memory |
* mem_used: really used memory | * mem_used: really used memory |
* ''/SGE/REPORTER/LRC-UFAL/bin/lrc_state_overview'' -- overall summary (with per-user stats for users with running jobs) | * ''/opt/LRC/REPORTER/LRC-UFAL/bin/lrc_state_overview'' -- overall summary (with per-user stats for users with running jobs) |
* ''cat /SGE/REPORTER/LRC-UFAL/stats/userlist.weight'' -- all users sorted according to their activity (number of submitted jobs × their average duration), updated each night | * ''cat /opt/LRC/REPORTER/LRC-UFAL/stats/userlist.weight'' -- all users sorted according to their activity (number of submitted jobs × their average duration), updated each night |
* [[http://ufaladm2/munin/ufal.hide.ms.mff.cuni.cz/lrc-headnode.ufal.hide.ms.mff.cuni.cz/index.html|Munin: graph of cluster usage by day and user]] and [[http://ufaladm2/munin/ufal.hide.ms.mff.cuni.cz/apophis.ufal.hide.ms.mff.cuni.cz/index.html|Munin monitoring of Apophis disk server]] (both accessible only from ÚFAL network) | * [[http://ufaladm2/munin/ufal.hide.ms.mff.cuni.cz/lrc-headnode.ufal.hide.ms.mff.cuni.cz/index.html|Munin: graph of cluster usage by day and user]] and [[http://ufaladm2/munin/ufal.hide.ms.mff.cuni.cz/apophis.ufal.hide.ms.mff.cuni.cz/index.html|Munin monitoring of Apophis disk server]] (both accessible only from ÚFAL network) |
| |