grid [2018/11/01 14:34] vodrazka
Some machines are at Malá Strana (ground floor, in the new server room built from the Lindat budget), some are at Troja (5 km north-east).
If you need to quickly distinguish which machine is located where, you can use your knowledge of [[https://en.wikipedia.org/wiki/Trojan_War|Trojan War]]-related heroes, ''qhost -q'', or the tables below.

====== AVX instructions ======

==== Troja (cpu-troja.q) ====
^ Name ^ CPU type ^ GHz ^ Cores ^ RAM (GB) ^ Note ^
| achilles[1-8] | Intel(R) Xeon(R) CPU E5-2630 v3 | 2.4 | 31 | 123 | AVX enabled |
| hector[1-8] | Intel(R) Xeon(R) CPU E5-2630 v3 | 2.4 | 31 | 123 | AVX enabled |
| helena[1-8] | Intel(R) Xeon(R) CPU E5-2630 v3 | 2.4 | 31 | 123 | AVX enabled |
| paris[1-8] | Intel(R) Xeon(R) CPU E5-2630 v3 | 2.4 | 31 | 123 | AVX enabled |

==== MS = Malá Strana (cpu-ms.q) ====
^ Name ^ CPU type ^ GHz ^ Cores ^ RAM (GB) ^ Note ^
| andromeda[1-13] | AMD Opteron | 2.8 | 7 | 30 | |
| lucifer[1-10] | Intel(R) Xeon(R) CPU E5620 | 2.4 | 15 | 122 | |
| hydra[1-4] | AMD Opteron SSE4 AVX | 2.6 | 15 | 122 | AVX enabled |
| orion[1-8] | Intel(R) Xeon(R) CPU E5-2630 v4 | 2.2 | 39 | 122 | AVX enabled |
| cosmos | Intel Xeon | 2.9 | 7 | 249 | |
| belzebub | Intel Xeon SSE4 AVX | 2.9 | 31 | 249 | AVX enabled |
| iridium | Intel Xeon SSE4 | 1.9 | 15 | 501 | |
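Only some of the machines above support AVX. A quick way to check the node you actually landed on (a generic Linux sketch, not ÚFAL-specific) is to look at the ''flags'' line of ''/proc/cpuinfo'':

```shell
# has_avx takes a space-separated list of CPU flags (as found on the
# "flags" line of /proc/cpuinfo) and prints yes/no.
has_avx() {
  case " $1 " in
    *" avx "*) echo yes ;;   # exact word "avx" present
    *)         echo no  ;;
  esac
}

# On a real node you would feed it the flags line, e.g.:
#   has_avx "$(grep -m1 '^flags' /proc/cpuinfo | cut -d: -f2)"
has_avx "fpu sse sse2 avx"    # prints: yes
has_avx "fpu sse sse2"        # prints: no
```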

<code>
export LC_ALL=en_US.UTF-8
</code>
If you are curious about the purpose of ''.bashrc'' and ''.bash_profile'' and when each should be used, you may read [[https://stackoverflow.com/a/415444|this answer]].

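A common pattern (a sketch of one convention, not a cluster requirement) is to keep all settings, such as the ''export'' above, in ''~/.bashrc'' and have ''~/.bash_profile'' merely source it, so login and non-login shells behave the same:

```shell
# ~/.bash_profile -- read by login shells only.
# Keep the actual settings (e.g. export LC_ALL=en_US.UTF-8) in ~/.bashrc
# and just source it here, so both login and non-login shells get them.
if [ -f "$HOME/.bashrc" ]; then
  . "$HOME/.bashrc"
fi
```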
===== Basic usage =====
<code>
# prepare a shell script describing your task
qsub -cwd -j y script.sh Hello World
</code>
This submits your job to the default queue, which is currently ''cpu-*.q''. Usually there is a free slot, so the job will be scheduled within a few seconds. We have used two handy qsub parameters:
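For illustration, ''script.sh'' could look like this minimal sketch (the hypothetical name and message are ours): the extra ''qsub'' arguments ''Hello World'' arrive as ''$1'' and ''$2'', and in standard SGE the output lands in ''script.sh.o<jobid>'' in the submission directory (because of ''-cwd''), with stderr merged in (because of ''-j y'').

```shell
#!/bin/bash
# script.sh -- a minimal job script. SGE runs it on some node and
# passes the extra qsub arguments ("Hello World") as $1 and $2.
msg="Hello job: running on $(hostname) with arguments: $1 $2"
echo "$msg"
```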
=== SSH to a random sol ===
Ondřej Bojar suggests adding the following alias to your .bashrc (cf. [[#sshcwd]]):
<code>alias cluster='comp=$(( (RANDOM % 10) +1)); ssh -o "StrictHostKeyChecking no" sol$comp'</code>

===== Job monitoring =====
  * ''qstat [-u user]'' -- print a list of running/waiting jobs of a given user
  * ''qhost'' -- print available/total resources
  * ''qacct -j job_id'' -- print info even for a finished job (for which ''qstat -j job_id'' no longer works). See ''man qacct'' for more.
  * ''/opt/LRC/REPORTER/LRC-UFAL/bin/lrc_users_real_mem_usage -u user -w'' -- current memory usage of a given user
  * ''/opt/LRC/REPORTER/LRC-UFAL/bin/lrc_users_limits_requested -w'' -- resources required by all users
  * ''/opt/LRC/REPORTER/LRC-UFAL/bin/lrc_nodes_meminfo'' -- memory usage of all nodes
    * mem_total: total (physical) memory of each node
    * mem_free: total memory minus reserved memory (using ''qsub -l mem_free'') for each node
    * act_mem_free: memory actually free at the moment
    * mem_used: memory actually used at the moment
  * ''/opt/LRC/REPORTER/LRC-UFAL/bin/lrc_state_overview'' -- overall summary (with per-user stats for users with running jobs)
  * ''cat /opt/LRC/REPORTER/LRC-UFAL/stats/userlist.weight'' -- all users sorted according to their activity (number of submitted jobs × their average duration), updated each night
  * [[http://ufaladm2/munin/ufal.hide.ms.mff.cuni.cz/lrc-headnode.ufal.hide.ms.mff.cuni.cz/index.html|Munin: graph of cluster usage by day and user]] and [[http://ufaladm2/munin/ufal.hide.ms.mff.cuni.cz/apophis.ufal.hide.ms.mff.cuni.cz/index.html|Munin monitoring of Apophis disk server]] (both accessible only from ÚFAL network)
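If you want to script on top of these tools, here is a small sketch (assuming the standard ''qstat'' output format, which prints two header lines before the job list):

```shell
# Count your own running/waiting jobs. qstat prints two header lines
# before the job list, so strip them with tail. If qstat is not
# available or prints nothing, the count is simply 0.
njobs=$(qstat -u "${USER:-$(id -un)}" 2>/dev/null | tail -n +3 | wc -l)
echo "jobs in queue: $njobs"
```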