gpu [2017/03/16 17:09] kocmanek [Using cluster]
gpu [2017/05/16 13:35] kocmanek [Servers with GPU units]
===== Servers with GPU units =====

| machine | GPU; cc | GPU cnt | GPU RAM | comment |
| titan | GeForce GTX 1080 Ti; cc6.1 | 1 | 12 GB | |
| titan-gpu | | | | |
| twister1; twister2 | | | | |
| iridium | | | | |
| victoria; arc | GeForce GT 630; cc3.0 | 1 | 2 GB | desktop machine |
| athena | | | | |
| dll1; dll2 | GeForce GTX 1080; cc6.1 | 8 | 8 GB each core | |
not used at the moment: GeForce GTX 570 (from twister2)
===== How to use cluster =====
This section explains how to use the cluster properly.
==== TensorFlow Environment ====
  qsubmit --gpumem=2G --queue="
It is recommended to use priority -100 if you are not in a rush for the results and don't need to jump ahead of your colleagues' jobs.
==== Basic commands ====
+ | |||
+ | lspci | ||
+ | # is any such hardware there? | ||
+ | nvidia-smi | ||
+ | # more details, incl. running processes on the GPU | ||
+ | # nvidia-* are typically located in /usr/bin | ||
+ | watch nvidia-smi | ||
+ | # For monitoring GPU activity in a separate terminal (thanks to Jindrich Libovicky for this!) | ||
+ | nvcc --version | ||
+ | # this should tell CUDA version | ||
+ | # nvcc is typically installed in / | ||
+ | theano-test | ||
+ | # dela to vubec neco uzitecneho? :-) | ||
+ | # theano-* are typically located in / | ||
+ | / | ||
+ | # shows CUDA capability etc. | ||
+ | | ||
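The nvidia-smi output can also be consumed programmatically. A minimal sketch (the query flags are standard nvidia-smi options; the sample output below is made up for illustration) that picks the GPU with the most free memory:

```python
import csv
import io


def pick_freest_gpu(csv_text):
    """Given output of
    `nvidia-smi --query-gpu=index,memory.free --format=csv,noheader,nounits`,
    return the index of the GPU with the most free memory (in MiB)."""
    best_index, best_free = None, -1
    for row in csv.reader(io.StringIO(csv_text)):
        index, free_mib = int(row[0]), int(row[1])
        if free_mib > best_free:
            best_index, best_free = index, free_mib
    return best_index


# In practice the text would come from:
#   subprocess.check_output(["nvidia-smi", "--query-gpu=index,memory.free",
#                            "--format=csv,noheader,nounits"], text=True)
sample = "0, 254\n1, 7812\n"   # made-up sample output
print(pick_freest_gpu(sample))  # -> 1
```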
=== Select GPU device ===

Use the variable CUDA_VISIBLE_DEVICES to constrain TensorFlow to compute only on the selected GPU. To use the first GPU (the GPU queue does this for you):

  export CUDA_VISIBLE_DEVICES=0
+ | |||
+ | To list available devices, use: | ||
+ | / | ||
+ | |||
===== Performance tests =====
  * [[http://
The following table shows an experiment conducted by Tom Kocmi. You can replicate the experiment: /
| machine | Setup; CPU/GPU; cc |
| athena | |
| dll2 | (2 GPU) GeForce GTX 1080; cc6.1 |
| titan | GeForce GTX 1080 Ti |
| dll1 | (2 GPU) GeForce GTX 1080; cc6.1 |
| dll2 | (2 GPU) GeForce GTX 1080; cc6.1 |
===== Links =====