Differences
This shows you the differences between two versions of the page.
gpu [2017/10/17 16:39] popel [Using cluster] |
gpu [2018/03/21 15:29] ufal [Servers with GPU units] |
||
---|---|---|---|
Line 6: | Line 6: | ||
GPU cluster '' | GPU cluster '' | ||
- | | machine | + | | machine | GPU type | GPU driver version |
- | | iridium | + | | dll1 | GeForce GTX 1080 | 384.69 | 6.1 | 8 | 8114 | 250 | |
- | | titan-gpu | + | | dll2 | GeForce GTX 1080 | 387.34 | 6.1 | 7 | 8114 | 250 | |
- | | twister1; twister2; kronos | + | | dll3 | GeForce GTX 1080 Ti | 375.66 | 6.1 | 9 | 11172 | 250 | |
- | | dll1; dll2 | GeForce GTX 1080 | cc6.1 | 8| 8 GB | | | + | | dll4 | GeForce GTX 1080 Ti | 375.66 | 6.1 | 10 | 11172 | 250 | |
- | | titan | + | | dll5 | GeForce GTX 1080 Ti | 384.69 | 6.1 | 10 | 11172 | 250 | |
- | | dll3; dll4; dll5 | GeForce GTX 1080 Ti | cc6.1 | 10| 11 GB | dll3 has only 9 GPUs since 2017/ | + | | dll6 | GeForce GTX 1080 Ti | 384.69 | 6.1 | 9 | 11172 | 122 | |
- | | dll6 | GeForce GTX 1080 Ti | cc6.1 | | + | | titan | GeForce GTX 1080 | 381.22 | 6.1 | 1 | 8114 | 31 | |
+ | | twister1 | Tesla K40c | 367.48 | 3.5 | 1 | 11439 | 47 | | ||
+ | | twister2 | Tesla K40c | 367.48 | 3.5 | 1 | 11439 | 47 | | ||
+ | | titan-gpu | GeForce GTX TITAN Z | 381.22 | 3.5 | 2 | 6082 | 31 | | ||
+ | | kronos | ||
+ | | iridium | Quadro K2000 | 367.48 | 3.0 | 1 | 1998 | 504 | | ||
Desktop machines: | Desktop machines: | ||
Line 22: | Line 28: | ||
Not used at the moment: GeForce GTX 570 (from twister2) | Not used at the moment: GeForce GTX 570 (from twister2) | ||
All machines have CUDA 8.0 and should support both Theano and TensorFlow. | All machines have CUDA 8.0 and should support both Theano and TensorFlow. | ||
+ | |||
+ | [[https:// | ||
+ | |||
===== Rules ===== | ===== Rules ===== | ||
* First, read [[internal: | * First, read [[internal: | ||
* All the rules from [[:Grid]] apply, even more strictly than for CPU because there are too many GPU users and not as many GPUs available. So as a reminder: always use GPUs via '' | * All the rules from [[:Grid]] apply, even more strictly than for CPU because there are too many GPU users and not as many GPUs available. So as a reminder: always use GPUs via '' | ||
- | * Always specify the number of GPU cards (e.g. '' | + | * Always specify the number of GPU cards (e.g. '' |
- | * If you need more than one GPU card, always request as many CPU cores as GPU cards. E.g. < | + | * If you need more than one GPU card (on a single machine), always request as many CPU cores ('' |
- | * For interactive jobs, you can use '' | + | * For interactive jobs, you can use '' |
+ | * Note that the dll machines typically have 10 cards, but " | ||
===== How to use cluster ===== | ===== How to use cluster ===== | ||
Line 34: | Line 44: | ||
==== Set-up CUDA and CUDNN ==== | ==== Set-up CUDA and CUDNN ==== | ||
- | You can add following | + | You should |
CUDNN_version=6.0 | CUDNN_version=6.0 | ||
Line 47: | Line 57: | ||
export CPATH=$CUDA_DIR/ | export CPATH=$CUDA_DIR/ | ||
fi | fi | ||
+ | |||
+ | When using only TensorFlow (not Theano), this can be simplified to '' | ||
+ | |||
+ | TensorFlow 1.5 precompiled binaries need CUDA 9.0; for this you need to | ||
+ | |||
+ | export LD_LIBRARY_PATH=/ | ||
+ | |||
+ | You also need to use '' | ||
==== TensorFlow Environment ==== | ==== TensorFlow Environment ==== | ||
Line 80: | Line 98: | ||
qsubmit --gpumem=2G --queue=" | qsubmit --gpumem=2G --queue=" | ||
| | ||
- | It is recommended to use priority -100 if you are not rushing for the results and don't need to leap over your colleagues' jobs. | + | It is recommended to use priority |
==== Basic commands ==== | ==== Basic commands ==== | ||