===== Servers with GPU units =====

GPU cluster ''gpu.q'':

| machine | GPU type | GPU driver version | cc | GPU cnt | GPU RAM (MB) | machine RAM (GB) |
| dll1 | GeForce GTX 1080 | 384.69 | 6.1 | 8 | 8114 | 250 |
| dll2 | GeForce GTX 1080 | 387.34 | 6.1 | 7 | 8114 | 250 |
| dll3 | GeForce GTX 1080 Ti | 375.66 | 6.1 | 9 | 11172 | 250 |
| dll4 | GeForce GTX 1080 Ti | 375.66 | 6.1 | 10 | 11172 | 250 |
| dll5 | GeForce GTX 1080 Ti | | 6.1 | | 11172 | |
| dll6 | GeForce GTX 1080 Ti | 384.69 | 6.1 | 9 | 11172 | 122 |
| titan | GeForce GTX 1080 | 381.22 | 6.1 | 1 | 8114 | 31 |
| twister1 | Tesla K40c | 367.48 | 3.5 | 1 | 11439 | 47 |
| twister2 | Tesla K40c | 367.48 | 3.5 | 1 | 11439 | 47 |
| titan-gpu | GeForce GTX TITAN Z | 381.22 | 3.5 | 2 | 6082 | 31 |
| kronos | | | | | | |
| iridium | Quadro K2000 | 367.48 | 3.0 | 1 | 1998 | 504 |

Desktop machines:

| machine | GPU type | cc | GPU cnt | GPU RAM | note |
| victoria; arc | GeForce GT 630 | cc3.0 | 1 | 2 GB | desktop machine |
| athena | | | | | desktop machine |

Not used at the moment: GeForce GTX 570 (from twister2)

All machines have CUDA 8.0 and should support both Theano and TensorFlow.

See the [[https://ufaladm2.ufal.hide.ms.mff.cuni.cz/munin/ufal.hide.ms.mff.cuni.cz/lrc-headnode.ufal.hide.ms.mff.cuni.cz/index.html#|Munin monitoring]] for the current load of the cluster.

===== Rules =====

  * First, read [[internal:Linux network]] and [[:Grid]].
  * All the rules from [[:Grid]] apply, even more strictly than for CPU, because there are too many GPU users and not as many GPUs available. So as a reminder: always use GPUs via ''qsub'' (or ''qsubmit''), never by logging in to a machine over ssh and running jobs by hand.
  * Always specify the number of GPU cards you need (e.g. ''-l gpu=1'').
  * If you need more than one GPU card (on a single machine), always require at least as many CPU cores (''-pe smp X'') as GPU cards (see the example below this list).
  * For interactive jobs, you can use ''qrsh''.
  * Note that the dll machines typically have 10 cards each.
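
For example, a job that needs two GPU cards on one machine could be submitted like this (a sketch; ''./train.sh'' is a placeholder for your own script and the parallel environment name ''smp'' is the usual SGE default, which may differ here):

  # reserve 2 GPU cards and 2 CPU cores in the gpu.q queue;
  # run the script from the current directory and merge stdout/stderr
  qsub -q gpu.q -l gpu=2 -pe smp 2 -cwd -j y ./train.sh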

===== How to use cluster =====

==== Set-up CUDA and CUDNN ====

You should add the following commands into your ~/.bashrc:

  # CUDA 8.0 install location -- the exact path is an assumption, adjust if needed
  CUDA_DIR_OPT=/opt/cuda/8.0
  if [ -d "$CUDA_DIR_OPT" ] ; then
    CUDA_DIR=$CUDA_DIR_OPT
    export CUDA_HOME=$CUDA_DIR
    # typical Theano GPU flags (assumed values)
    export THEANO_FLAGS="mode=FAST_RUN,device=gpu,floatX=float32,cuda.root=$CUDA_HOME"
    export PATH=$PATH:$CUDA_DIR/bin
    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$CUDA_DIR/lib64
    export CPATH=$CUDA_DIR/include:$CPATH
  fi

When not using Theano, just TensorFlow, this can be simplified to exporting only ''LD_LIBRARY_PATH''.

TensorFlow 1.5 precompiled binaries need CUDA 9.0; for this you need to

  # CUDA 9.0 install location -- the exact path is an assumption, adjust if needed
  export LD_LIBRARY_PATH=/opt/cuda/9.0/lib64:$LD_LIBRARY_PATH

You also need a cuDNN version compatible with CUDA 9.0.
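
After opening a new shell, you can sanity-check the setup, for example:

  nvcc --version            # should report the expected CUDA release
  nvidia-smi                # should list the cards and the driver version
  echo $LD_LIBRARY_PATH     # should contain the CUDA lib64 directory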

==== TensorFlow Environment ====

This environment has TensorFlow 1.0 and all necessary requirements for NeuralMonkey.
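
Once you have activated the environment, a quick check that TensorFlow actually sees a GPU might look like this (a sketch, assuming TF 1.x):

  # in TF 1.x, creating a session logs which devices (CPU/GPU) were found
  python -c 'import tensorflow as tf; tf.Session()'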

==== Pytorch Environment ====

If you want to use PyTorch, there is a ready-made environment. It does rely on the CUDA and CuDNN setup above.
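
After activating that environment, you can verify that PyTorch can reach the GPU, for example:

  # prints True when PyTorch can see at least one usable GPU
  python -c 'import torch; print(torch.cuda.is_available())'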

==== Using cluster ====

As an alternative to plain ''qsub'', you can use the ''qsubmit'' wrapper:

  # ./your_script.sh is a placeholder for the command you want to run
  qsubmit --gpumem=2G --queue="gpu.q" ./your_script.sh

It is recommended to use priority -100 if you are not rushing for the results and don't need to leap over your colleagues' jobs.
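
For example, with plain ''qsub'' (a sketch; ''-p'' is the standard grid engine priority option and ''./train.sh'' is a placeholder):

  qsub -q gpu.q -l gpu=1 -p -100 ./train.sh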

==== Basic commands ====

  /...
  # shows CUDA capability etc.
  ssh dll1; ~popel/...
  # who occupies which card on a given machine

=== Select GPU device ===

The variable CUDA_VISIBLE_DEVICES controls which GPU cards a process sees and in which order.
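
For example (a minimal illustration; set the variable yourself only when you know which cards are free, e.g. during interactive work):

  export CUDA_VISIBLE_DEVICES=0     # the process will only see the first card
  export CUDA_VISIBLE_DEVICES=2,3   # or: only the third and fourth card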

To list available devices, use:
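
For example (one possibility):

  nvidia-smi -L    # lists the GPU cards installed in the machine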

GPU specs for those GPUs we have:
  * [[http://...]]

==== Individual acquisitions: NVIDIA Academic Hardware Grants ====

There is an easy way to get one high-end GPU: [[https://...|apply for an NVIDIA Academic Hardware Grant]].

Take care, however, to coordinate the grant applications a little, so that not too many arrive from UFAL within a short time: these grants are explicitly //not// intended to build GPU clusters, they are "seeding" grants meant for individual researchers.

Known NVIDIA Academic Hardware Grants:

  * Ondřej Plátek - granted (2015)
  * Jan Hajič jr. - granted (early 2016)