| twister1; twister2; kronos | Tesla K40c | cc3.5 |
| dll1; dll2 | GeForce GTX 1080 | cc6.1 |
| titan | GeForce GTX 1080 | cc6.1 |
| dll3; dll4; dll5 | GeForce GTX 1080 Ti | cc6.1 | 10 | 11 GB | dll3 has only 9 GPUs since 2017/07 |
| dll6 | GeForce GTX 1080 Ti | cc6.1 |
Desktop machines:
All machines have CUDA 8.0 and should support both Theano and TensorFlow.
===== Rules =====

  * First, read [[internal:...]]
  * All the rules from [[:Grid]] apply, even more strictly than for CPU, because there are too many GPU users and not as many GPUs available. So as a reminder: always...
  * Always specify the number of GPU cards (e.g. ''...'')
  * If you need more than one GPU card, always request as many CPU cores as GPU cards. E.g. <code>...</code>
  * For interactive jobs, you can use ''...''
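The per-card rules above can be sketched as a submission helper. This is a hypothetical example: the queue name ''gpu*'', the resource name ''gpu='', and the parallel environment ''smp'' are assumptions, not the grid's confirmed names; check the truncated examples above for the real ones.

```shell
# Build a submission command that follows the rule above: request as many
# CPU cores (-pe smp) as GPU cards (-l gpu=...).
# NOTE: queue/resource names here are illustrative assumptions.
gpus=2
cmd="qsub -q 'gpu*' -l gpu=$gpus -pe smp $gpus train.sh"
echo "$cmd"
```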
===== How to use cluster =====
- | |||
- | In this section will be explained how to use cluster properly. | ||
==== Set-up CUDA and CUDNN ====
This environment has TensorFlow 1.0 and all necessary requirements for NeuralMonkey.
+ | |||
+ | ==== Pytorch Environment ==== | ||
+ | |||
+ | If you want to use pytorch, there is a ready-made environment in | ||
+ | |||
+ | / | ||
+ | | ||
+ | It does rely on the CUDA and CuDNN setup above. | ||
==== Using cluster ====
  /...
  # shows CUDA capability etc.
  ssh dll1; ~popel/...
  # who occupies which card on a given machine
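The occupancy listing above can feed a simple selection step. A sketch of picking the card with the most free memory; the canned ''$sample'' stands in for real output of ''nvidia-smi --query-gpu=index,memory.free --format=csv,noheader,nounits'' (a standard nvidia-smi query), so on a GPU machine replace it with that command:

```shell
# Pick the GPU index with the most free memory.
# $sample mimics `nvidia-smi --query-gpu=index,memory.free --format=csv,noheader,nounits`
sample='0, 512
1, 10240
2, 4096'
free_gpu=$(printf '%s\n' "$sample" | sort -t, -k2 -nr | head -n1 | cut -d, -f1)
echo "$free_gpu"
```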
=== Select GPU device ===
The variable CUDA_VISIBLE_DEVICES selects which GPU cards your job sees, e.g.:

  export CUDA_VISIBLE_DEVICES=0
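Several cards can be listed, comma-separated; note that inside the process CUDA renumbers the visible cards starting from 0:

```shell
# Restrict this shell (and anything launched from it) to physical cards 0 and 2;
# inside the process CUDA renumbers them as devices 0 and 1.
export CUDA_VISIBLE_DEVICES=0,2
echo "visible: $CUDA_VISIBLE_DEVICES"
```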
To list available devices, use:
GPU specs for those GPUs we have:
  * [[http://...]]

==== Individual acquisitions: NVIDIA Academic Hardware Grants ====

There is an easy way to get one high-end GPU: [[https://...]]

Take care, however, to coordinate the grant applications a little, so that not too many arrive from UFAL within a short time: these grants are explicitly //not// intended to build GPU clusters, they are "..."

Known NVIDIA Academic Hardware Grants:

  * Ondřej Plátek - granted (2015)
  * Jan Hajič jr. - granted (early 2016)