
Institute of Formal and Applied Linguistics Wiki



Differences

This shows you the differences between two versions of the page.

  * gpu [2017/10/12 13:42] ufal [How to use cluster]
  * gpu [2017/10/17 16:37] popel
Line 23:
All machines have CUDA 8.0 and should support both Theano and TensorFlow.
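To verify what a particular machine offers, a quick check (run it inside a ''qrsh'' GPU session, see the rules below; it assumes TensorFlow and Theano are installed in your Python environment) is:
<code>
# CUDA toolkit version installed on the machine
nvcc --version
# cards visible to the driver
nvidia-smi
# does TensorFlow see the GPU? (lists local devices, including any GPUs)
python -c "from tensorflow.python.client import device_lib; print(device_lib.list_local_devices())"
# which device is Theano configured to use?
python -c "import theano; print(theano.config.device)"
</code>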
  
-=== Disk space ===
-All the GPU machines are at Malá Strana (not at Troja), so you should not use ''/lnet/tspec/work/'', but you should use:
-''/lnet/spec/work/'' (alias ''/net/work/'') - Lustre disk space at Malá Strana
-''/net/cluster/TMP'' - NFS hard disk for temporary files, so slower than Lustre for most tasks
-''/net/cluster/SSD'' - also NFS, but faster than TMP because of SSD
-''/COMP.TMP'' - local (for each machine) space for temporary files (use it instead of ''/tmp''; over-filling ''/COMP.TMP'' should not halt the system).
-
-=== Individual acquisitions: NVIDIA Academic Hardware Grants ===
-
-There is an easy way to get one high-end GPU: [[https://developer.nvidia.com/academic_gpu_seeding|ask NVIDIA for an Academic Hardware Grant]]. All it takes is writing a short grant application (at most ~2 hrs of work from scratch; if you have a GAUK, ~15 minutes of copy-pasting). Due to the GPU housing issues (mainly rack space and cooling), Milan F. said we should request the Tesla-line cards (as of 2017; check with Milan about this issue). If you want to have a look at an application, feel free to ask at hajicj@ufal.mff.cuni.cz :)
-
-Take care, however, to coordinate the grant applications a little, so that not too many arrive from UFAL within a short time: these grants are explicitly //not// intended to build GPU clusters, they are "seeding" grants meant for researchers to try out GPUs (and fall in love with them, and buy a cluster later). If you are planning to submit the hardware grant, have submitted one, or have already been awarded one, please add yourself here.
-
-Known NVIDIA Academic Hardware Grants:
-
-  * Ondřej Plátek - granted (2015)
-  * Jan Hajič jr. - granted (early 2016)

 +===== Rules =====
 +  * First, read [[internal:Linux network]] and [[:Grid]].
 +  * All the rules from [[:Grid]] apply, even more strictly than for CPU jobs, because there are more GPU users than available GPUs. So as a reminder: always use GPUs via ''qsub'' (or ''qrsh''), never via ''ssh''. You can ssh to any machine, e.g. to run ''nvidia-smi'' or ''htop'', but not to start computing on a GPU. Don't forget to specify your RAM requirements, e.g. ''-l mem_free=8G,act_mem_free=8G,h_vmem=12G''.
 +  * Always specify the number of GPU cards (e.g. ''gpu=1''), the minimal CUDA capability you need (e.g. ''gpu_cc_min3.5=1'') and your GPU memory requirements (e.g. ''gpu_ram=2G''). Thus e.g. <code>qsub -q gpu.q -l gpu=1,gpu_cc_min3.5=1,gpu_ram=2G</code>
 +  * If you need more than one GPU card, always request as many CPU cores (''-pe smp'') as GPU cards. E.g. <code>qsub -q gpu.q -l gpu=4,gpu_cc_min3.5=1,gpu_ram=7G -pe smp 4</code>
 +  * For interactive jobs, you can use ''qrsh'', but make sure to end your job as soon as you don't need the GPU (a combined batch-submission sketch follows this list). E.g. <code>qrsh -q gpu.q -l gpu=1,gpu_ram=2G -pty yes bash</code>
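Putting these rules together, a complete batch submission might look like the following sketch (the resource values and the script name ''train.sh'' are illustrative placeholders; adjust them to your job):
<code>
# single-GPU job with CUDA capability, GPU RAM and main RAM requirements spelled out
qsub -q gpu.q \
     -l gpu=1,gpu_cc_min3.5=1,gpu_ram=2G \
     -l mem_free=8G,act_mem_free=8G,h_vmem=12G \
     -cwd -j y \
     train.sh
</code>
''-cwd'' runs the job in the submission directory and ''-j y'' merges stderr into stdout; both are standard Grid Engine options.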
  
===== How to use cluster =====
- 
-This section explains how to use the cluster properly.
  
==== Set-up CUDA and CUDNN ====
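A rough sketch of what the set-up amounts to, assuming a standard CUDA 8.0 toolkit layout (all paths below are placeholders, not the cluster's actual locations):
<code>
# placeholder paths - replace with the actual CUDA/cuDNN install locations
export CUDA_HOME=/usr/local/cuda-8.0
export PATH=$CUDA_HOME/bin:$PATH
export LD_LIBRARY_PATH=$CUDA_HOME/lib64:$LD_LIBRARY_PATH
# if cuDNN is unpacked separately, add its lib64 directory as well
export LD_LIBRARY_PATH=/path/to/cudnn/lib64:$LD_LIBRARY_PATH
</code>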
Line 176 (older revision) / Line 160 (newer revision):
GPU specs for those GPUs we have:
  * [[http://www.nvidia.com/content/PDF/kepler/Tesla-K40-Active-Board-Spec-BD-06949-001_v03.pdf|Tesla K40c]]
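To check which card (and how much GPU memory) a particular machine actually has, ''nvidia-smi'' can be queried directly:
<code>
# list the installed cards
nvidia-smi -L
# model name and total memory in CSV form
nvidia-smi --query-gpu=name,memory.total --format=csv
</code>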
 +
 +==== Individual acquisitions: NVIDIA Academic Hardware Grants ====
 +
 +There is an easy way to get one high-end GPU: [[https://developer.nvidia.com/academic_gpu_seeding|ask NVIDIA for an Academic Hardware Grant]]. All it takes is writing a short grant application (at most ~2 hrs of work from scratch; if you have a GAUK, ~15 minutes of copy-pasting). Due to the GPU housing issues (mainly rack space and cooling), Milan F. said we should request the Tesla-line cards (as of 2017; check with Milan about this issue). If you want to have a look at an application, feel free to ask at hajicj@ufal.mff.cuni.cz :)
 +
 +Take care, however, to coordinate the grant applications a little, so that not too many arrive from UFAL within a short time: these grants are explicitly //not// intended to build GPU clusters, they are "seeding" grants meant for researchers to try out GPUs (and fall in love with them, and buy a cluster later). If you are planning to submit the hardware grant, have submitted one, or have already been awarded one, please add yourself here.
 +
 +Known NVIDIA Academic Hardware Grants:
 +
 +  * Ondřej Plátek - granted (2015)
 +  * Jan Hajič jr. - granted (early 2016)
 +
 +
 +
