Institute of Formal and Applied Linguistics Wiki


Differences

This shows you the differences between two versions of the page.

gpu [2018/04/16 11:35]
kocmanek [Performance tests]
gpu [2018/06/12 13:50]
machacek.dominik [Servers with GPU units]
Line 4: Line 4:
  
 ===== Servers with GPU units =====
-GPU cluster ''gpu.q'' at Malá Strana:
+GPU cluster ''gpu-ms.q'' at Malá Strana:
 | machine | GPU type | GPU driver version | [[https://en.wikipedia.org/wiki/CUDA#GPUs_supported|cc]] | GPU cnt | GPU RAM (GB) | machine RAM (GB)|
-| dll1 |  GeForce GTX 1080 |  384.69 |  6.1 |  8 |  8 |  250 |
-| dll2 |  GeForce GTX 1080 |  387.34 |  6.1 |  8 |  8 |  250 |
-| dll3 |  GeForce GTX 1080 Ti |  375.66 |  6.1 |  9 |  11 |  250 |
-| dll4 |  GeForce GTX 1080 Ti |  375.66 |  6.1 |  10 |  11 |  250 |
-| dll5 |  GeForce GTX 1080 Ti |  384.69 |  6.1 |  10 |  11 |  250 |
-| dll6 |  GeForce GTX 1080 Ti |  384.69 |  6.1 |  9 |  11 |  122 |
-| gpu |  GeForce GTX TITAN Z |  381.22 |  3.5 |  2 |  6 |  31 |
-| iridium |  Quadro K2000 |  367.48 |  3.0 |  1 |  2 |  504 |
+| dll1 |  GeForce GTX 1080 |  396.24 |  6.1 |  8 |  8 |  249 |
+| dll2 |  GeForce GTX 1080 |  396.24 |  6.1 |  8 |  8 |  249 |
+| dll3 |  GeForce GTX 1080 Ti |  396.24 |  6.1 |  9 |  11 |  249 |
+| dll4 |  GeForce GTX 1080 Ti |  396.24 |  6.1 |  10 |  11 |  249 |
+| dll5 |  GeForce GTX 1080 Ti |  396.24 |  6.1 |  10 |  11 |  249 |
+| dll6 |  GeForce GTX 1080 Ti |  396.24 |  6.1 |  9 |  11 |  123 |
+
+To be migrated to new cluster:
+
+| titan-gpu |  GeForce GTX TITAN Z |  381.22 |  3.5 |  2 |  6 |  31 |
 | kronos |  GeForce GTX 1080 Ti |  384.81 |  6.1 |  1 |  11 |  125 |
 | titan |  GeForce GTX 1080 |  381.22 |  6.1 |  1 |  8 |  31 |
-| twister1 |  Tesla K40c |  367.48 |  3.5 |  1 |  11 |  47 |
-| twister2 |  Quadro P5000 |  367.48 |  6.|  1 |  17 |  47 |
+| twister1 |  Tesla K40c |  |  |  1 | 11 |  47 |
+| twister2 |  Tesla K40c |  384.81 |  3.|  1 |  11 |  47 |
 
 Desktop machines:
Line 64: Line 67:
 You also need to use ''qsub -q gpu.q@dll[256]'' because only those machines have drivers which support CUDA 9.
 
-**Testing configuration (so far on twister2 only)**
+**THE NEW CLUSTER (SGE 8.1.9)**
 
-Multiple versions of ''cuda'' and ''cudnn'' can be accessed in ''/opt''.
-System default version for both libraries is configured in ''/etc/ld.so.conf.d/cuda.conf'' as:
+Multiple versions of ''cuda'' can be accessed in ''/opt/cuda''. **Compared to the old cluster there is a difference in setting the CUDA_DIR_OPT variable!!**
 
-  /opt/cuda/lib64
-  /opt/cuda/extras/CUPTI/lib64
-  /opt/cudnn/lib64
-
-Actual version used depends on the link in ''/opt''. For example:
-
-  ls -l /opt
-  ...
-  lrwxrwxrwx 1 root root  8 dub  9 12:30 cuda -> cuda-9.0
-  lrwxrwxrwx 1 root root  9 dub  9 12:32 cudnn -> cudnn-7.1
-  ...
-
-This means that the system is using ''cuda 9.0'' and ''cudnn 7.1''.
-
-If system default version does not work for you, you can set library path from your ''~/.bashrc''.
+You need to set library path from your ''~/.bashrc'':
 
+  CUDNN_version=7.0
+  CUDA_version=9.0
+  CUDA_DIR_OPT=/opt/cuda/$CUDA_version
+  if [ -d "$CUDA_DIR_OPT" ] ; then
+    CUDA_DIR=$CUDA_DIR_OPT
+    export CUDA_HOME=$CUDA_DIR
+    export THEANO_FLAGS="cuda.root=$CUDA_HOME,device=gpu,floatX=float32"
+    export PATH=$PATH:$CUDA_DIR/bin
+    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$CUDA_DIR/cudnn/$CUDNN_version/lib64:$CUDA_DIR/lib64
+    export CPATH=$CUDA_DIR/cudnn/$CUDNN_version/include:$CPATH
+  fi
  
+  * When using only TensorFlow (not Theano), this can be simplified to ''export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/cuda/9.0/lib64:/opt/cuda/9.0/cudnn/7.0/lib64''.
 
+  * There is no default, so you always need to set ''LD_LIBRARY_PATH'' explicitly.
 
+  * Note that each ''cudnn'' build is compiled for a specific version of ''cuda''. If you need a specific version of ''cudnn'', look in ''/opt/cuda/$CUDA_version/cudnn/'' to see whether it is available for the given ''$CUDA_version''.
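
To see which ''cudnn'' versions exist for a given CUDA version before editing ''~/.bashrc'', a small helper along the following lines may be useful. This is only a sketch: the function name ''list_cudnn_versions'' is hypothetical (it is not provided on the cluster), and it assumes the directory layout ''/opt/cuda/$CUDA_version/cudnn/$CUDNN_version/lib64'' implied by the paths in this section.

```shell
# Hypothetical helper (not part of the cluster setup): print the cuDNN
# versions installed for one CUDA version, assuming the layout
#   <cuda_root>/<cuda_version>/cudnn/<cudnn_version>/lib64
list_cudnn_versions() {
  local cuda_version=$1
  local cuda_root=${2:-/opt/cuda}   # default root taken from this page
  local cudnn_dir="$cuda_root/$cuda_version/cudnn"
  local d

  if [ ! -d "$cudnn_dir" ]; then
    echo "no cudnn directory for CUDA $cuda_version under $cuda_root" >&2
    return 1
  fi

  # Report only the subdirectories that actually contain a lib64 directory.
  for d in "$cudnn_dir"/*/; do
    [ -d "${d}lib64" ] && basename "$d"
  done
  return 0
}
```

For example, ''list_cudnn_versions 9.0'' prints one version per line, and the chosen value can then be used as ''CUDNN_version''. The optional second argument exists only so that the function can be tried against a different root directory.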
  
  
Line 106: Line 108:
 And then you can activate your environment:
 
-  source activate tf1
-  source activate tf1cpu
+  source activate tf18
+  source activate tf18cpu
 
-This environment has TensorFlow 1.0 and all necessary requirements for NeuralMonkey.
+This environment has TensorFlow 1.8.0 and all necessary requirements for NeuralMonkey.
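
After activating an environment, one quick way to confirm that it provides the expected TensorFlow version is sketched below. The helper name ''check_tf_version'' is hypothetical, and it assumes that the interpreter of the activated environment is reachable as ''python''; only the standard ''tf.__version__'' attribute is used.

```shell
# Hypothetical helper (not part of the wiki): compare the TensorFlow version
# seen by a given Python interpreter with the expected one.
check_tf_version() {
  local expected=$1
  local python_cmd=${2:-python}   # assumption: the environment's interpreter
  local actual

  # tf.__version__ is the standard TensorFlow version string.
  actual=$("$python_cmd" -c 'import tensorflow as tf; print(tf.__version__)' 2>/dev/null) || {
    echo "TensorFlow could not be imported with $python_cmd" >&2
    return 2
  }

  if [ "$actual" = "$expected" ]; then
    echo "OK: TensorFlow $actual"
  else
    echo "Mismatch: expected $expected, found $actual" >&2
    return 1
  fi
}
```

A typical use would be ''source activate tf18 && check_tf_version 1.8.0''. The second argument is optional and names a different interpreter to test.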
  
 ==== Pytorch Environment ====
