Differences

This shows you the differences between two versions of the page.

--- gpu [2017/03/16 17:01]
kocmanek [How to use cluster]
+++ gpu [2017/05/04 10:37]
kocmanek [Using cluster]
@@ Line 45: / Line 45: @@
 ===== How to use cluster =====
 This section explains how to use the cluster properly.
 ==== TensorFlow Environment ====
@@ Line 66: / Line 65: @@
 ==== Using cluster ====
+Rule number one: always use the GPU queue (never log in to a machine directly via ssh). Always use qsub or qsubmit with the proper arguments.
+For testing and for interactive use of the cluster you can use qrsh (do not use it for long-running experiments, since the console is not closed when the experiment ends). The following command assigns you a GPU and creates an interactive console:
+  qrsh -q gpu.q -l gpu=1 -pty yes bash
+For running experiments you must use the qsub command:
+  qsub -q gpu.q -l gpu=1,gpu_cc_min3.5=1,gpu_ram=2G WHAT_SHOULD_BE_RUN
+A cleaner way to use the cluster is with /home/bojar/tools/shell/qsubmit:
+  qsubmit --gpumem=2G --queue="gpu.q" WHAT_SHOULD_BE_RUN
+It is recommended to use priority -100 if you are not in a hurry for the results and do not need to jump ahead of your colleagues' jobs.
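As a minimal sketch of the qsub call described above, the helper below only assembles and prints the command instead of submitting it. The function name `build_gpu_qsub` and the job script `train.sh` are hypothetical, and the `-p -100` flag assumes the scheduler is Grid Engine, where `qsub -p` sets the job priority; check the local qsub manual before relying on it.

```shell
# Sketch only: print the qsub command line rather than submitting a job.
# build_gpu_qsub and train.sh are hypothetical names used for illustration.
build_gpu_qsub() {
    gpu_ram="$1"   # requested GPU memory, e.g. 2G
    shift          # the remaining arguments are the job to run
    # -p -100 is an assumption: Grid Engine's flag for a lowered priority.
    echo "qsub -q gpu.q -l gpu=1,gpu_cc_min3.5=1,gpu_ram=${gpu_ram} -p -100 $*"
}

build_gpu_qsub 2G train.sh
```

Printing the command first makes it easy to review the resource request before an actual submission.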
+==== Basic commands ====
+  lspci
+    # is any such hardware there?
+  nvidia-smi
+    # more details, incl. running processes on the GPU
+    # nvidia-* are typically located in /usr/bin
+  watch nvidia-smi
+    # For monitoring GPU activity in a separate terminal (thanks to Jindrich Libovicky for this!)
+  nvcc --version
+    # this should report the CUDA version
+    # nvcc is typically installed in /usr/local/cuda/bin/
+  theano-test
+    # does it do anything useful at all? :-)
+    # theano-* are typically located in /usr/local/bin/
+  /usr/local/cuda/samples/1_Utilities/deviceQuery/deviceQuery
+    # shows CUDA capability etc.
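The commands above are only useful if they are installed on the machine you were assigned. A small sketch, assuming a POSIX shell, that reports which of them are on PATH (the function name `check_gpu_tools` is hypothetical):

```shell
# Sketch only: report whether each GPU-related tool can be found on PATH.
# check_gpu_tools is a hypothetical helper name.
check_gpu_tools() {
    for tool in lspci nvidia-smi nvcc; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool: $(command -v "$tool")"
        else
            echo "$tool: not found on PATH"
        fi
    done
}

check_gpu_tools
```

A tool reported as missing may still be installed in one of the typical directories mentioned above, such as /usr/local/cuda/bin/, without that directory being on PATH.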
+=== Select GPU device ===
+Use the CUDA_VISIBLE_DEVICES variable to constrain TensorFlow to compute only on the selected GPU. To use the first GPU, set (the GPU queue does this for you):
+  export CUDA_VISIBLE_DEVICES=0
+To list available devices, use:
+  /opt/cuda/samples/1_Utilities/deviceQuery/deviceQuery | grep ^Device
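Since the GPU queue already sets CUDA_VISIBLE_DEVICES, a script that may run both inside and outside the queue should avoid overwriting it. A minimal sketch of that idea follows; the helper name `select_gpu` is hypothetical and not part of the cluster tooling.

```shell
# Sketch only: pick a GPU without overriding a choice already made by the queue.
# select_gpu is a hypothetical helper name.
select_gpu() {
    # Keep an existing non-empty value; otherwise use the argument, defaulting to device 0.
    export CUDA_VISIBLE_DEVICES="${CUDA_VISIBLE_DEVICES:-${1:-0}}"
}

select_gpu 0
echo "Using GPU(s): $CUDA_VISIBLE_DEVICES"
```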
 ===== Performance tests =====
@@ Line 90: / Line 131: @@
-===== Installed toolkits =====
-//This should mention where each interesting toolkit lives (on a particular machine).//
-==== TensorFlow ====
-[[https://redmine.ms.mff.cuni.cz/projects/mmmt/repository/revisions/6a064187fc6959db9b77cf2d5350c5f4918a8067/entry/prepare_env.sh|This script]] installs TensorFlow 0.7.1 (and all other dependencies we need for Multimodal Translation) into `tf' and `tf-gpu' virtual environments. The GPU environment can be loaded by calling <code>source tf-gpu/bin/activate-gpu</code>
-OP: I created a [[https://gist.github.com/oplatek/323b63b8f116cd3d78c0f492f78cc289|script]] which installs TensorFlow 0.8 and tests whether it uses the GPU. TF is installed as either a `user` or a `global` installation, for either `python3.4` or `python2.7`.
-=== Select GPU device ===
-Use the CUDA_VISIBLE_DEVICES variable to constrain TensorFlow to compute only on the selected GPU. To use the first GPU, set:
-<code>export CUDA_VISIBLE_DEVICES=0</code>
-To list available devices, use:
-<code>/opt/cuda/samples/1_Utilities/deviceQuery/deviceQuery | grep ^Device</code>
-===== Basic commands =====
-  lspci
-    # is any such hardware there?
-  nvidia-smi
-    # more details, incl. running processes on the GPU
-    # nvidia-* are typically located in /usr/bin
-  watch nvidia-smi
-    # For monitoring GPU activity in a separate terminal (thanks to Jindrich Libovicky for this!)
-  nvcc --version
-    # this should report the CUDA version
-    # nvcc is typically installed in /usr/local/cuda/bin/
-  theano-test
-    # does it do anything useful at all? :-)
-    # theano-* are typically located in /usr/local/bin/
-  /usr/local/cuda/samples/1_Utilities/deviceQuery/deviceQuery
-    # shows CUDA capability etc.
 ===== Links =====


Institute of Formal and Applied Linguistics Wiki
