Ubuntu 16.04 and CUDA #4430

Closed
guillermo-jimenez opened this issue Apr 25, 2016 · 16 comments

@guillermo-jimenez

Hi all,

I started working with Theano recently and have run into a problem using it on a freshly installed Ubuntu Mate 16.04 system. I have an NVIDIA Quadro K2200M video card, which is CUDA-capable and correctly installed, as the nvidia-smi output shows:

+------------------------------------------------------+                       
| NVIDIA-SMI 361.42     Driver Version: 361.42         |                       
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Quadro K2200M       Off  | 0000:01:00.0     Off |                  N/A |
| N/A   31C    P8    N/A /  N/A |    212MiB /  2047MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0      1090    G   /usr/lib/xorg/Xorg                             200MiB |
|    0      7931    G   /usr/lib/firefox/firefox                         1MiB |
+-----------------------------------------------------------------------------+

I first installed NVIDIA-CUDA-toolkit from the Ubuntu Mate repo (getting version 7.5.18) and then ran the GPU test code provided on www.deeplearning.net with the usual flags THEANO_FLAGS='device=gpu,floatX=float32', but the following error arose:

['nvcc', '-shared', '-O3', '-m64', '-Xcompiler', '-DCUDA_NDARRAY_CUH=c72d035fdf91890f3b36710688069b2e,-DNPY_NO_DEPRECATED_API=NPY_1_7_API_VERSION,-fPIC,-fvisibility=hidden', '-Xlinker', '-rpath,/home/bee/.theano/compiledir_Linux-4.4--generic-x86_64-with-Ubuntu-16.04-xenial-x86_64-2.7.11+-64/cuda_ndarray', '-I/usr/local/lib/python2.7/dist-packages/theano/sandbox/cuda', '-I/usr/local/lib/python2.7/dist-packages/numpy/core/include', '-I/usr/include/python2.7', '-I/usr/local/lib/python2.7/dist-packages/theano/gof', '-o', '/home/bee/.theano/compiledir_Linux-4.4--generic-x86_64-with-Ubuntu-16.04-xenial-x86_64-2.7.11+-64/cuda_ndarray/cuda_ndarray.so', 'mod.cu', '-L/usr/lib', '-lcublas', '-lpython2.7', '-lcudart']
ERROR (theano.sandbox.cuda): Failed to compile cuda_ndarray.cu: ('nvcc return status', 1, 'for cmd', 'nvcc -shared -O3 -m64 -Xcompiler -DCUDA_NDARRAY_CUH=c72d035fdf91890f3b36710688069b2e,-DNPY_NO_DEPRECATED_API=NPY_1_7_API_VERSION,-fPIC,-fvisibility=hidden -Xlinker -rpath,/home/bee/.theano/compiledir_Linux-4.4--generic-x86_64-with-Ubuntu-16.04-xenial-x86_64-2.7.11+-64/cuda_ndarray -I/usr/local/lib/python2.7/dist-packages/theano/sandbox/cuda -I/usr/local/lib/python2.7/dist-packages/numpy/core/include -I/usr/include/python2.7 -I/usr/local/lib/python2.7/dist-packages/theano/gof -o /home/bee/.theano/compiledir_Linux-4.4--generic-x86_64-with-Ubuntu-16.04-xenial-x86_64-2.7.11+-64/cuda_ndarray/cuda_ndarray.so mod.cu -L/usr/lib -lcublas -lpython2.7 -lcudart')
WARNING (theano.sandbox.cuda): CUDA is installed, but device gpu is not available  (error: cuda unavailable)

I have also tried using the flag device=cuda, as suggested in Issue 4384. At first I thought it was my OS's fault, as installing NVIDIA-CUDA-toolkit from the official Ubuntu Mate repo did not create any /usr/local/cuda-X.Y folder, so I could not follow the instructions given in "Using the GPU". Because of that, I installed the NVIDIA-CUDA-toolkit following a guide to install CUDA in Ubuntu 16.04, and I could then install everything in the right folder, as suggested in the aforementioned guide to using the GPU. Nevertheless, exactly the same error arose. I have reason to believe the problem lies in Theano's implementation rather than in a bad Ubuntu integration between the GPU and the NVIDIA-CUDA-toolkit: when I installed CUDA from the official NVIDIA webpage, I could build and run any CUDA sample without trouble, but Theano still didn't seem to find the GPU.

I have also tried to follow any possible piece of advice given here and in the Theano Google Group (running sudo nvidia-smi, running an example before trying to compute anything, etc.).

I tried (with my limited programming skills) to track down the problem, and it seems to occur when compiling cuda_ndarray in the theano.sandbox.cuda module. The strangest thing is that even when installing the CUDA toolkit from the repo (which installs nvcc in /usr/bin for some unknown reason), nvcc_compiler.is_nvcc_available() returns True, so the CUDA compiler is found. Yet, although nvcc seems able to compile CUDA code, it cannot compile cuda_ndarray.
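
For anyone debugging the same failure: the log above shows only the nvcc return status, not the compiler's own message. One way to see that message is to re-run the logged command by hand and capture its output. Below is a minimal Python sketch of that idea. The helper name run_and_report is hypothetical (it is not part of Theano), Python 3 is assumed, and the demo uses the Python interpreter as a stand-in command so that it runs on a machine without CUDA.

```python
import subprocess
import sys


def run_and_report(cmd):
    """Run cmd (a list of arguments); return (exit status, combined stdout+stderr)."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    out, _ = proc.communicate()
    return proc.returncode, out.decode("utf-8", "replace")


if __name__ == "__main__":
    # Stand-in command: a child Python process that fails the way a compiler would.
    # Replace it with the logged argument list (['nvcc', '-shared', ...]) to see
    # what nvcc itself prints.
    demo = [sys.executable, "-c",
            "import sys; sys.stderr.write('error: demo\\n'); sys.exit(1)"]
    status, log = run_and_report(demo)
    print("exit status:", status)
    print(log)
```

Note that mod.cu in the logged command is a relative path, so the real command would presumably have to be run from the directory that contains that file.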

Any advice?

@gokul-uf
Contributor

Could you try downloading and installing the CUDA toolkit from the NVIDIA
website and then installing Theano again?

S. Gokula Krishnan, Fourth-year Undergraduate student
Department of Computer Science
Birla Institute of Technology & Science, Pilani
Pilani Campus, Rajasthan, INDIA - 333031

@guillermo-jimenez
Author

guillermo-jimenez commented Apr 25, 2016

@gokul-uf Thank you for the quick response!

Done installing. The CUDA samples now compile with no errors, but I get the same error when running the code for testing the GPU. I reinstalled Theano using sudo pip install theano -I

Edit: the "new" error output using the package provided by NVIDIA is:

['nvcc', '-shared', '-O3', '-m64', '-Xcompiler', '-DCUDA_NDARRAY_CUH=c72d035fdf91890f3b36710688069b2e,-DNPY_NO_DEPRECATED_API=NPY_1_7_API_VERSION,-fPIC,-fvisibility=hidden', '-Xlinker', '-rpath,/home/bee/.theano/compiledir_Linux-4.4--generic-x86_64-with-Ubuntu-16.04-xenial-x86_64-2.7.11+-64/cuda_ndarray', '-I/usr/local/lib/python2.7/dist-packages/theano/sandbox/cuda', '-I/usr/local/lib/python2.7/dist-packages/numpy/core/include', '-I/usr/include/python2.7', '-I/usr/local/lib/python2.7/dist-packages/theano/gof', '-o', '/home/bee/.theano/compiledir_Linux-4.4--generic-x86_64-with-Ubuntu-16.04-xenial-x86_64-2.7.11+-64/cuda_ndarray/cuda_ndarray.so', 'mod.cu', '-L/usr/lib', '-lcublas', '-lpython2.7', '-lcudart']
ERROR (theano.sandbox.cuda): Failed to compile cuda_ndarray.cu: ('nvcc return status', 1, 'for cmd', 'nvcc -shared -O3 -m64 -Xcompiler -DCUDA_NDARRAY_CUH=c72d035fdf91890f3b36710688069b2e,-DNPY_NO_DEPRECATED_API=NPY_1_7_API_VERSION,-fPIC,-fvisibility=hidden -Xlinker -rpath,/home/bee/.theano/compiledir_Linux-4.4--generic-x86_64-with-Ubuntu-16.04-xenial-x86_64-2.7.11+-64/cuda_ndarray -I/usr/local/lib/python2.7/dist-packages/theano/sandbox/cuda -I/usr/local/lib/python2.7/dist-packages/numpy/core/include -I/usr/include/python2.7 -I/usr/local/lib/python2.7/dist-packages/theano/gof -o /home/bee/.theano/compiledir_Linux-4.4--generic-x86_64-with-Ubuntu-16.04-xenial-x86_64-2.7.11+-64/cuda_ndarray/cuda_ndarray.so mod.cu -L/usr/lib -lcublas -lpython2.7 -lcudart')
WARNING (theano.sandbox.cuda): CUDA is installed, but device gpu is not available  (error: cuda unavailable)

@gokul-uf
Contributor

The URL you linked is for testing the GPU with the gpuarray backend. Did you install that?
Otherwise, I think you may have to purge the NVIDIA drivers (apt-get purge instead of apt-get remove) so that everything is back to a clean state, and then install both the drivers and the CUDA toolkit from the NVIDIA website.

Another user had reported a similar issue some time ago. Theano/libgpuarray#19

@nouiz
Member

nouiz commented Apr 25, 2016

Did you reboot after installing the cuda drivers? This is needed. Also be
sure to use the corresponding nvcc version. If they get mixed, it creates
problems sometimes.

@guillermo-jimenez
Author

I am sorry for the delay in my response.

@gokul-uf I incorrectly linked the code for testing the libgpuarray backend, but I was actually testing the code in the "CUDA Backend" section. I may be mistaken about the installation process, but I'm not installing libgpuarray because I think it's not mandatory, my OS version is not a recommended one, and I don't have much experience building from source. Is it a must to install that library in order to use the GPU?

@nouiz Hi! Thanks for the support. I did reboot after installing the CUDA drivers. I also ran nvcc --version and everything seems to be running correctly.

@guillermo-jimenez
Author

guillermo-jimenez commented Apr 26, 2016

@gokul-uf Also, I'll try to follow the reinstall procedure as you recommended, I'll keep you posted. Thanks!

EDIT: I did sudo apt-get purge nvidia-* and reinstalled but the same error code is thrown by Theano.

@guillermo-jimenez
Author

guillermo-jimenez commented Apr 26, 2016

Update: while I waited for updates I used another recently released Deep Learning benchmark and I managed to run code with GPU support so I'd say it's definitely a Theano issue.

Update: It seems it's not.

@nouiz
Member

nouiz commented Apr 26, 2016

There is no problem with Theano. I tried it, and we need to work around 2 problems that are independent of Theano.

  1. nvcc doesn't support the default g++ version. Run the following to fix this:

sudo apt-get install g++-4.9

sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-4.9 20
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-5 10

sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-4.9 20
sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-5 10

sudo update-alternatives --install /usr/bin/cc cc /usr/bin/gcc 30
sudo update-alternatives --set cc /usr/bin/gcc

sudo update-alternatives --install /usr/bin/c++ c++ /usr/bin/g++ 30
sudo update-alternatives --set c++ /usr/bin/g++

  2. There is a bug in the libc version. You can work around it by using this extra Theano flag
    (see Fix for glibc 2.23 #4369):

nvcc.flags=-D_FORCE_INLINES
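
To check whether the first workaround is needed on a given machine, the host compiler version can be compared against the newest one nvcc accepts. The Python sketch below is only an illustration: the 4.9 limit is an assumption taken from the g++-4.9 downgrade above (for the CUDA 7.5 toolkit discussed in this thread), and the helper names are hypothetical.

```python
import subprocess

# Assumption: the CUDA toolkit in this thread accepts host GCC up to 4.9.
# Adjust this tuple for other CUDA releases.
MAX_SUPPORTED_GCC = (4, 9)


def parse_version(text):
    """Turn the output of `g++ -dumpversion` (e.g. "5.3.1") into (major, minor)."""
    parts = text.strip().split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0
    return (major, minor)


def host_compiler_ok(version_text, max_supported=MAX_SUPPORTED_GCC):
    """Return True if the given compiler version is not newer than max_supported."""
    return parse_version(version_text) <= max_supported


def installed_version(compiler="g++"):
    """Ask the compiler on PATH for its version string (needs g++ installed)."""
    return subprocess.check_output([compiler, "-dumpversion"]).decode("ascii")


# Example with literal version strings (no compiler required to run this part):
for sample in ("4.9.3", "5.3.1"):
    verdict = "ok" if host_compiler_ok(sample) else "too new for nvcc"
    print(sample, "->", verdict)
```

After the update-alternatives commands, host_compiler_ok(installed_version()) should report True if g++ now resolves to the 4.9 series.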

@guillermo-jimenez
Author

guillermo-jimenez commented Apr 26, 2016

@nouiz Thank you very much! Everything works perfectly now.

Closing the issue and appreciating the help of @nouiz and @gokul-uf.

@davidsvaughn

It seems this issue is closed, but I just wanted to note another possible (though untested) solution: forcing CUDA to work with gcc-5 by editing '/usr/local/cuda/include/host_config.h'.
See this post: https://www.pugetsystems.com/labs/articles/NVIDIA-CUDA-with-Ubuntu-16-04-beta-on-a-laptop-if-you-just-cannot-wait-775/

@guillermo-jimenez
Author

@davidsvaughn Hi! Thanks for the support.

Regarding the information you provide: I followed that same link to install CUDA on my computer, and I edited the file you mention, but the result (in my case) was the same.

@nouiz
Member

nouiz commented Apr 28, 2016

I added a commit to a PR that updates our installation instructions:

8f5bd5f

@guillermo-jimenez
Author

@nouiz Thanks! It might be convenient to add that the procedure was only tested installing the NVIDIA package, instead of installing CUDA from the Ubuntu Repositories.

@purew

purew commented May 17, 2016

Is there any chance of Theano soon working with the already packaged nvidia-361-driver and CUDA 7.5 in Ubuntu 16.04?

PyCUDA seems to work with the ubuntu-supplied packages as seen in https://gist.github.com/PureW/9b10181950a188d7ec24184a497345f5 while theano replies with "No device found".

EDIT: Ahh, it seems to work with the solution
THEANO_FLAGS=floatX=float32,device=gpu,nvcc.flags=-D_FORCE_INLINES
(observe nvcc.flags=-D_FORCE_INLINES) found here #4369
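
The same flags can also be set from inside a Python script, as long as this happens before Theano is imported, since Theano reads THEANO_FLAGS at import time. A minimal sketch follows; the helper name build_theano_flags is hypothetical, and the import is left commented out so the snippet runs without Theano installed.

```python
import os


def build_theano_flags(options):
    """Join (key, value) pairs into the comma-separated THEANO_FLAGS format."""
    return ",".join("{}={}".format(key, value) for key, value in options)


flags = build_theano_flags([
    ("floatX", "float32"),
    ("device", "gpu"),
    ("nvcc.flags", "-D_FORCE_INLINES"),  # the workaround from #4369
])

# Must be set before the first `import theano` in the process.
os.environ["THEANO_FLAGS"] = flags
print(flags)

# import theano  # uncomment on a machine with Theano and CUDA installed
```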

@orionr

orionr commented May 24, 2016

Just for the record I was able to get this to work perfectly on Ubuntu 16.04 (with packaged drivers and CUDA) by updating ~/.theanorc to be

[global]
device=gpu
floatX=float32

[nvcc]
flags=-D_FORCE_INLINES

Then ran the script at http://deeplearning.net/software/theano/tutorial/using_gpu.html#using-gpu to confirm.
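
If you would rather generate that file than edit it by hand, something like the following Python 3 sketch should work. It uses the standard configparser module; the function name render_theanorc is hypothetical, and the snippet only prints the text instead of overwriting ~/.theanorc.

```python
import configparser
import io


def render_theanorc(settings):
    """Render a dict of {section: {option: value}} as .theanorc text."""
    parser = configparser.ConfigParser()
    # Keep option names case-sensitive so that floatX is not lowercased.
    parser.optionxform = str
    for section, options in settings.items():
        parser[section] = options
    buffer = io.StringIO()
    parser.write(buffer, space_around_delimiters=False)
    return buffer.getvalue()


theanorc_text = render_theanorc({
    "global": {"device": "gpu", "floatX": "float32"},
    "nvcc": {"flags": "-D_FORCE_INLINES"},
})
print(theanorc_text)

# To install it (this overwrites any existing file):
# import os
# with open(os.path.expanduser("~/.theanorc"), "w") as handle:
#     handle.write(theanorc_text)
```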

@darthdeus

I can confirm that the above works for me, though getting rid of the deprecation warning takes one small change, from gpu to cuda, as follows:

[global]
device=cuda
floatX=float32

[nvcc]
flags=-D_FORCE_INLINES
