ANN: Docker image for CUDA packages

Hi all,

Many JuliaGPU packages are pretty hard to install, so I’ve been working a little on a Docker image which comes with some CUDA packages preinstalled: maleadt/juliagpu, requiring nvidia-docker:

$ docker pull maleadt/juliagpu
$ nvidia-docker run -it maleadt/juliagpu

The image currently comes with CUDAnative.jl and CuArrays.jl preinstalled. After pulling, the image needs a single initialization step before it is fully usable, but the image itself prompts you to do that (more details in the README). The image is rebuilt weekly, and only gets pushed if all packages pass their tests, so there’s some quality guarantee which you don’t necessarily get by just installing the same packages yourself.

I’ll probably keep this image working until packages like CUDAnative are easy to install, but if there’s more interest we might maintain it longer. It is already hooked up to the JuliaGPU CI. If you want to add other packages, just make a PR :slight_smile:


Hi @maleadt

Is this still working and up to date? Can I install it and run Flux on AWS?

I’ve been running into an issue with Docker and the NVIDIA driver, so CI has been failing lately. I’ll look into it.

EDIT: the container has been updated

   _       _ _(_)_     |  A fresh approach to technical computing
  (_)     | (_) (_)    |  Documentation:
   _ _   _| |_  __ _   |  Type "?help" for help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 0.6.2 (2017-12-13 18:08 UTC)
 _/ |\__'_|_|_|\__'_|  |  
|__/                   |  x86_64-linux-gnu

julia> Pkg.status()
3 required packages:
 - CUDAnative                    0.6.0
 - CuArrays                      0.4.0
 - GPUArrays                     0.2.2
8 additional packages:
 - Adapt                         0.2.0
 - CUDAapi                       0.4.0
 - CUDAdrv                       0.7.7
 - Compat                        0.55.0
 - LLVM                          0.5.1
 - NNlib                         0.2.3
 - Requires                      0.4.3
 - StaticArrays                  0.6.6

It does not include Flux, since that package doesn’t depend on CuArrays, but you can just install it in the container. Might want to re-commit the container after installing and precompiling Flux though (check gethostname for the current container name).
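A rough sketch of that install-and-commit workflow, written as a dry run that only prints the steps instead of executing them. The `CONTAINER` variable and the `persist_flux_steps` helper are hypothetical names used for illustration; the `local/juliagpu` tag is just one possible choice, and the `julia -e` line assumes the Julia 0.6 `Pkg` API shown above.

```shell
#!/bin/sh
# Dry-run sketch: prints the steps rather than executing them.
# CONTAINER is a placeholder (assumption); inside the container,
# gethostname() in Julia reports the actual value to use.
CONTAINER="${CONTAINER:-<container-name>}"

persist_flux_steps() {
    # Step 1 (inside the container): install Flux and load it once to precompile.
    printf '%s\n' "julia -e 'Pkg.add(\"Flux\"); using Flux'"
    # Step 2 (on the host): snapshot the running container as a local image.
    printf '%s\n' "docker commit $1 local/juliagpu"
    # Step 3 (on the host): start from the snapshot next time.
    printf '%s\n' "nvidia-docker run -it local/juliagpu"
}

persist_flux_steps "$CONTAINER"
```

Step 1 runs inside the container, while steps 2 and 3 run on the host, which is why the container name has to be looked up before leaving the REPL.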

Thanks. Yea I saw the build was failing and wanted to check in.

Is there a specific AWS AMI that would be best to use?

No idea. Maybe @tk3369 can give some details on his experiment.

I used this Amazon image: ubuntu/images/hvm-ssd/ubuntu-xenial-16.04-amd64-server-20170414 (ami-efd0428f)

I didn’t write down all the steps, but I do remember that I had to install several packages, e.g. the NVIDIA drivers. I might have even installed the latest version off the web, but then later realized that was unnecessary and conflicted with an existing one. Also, allocate at least 30GiB of disk (or more, depending on your problem size) for the instance. I ran short on disk a couple of times and ended up spending more time fixing that.

Once I got over the installation hurdle, it became a bit smoother; at least the simple examples worked. I didn’t go further than that, however.

Thank you.

Were the NVIDIA drivers installed using Docker? Approximately how long did the whole process take?

No, the drivers were installed with apt-get. It took me 3-4 hours messing with several different images to get it right.

Very helpful information, thanks.

The image now also includes Knet (with GPU support).

Upon opening a REPL, there’s now also a hint on how to commit the current state, making it possible to install/precompile packages and save that state:

If you want changes to persist (eg. after installing a package), run:
$ docker commit 1cfbc878f4f8 local/juliagpu
and use the local/juliagpu container instead.

A new version will be tagged soon.

I’ve changed the way this container works, using volumes to store data persistently instead of making users tag their own local/juliagpu container. This should make it much easier to keep state, update the container, etc.

I’ve also added a /data volume for importing data from the host system (see the README).
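As a minimal sketch of importing host data through that volume, assuming Docker's standard `-v HOST:CONTAINER` bind-mount flag is the intended mechanism (the README is the authority on the exact invocation). `HOST_DIR` and `run_with_data_cmd` are hypothetical names, and the script only prints the command rather than running it.

```shell
#!/bin/sh
# Dry-run sketch: prints the command instead of running it.
# HOST_DIR is a hypothetical host directory (assumption);
# /data is the volume mentioned in the post.
HOST_DIR="${HOST_DIR:-$HOME/datasets}"

run_with_data_cmd() {
    # -v HOST:CONTAINER is Docker's standard bind-mount flag.
    printf 'nvidia-docker run -it -v %s:/data maleadt/juliagpu\n' "$1"
}

run_with_data_cmd "$HOST_DIR"
```

Files placed in the host directory should then be visible under /data inside the container.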