Knet and JLD question

I have been playing around with Knet, and it is great! However, I have run into a bit of trouble trying to move results from a machine with a GPU to another machine without an (NVIDIA) GPU.

Essentially, what I did was run the example here on the machine with the GPU, save the results to a JLD file via JLD.save("myfile.jld", "theta", θ, "phi", ϕ), and then move the JLD file to the computer that does not have a GPU. On this other computer, doing anything with theta or phi (other than loading them) produces an error, e.g.:

julia> using Knet; import JLD; theta, phi = JLD.load("myfile.jld", "theta", "phi");

julia> theta
4-element Array{KnetArray{Float32,N} where N,1}:
Error showing value of type Array{KnetArray{Float32,N} where N,1}:
ERROR: UndefVarError: lib not defined
Stacktrace:
...

julia> VAE.decode(theta, zeros(Float32, 100))
ERROR: MethodError: no method matching *(::KnetArray{Float32,2}, ::Array{Float32,1})
...

julia> theta[1]
500×100 KnetArray{Float32,2}:
Error showing value of type KnetArray{Float32,2}:
ERROR: UndefVarError: lib not defined
Stacktrace:
...

julia> theta[1][1, 1]
ERROR: UndefVarError: lib not defined
Stacktrace:
...

Is there any way for me to use what I saved in the JLD file to encode and decode on the machine without the GPU? Did I really just save a bunch of pointers to GPU memory without actually saving the data? What is the best way to save these results so that I can then do things like encode/decode on another machine?

BTW, if I do the training on the machine without the GPU, then the above code works fine, but the types differ: typeof(theta) == Array{Array{Float32,N} where N,1} when trained on the machine without the GPU, versus typeof(theta) == Array{KnetArray{Float32,N} where N,1} when trained on the machine with the GPU.

I thought I’d answer my own question in case anyone finds it helpful in the future. Knet has its own mechanism for file I/O. Basically, I replaced my calls to JLD.save and JLD.load with identical calls to Knet.save and Knet.load, and after that everything worked smoothly.
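For reference, a minimal sketch of the change (assuming the same keys and file name as in the question; as I understand it, Knet.save and Knet.load mirror the JLD save/load signatures and convert KnetArrays to a portable form):

# On the GPU machine, replace JLD.save with Knet.save:
using Knet
Knet.save("myfile.jld", "theta", θ, "phi", ϕ)

# On the CPU-only machine, replace JLD.load with Knet.load;
# the loaded values are then usable without a GPU:
using Knet
theta, phi = Knet.load("myfile.jld", "theta", "phi")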

I always move everything back to the CPU after training on a GPU before saving the networks, which I guess is what Knet does behind the scenes.
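If you want to do that by hand with plain JLD, here is a minimal sketch, assuming theta and phi are Vectors of KnetArrays as in the question:

# Copy each KnetArray from device memory back to a host Array before saving.
theta_cpu = map(Array, theta)   # Array(::KnetArray) copies GPU data to the host
phi_cpu   = map(Array, phi)
JLD.save("myfile.jld", "theta", theta_cpu, "phi", phi_cpu)

The saved values then load as plain Arrays and work on the CPU-only machine.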
