Knet and JLD question

I have been playing around with Knet, and it is great! However, I have run into a bit of trouble trying to move results from a machine with a GPU to another machine without an (NVIDIA) GPU.

Essentially, what I did was run the example here on the machine with the GPU, save the results to a JLD file via JLD.save("myfile.jld", "theta", θ, "phi", ϕ), and then move the JLD file to the computer that does not have a GPU. On this other computer, doing anything with theta or phi (other than loading them) produces an error, e.g.:

julia> using Knet; import JLD; theta, phi = JLD.load("myfile.jld", "theta", "phi");

julia> theta
4-element Array{KnetArray{Float32,N} where N,1}:
Error showing value of type Array{KnetArray{Float32,N} where N,1}:
ERROR: UndefVarError: lib not defined
Stacktrace:
...

julia> VAE.decode(theta, zeros(Float32, 100))
ERROR: MethodError: no method matching *(::KnetArray{Float32,2}, ::Array{Float32,1})
...

julia> theta[1]
500×100 KnetArray{Float32,2}:
Error showing value of type KnetArray{Float32,2}:
ERROR: UndefVarError: lib not defined
Stacktrace:
...

julia> theta[1][1, 1]
ERROR: UndefVarError: lib not defined
Stacktrace:
...

Is there any way for me to use what I saved in the JLD file to encode and decode on the machine without the GPU? Did I really just save a bunch of pointers to GPU memory without actually saving the data? What is the best way to save these results so that I can then do things like encode/decode on another machine?

BTW, if I do the training on the machine without the GPU, then the above code works fine, but the types differ: typeof(theta) == Array{Array{Float32,N} where N,1} when trained on the machine without the GPU, versus typeof(theta) == Array{KnetArray{Float32,N} where N,1} when trained on the machine with the GPU.

I thought I’d answer my own question in case anyone finds it helpful in the future. Knet has its own mechanism for file I/O. Basically, I replaced my calls to JLD.save and JLD.load with identical calls to Knet.save and Knet.load, and after that everything worked smoothly.
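For reference, a minimal sketch of the change (assuming the same keys and file name as in the question; as I understand it, Knet.save and Knet.load mirror the JLD save/load signatures and convert KnetArrays to a portable form):

# On the GPU machine, replace JLD.save with Knet.save:
using Knet
Knet.save("myfile.jld", "theta", θ, "phi", ϕ)

# On the CPU-only machine, replace JLD.load with Knet.load;
# the loaded values are then usable without a GPU:
using Knet
theta, phi = Knet.load("myfile.jld", "theta", "phi")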

I always move everything back to the CPU after training on a GPU before saving the networks, which I guess is what Knet does behind the scenes.
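If you want to do that by hand with plain JLD, here is a minimal sketch, assuming theta and phi are Vectors of KnetArrays as in the question:

# Copy each KnetArray from device memory back to a host Array before saving.
theta_cpu = map(Array, theta)   # Array(::KnetArray) copies GPU data to the host
phi_cpu   = map(Array, phi)
JLD.save("myfile.jld", "theta", theta_cpu, "phi", phi_cpu)

The saved values then load as plain Arrays and work on the CPU-only machine.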
