Reading from a KnetArray within a GPU kernel function

Is it possible to use KnetArrays from within a custom GPU kernel function? The following MWE fails:

using Knet 
using GPUArrays

knArray = KnetArray(rand(4,4))

function kernel(state, knArray)
    temp = knArray[1]
    return
end

gpu_call(kernel, CuArray(knArray), (knArray,), 1)

Stacktrace:
 [1] _nextind_str at strings/string.jl:141
 [2] nextind at strings/string.jl:137
 [3] _nextind_str at strings/string.jl:141
 [4] _split at strings/util.jl:325
 [5] env_override_minlevel at logging.jl:419
 [6] current_logger_for_env at logging.jl:383
 [7] #find_library#1 at <home>/.julia/packages/CUDAapi/K94wY/src/discovery.jl:37
 [8] find_cuda_library at <home>/.julia/packages/CUDAapi/K94wY/src/discovery.jl:184
 [9] getErrorString at <home>/.julia/packages/Knet/LjPts/src/gpu.jl:350
 [10] _unsafe_copy! at <home>/.julia/packages/Knet/LjPts/src/karray.jl:347
 [11] kernel at <home>/Projects/ML/interface-reconstruction/ML/mwe.jl:7

Tracing through the stacktrace, I see that reading from the array makes this call:

@cudart(cudaMemcpy,(Cptr,Cptr,Csize_t,UInt32),
          pointer(dest,doffs), pointer(src,soffs), n*sizeof(T), 2)

I’m not exactly sure what this is doing, but it looks like it is trying to copy the value with a separate cudaMemcpy call (possibly launching another kernel) rather than reading it in place, which would explain why it fails inside my kernel.
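
For reference, the trailing 2 in that call appears to be a cudaMemcpyKind value. Below is a minimal sketch for decoding it, assuming the numbering given in the CUDA runtime API documentation. The Julia enum name CudaMemcpyKind is made up here for illustration and is not part of Knet or any CUDA package:

```julia
# Hypothetical Julia mirror of the CUDA runtime's cudaMemcpyKind enumeration.
# The integer values are assumed to match the CUDA runtime API documentation.
@enum CudaMemcpyKind begin
    cudaMemcpyHostToHost = 0
    cudaMemcpyHostToDevice = 1
    cudaMemcpyDeviceToHost = 2
    cudaMemcpyDeviceToDevice = 3
    cudaMemcpyDefault = 4
end

# Decode the last argument of the @cudart(cudaMemcpy, ...) call above.
kind = CudaMemcpyKind(2)
println(kind)
```

If that reading is right, the call is a device-to-host copy issued through the CUDA runtime, which is normally made from host code rather than from inside a kernel.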

I suspect I am misusing KnetArrays, but I can’t find a way to avoid writing a custom kernel in my particular case, and that kernel needs to operate on data stored in a KnetArray. The kernel is part of the final layer of a CNN.

Thanks for any help you can give me.