Flux: Scalar getindex error

I am trying to do optimisation using Flux’s logitbinarycrossentropy function on a fully-convolutional network on GPU:

Flux.train!(loss, params(UNet_model), train_batch, optimiser)

where loss is defined as:

loss(x, y) = mean(logitbinarycrossentropy.(model(x) |> gpu, y))

However, I’m getting an ERROR: LoadError: scalar getindex is disallowed error. I have verified that my model runs correctly (returns the right output, etc.), and that the loss function is able to compute a value. I tried rewriting logitbinarycrossentropy as a custom loss function, but it still fails with the same error on the line update!(opt, ps, gs). I also tried using σ with binarycrossentropy instead of logitbinarycrossentropy, and got the same error.

I am using Julia 1.3.0, Flux 0.10.3, Zygote 0.4.6.

I speculate that this might have something to do with my use of cat in the model definition. Is this likely?

Here is roughly how I defined my model:

function (t::test)(x)
    enc1 = t.conv_block1[1](x)
    bn = t.bottle(enc1)
    dec1 = t.upconv_block[1](bn)
    dec1 = cat(dims=3, dec1, enc1)
    dec1 = t.conv_block[2](dec1)
    dec1 = t.conv(dec1)
end

Thanks!

Full stacktrace:

ERROR: LoadError: scalar getindex is disallowed
Stacktrace:
 [1] error(::String) at .\error.jl:33
 [2] assertscalar(::String) at C:\Users\CCL\.julia\packages\GPUArrays\1wgPO\src\indexing.jl:14
 [3] getindex at C:\Users\CCL\.julia\packages\GPUArrays\1wgPO\src\indexing.jl:54 [inlined]
 [4] _getindex at .\abstractarray.jl:1004 [inlined]
 [5] getindex at .\abstractarray.jl:981 [inlined]
 [6] hash(::CuArray{Float32,4,Nothing}, ::UInt64) at .\abstractarray.jl:2203
 [7] hash at .\hashing.jl:18 [inlined]
 [8] hashindex at .\dict.jl:168 [inlined]
 [9] ht_keyindex(::Dict{Any,Any}, ::CuArray{Float32,4,Nothing}) at .\dict.jl:282
 [10] get(::Dict{Any,Any}, ::CuArray{Float32,4,Nothing}, ::Nothing) at .\dict.jl:500
 [11] (::Zygote.var"#876#877"{Zygote.Context,IdDict{Any,Any},CuArray{Float32,4,Nothing}})(::Nothing) at C:\Users\CCL\.julia\packages\Zygote\oMScO\src\lib\base.jl:44
 [12] (::Zygote.var"#2375#back#878"{Zygote.var"#876#877"{Zygote.Context,IdDict{Any,Any},CuArray{Float32,4,Nothing}}})(::Nothing) at C:\Users\CCL\.julia\packages\ZygoteRules\6nssF\src\adjoint.jl:49
 [13] #fmap#53 at C:\Users\CCL\.julia\packages\Flux\NpkMm\src\functor.jl:37 [inlined]
 [14] (::typeof(∂(#fmap#53)))(::CuArray{Float32,4,Nothing}) at C:\Users\CCL\.julia\packages\Zygote\oMScO\src\compiler\interface2.jl:0
 [15] fmap at C:\Users\CCL\.julia\packages\Flux\NpkMm\src\functor.jl:36 [inlined]
 [16] (::typeof(∂(fmap)))(::CuArray{Float32,4,Nothing}) at C:\Users\CCL\.julia\packages\Zygote\oMScO\src\compiler\interface2.jl:0
 [17] gpu at C:\Users\CCL\.julia\packages\Flux\NpkMm\src\functor.jl:108 [inlined]
 [18] (::typeof(∂(gpu)))(::CuArray{Float32,4,Nothing}) at C:\Users\CCL\.julia\packages\Zygote\oMScO\src\compiler\interface2.jl:0
 [19] |> at .\operators.jl:854 [inlined]
 [20] (::typeof(∂(|>)))(::CuArray{Float32,4,Nothing}) at C:\Users\CCL\.julia\packages\Zygote\oMScO\src\compiler\interface2.jl:0
 [21] (::typeof(∂(loss)))(::Float32) at C:\Users\CCL\fcn\main_flux2.jl:47
 [22] #157 at C:\Users\CCL\.julia\packages\Zygote\oMScO\src\lib\lib.jl:156 [inlined]
 [23] #297#back at C:\Users\CCL\.julia\packages\ZygoteRules\6nssF\src\adjoint.jl:49 [inlined]
 [24] #17 at C:\Users\CCL\.julia\packages\Flux\NpkMm\src\optimise\train.jl:88 [inlined]
 [25] (::typeof(∂(λ)))(::Float32) at C:\Users\CCL\.julia\packages\Zygote\oMScO\src\compiler\interface2.jl:0
 [26] (::Zygote.var"#38#39"{Zygote.Params,Zygote.Context,typeof(∂(λ))})(::Float32) at C:\Users\CCL\.julia\packages\Zygote\oMScO\src\compiler\interface.jl:101
 [27] gradient(::Function, ::Zygote.Params) at C:\Users\CCL\.julia\packages\Zygote\oMScO\src\compiler\interface.jl:47
 [28] macro expansion at C:\Users\CCL\.julia\packages\Flux\NpkMm\src\optimise\train.jl:87 [inlined]
 [29] macro expansion at C:\Users\CCL\.julia\packages\Juno\f8hj2\src\progress.jl:134 [inlined]
 [30] #train!#12(::Flux.Optimise.var"#18#26", ::typeof(Flux.Optimise.train!), ::typeof(loss), ::Zygote.Params, ::Array{Tuple{CuArray{Float32,4,Nothing},CuArray{Float32,4,Nothing}},1}, ::ADAM) at C:\Users\CCL\.julia\packages\Flux\NpkMm\src\optimise\train.jl:80
 [31] train!(::Function, ::Zygote.Params, ::Array{Tuple{CuArray{Float32,4,Nothing},CuArray{Float32,4,Nothing}},1}, ::ADAM) at C:\Users\CCL\.julia\packages\Flux\NpkMm\src\optimise\train.jl:78
 [32] top-level scope at C:\Users\CCL\fcn\main_flux2.jl:83
 [33] include at .\boot.jl:328 [inlined]
 [34] include_relative(::Module, ::String) at .\loading.jl:1105
 [35] include(::Module, ::String) at .\Base.jl:31
 [36] exec_options(::Base.JLOptions) at .\client.jl:287
 [37] _start() at .\client.jl:460
in expression starting at C:\Users\CCL\fcn\main_flux2.jl:74

I am not an expert, I am still learning Flux, but the problem may be that not all operations are well-adapted to the GPU. I suspect it is the loss function, because it looks a little strange to me (I have never used the mean function there).

Have you tried:

loss(x, y) = logitbinarycrossentropy(model(x), y)

The casting to GPU must be defined up front, not in every call of loss:

model(x)= … |> gpu

The definition of the model seems very strange to me, especially the use of the cat function. Could you give a description of your model so we can put it in a more “Flux” style?

Thanks for the reply.

mean is used with logitbinarycrossentropy because this loss function returns an array.

Casting to GPU is done via model = FCN() |> gpu; what model(x) |> gpu does is move the model’s output to the GPU, not the model itself.

Because of the encoder-decoder structure of FCNs, cat is needed to concatenate the skip connections, unlike in vanilla CNN models. I’m using a struct with Flux, where conv_block, bottle etc. call Flux Chains.

I don’t think the current (logit)binarycrossentropy in Flux is very GPU friendly (bitrot and not keeping up with CuArrays/Zygote releases). Perhaps give the versions in https://github.com/FluxML/Flux.jl/pull/1150 a try?

I would guess that since you call the gpu function inside the loss function, Zygote tries to differentiate it and fails. The code is most likely not correct anyway, as you are running the forward pass on the model as-is and then transferring the output to the GPU.

Have you tried something like:

model = model |> gpu

loss(x,y) = mean(logitbinarycrossentropy.(model(x), y))

This is what I’ve been doing: model = FCN() |> gpu and I then call model when training.

Thanks, I’ll give it a try. I’ve tried upgrading to the master branch version (same error) and tried defining my own custom logitbinarycrossentropy function (same error). I’m guessing it might have something to do with cat? Not sure really

I would try updating to julia 1.4.1 and Flux 0.10.4 (this should update Zygote and CuArrays too). That should make repro’ing any issues much simpler.

Also, I had a look at the stack trace and it seems like a dict is being indexed somewhere? Do you mind posting your UNet model and/or a minimal working example? I’ve been able to use both cat and UNet.jl on the latest stable Flux/CuArrays, so it seems like something else is up.

But if model is already in the gpu then why do you want that model(x) |> gpu call in the loss? The output of the model is already in the gpu.

Have you tried removing it?

Here is what I read out from the stack trace:

Zygote is trying to compute the gradient of loss:

 [21] (::typeof(∂(loss)))(::Float32) at C:\Users\CCL\fcn\main_flux2.jl:47

To calculate the gradient of loss Zygote needs to compute the gradient of first |> and then gpu:

 [20] (::typeof(∂(|>)))(::CuArray{Float32,4,Nothing}) at 
 [18] (::typeof(∂(gpu)))(::CuArray{Float32,4,Nothing}) at 

In the same manner, to compute the gradient of gpu, Zygote needs to compute the gradient of fmap:

 [16] (::typeof(∂(fmap)))(::CuArray{Float32,4,Nothing}) at 

fmap is the function used to map all found parameters using some function (cu in this case). It uses an IdDict as a cache to avoid collecting the same parameters twice, or looping indefinitely on things which happen to be recursive.

You hit some code which Zygote can’t differentiate, related to putting or getting things in that IdDict, before you hit some other undifferentiable function like a CUDA memcopy or whatever.
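To make that concrete, here is a rough sketch (my own simplification, not Flux’s actual implementation) of the kind of cached traversal fmap does; the hypothetical name fmap_sketch is mine:

```julia
# Simplified sketch of an fmap-like traversal (not Flux's real code).
# The IdDict cache prevents visiting the same parameter array twice;
# the haskey/getindex on that cache is roughly the kind of dict access
# your stack trace shows Zygote trying to differentiate through.
function fmap_sketch(f, x, cache = IdDict())
    haskey(cache, x) && return cache[x]
    cache[x] = f(x)
end
```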

tl;dr: I don’t think that stack trace is caused by any problem inside model.

Have you removed the |> gpu after model inside the loss?

If you are still getting an error after removing the |> gpu inside the loss it is most likely a different error than the one you posted above. It would then be useful to post that new stacktrace so people can help you further.


I originally wrote the program without passing to GPU inside loss and already had the same error; it was simply one of the things I tried while debugging. I should’ve been clearer. Here is the new stack trace (after upgrading to the master branch of Flux), which is identical (I checked) both with and without passing to gpu in loss:

ERROR: LoadError: TaskFailedException:
scalar getindex is disallowed
Stacktrace:
 [1] error(::String) at .\error.jl:33
 [2] assertscalar(::String) at C:\Users\CCL\.julia\packages\GPUArrays\OXvxB\src\host\indexing.jl:41
 [3] _getindex at C:\Users\CCL\.julia\packages\GPUArrays\OXvxB\src\host\indexing.jl:96 [inlined]
 [4] getindex at .\abstractarray.jl:981 [inlined]
 [5] im2col!(::CuArray{Float32,2,CuArray{Float32,3,Nothing}}, ::CuArray{Float32,4,CuArray{Float32,5,CuArray{Float32,4,Nothing}}}, ::DenseConvDims{3,(3, 3, 1),3,32,(1, 1, 1),(1, 1, 1, 1, 0, 0),(1, 1, 1),false}) at C:\Users\CCL\.julia\packages\NNlib\FAI3o\src\impl\conv_im2col.jl:231
 [6] macro expansion at C:\Users\CCL\.julia\packages\NNlib\FAI3o\src\impl\conv_im2col.jl:53 [inlined]
 [7] (::NNlib.var"#343#threadsfor_fun#160"{CuArray{Float32,3,Nothing},Float32,Float32,CuArray{Float32,5,CuArray{Float32,4,Nothing}},CuArray{Float32,5,CuArray{Float32,4,Nothing}},Array{Float32,5},DenseConvDims{3,(3, 3, 1),3,32,(1, 1, 1),(1, 1, 1, 1, 0, 0),(1, 1, 1),false},Int64,Int64,Int64,UnitRange{Int64}})(::Bool) at .\threadingconstructs.jl:61
 [8] (::NNlib.var"#343#threadsfor_fun#160"{CuArray{Float32,3,Nothing},Float32,Float32,CuArray{Float32,5,CuArray{Float32,4,Nothing}},CuArray{Float32,5,CuArray{Float32,4,Nothing}},Array{Float32,5},DenseConvDims{3,(3, 3, 1),3,32,(1, 1, 1),(1, 1, 1, 1, 0, 0),(1, 1, 1),false},Int64,Int64,Int64,UnitRange{Int64}})() at .\threadingconstructs.jl:28
Stacktrace:
 [1] wait(::Task) at .\task.jl:251
 [2] macro expansion at .\threadingconstructs.jl:69 [inlined]
 [3] #conv_im2col!#159(::CuArray{Float32,3,Nothing}, ::Float32, ::Float32, ::typeof(NNlib.conv_im2col!), ::CuArray{Float32,5,CuArray{Float32,4,Nothing}}, ::CuArray{Float32,5,CuArray{Float32,4,Nothing}}, ::Array{Float32,5}, ::DenseConvDims{3,(3, 3, 1),3,32,(1, 1, 1),(1, 1, 1, 1, 0, 0),(1, 1, 1),false}) at C:\Users\CCL\.julia\packages\NNlib\FAI3o\src\impl\conv_im2col.jl:49
 [4] conv_im2col! at C:\Users\CCL\.julia\packages\NNlib\FAI3o\src\impl\conv_im2col.jl:30 [inlined]
 [5] #conv!#41 at C:\Users\CCL\.julia\packages\NNlib\FAI3o\src\conv.jl:53 [inlined]
 [6] conv!(::CuArray{Float32,5,CuArray{Float32,4,Nothing}}, ::CuArray{Float32,5,CuArray{Float32,4,Nothing}}, ::Array{Float32,5}, ::DenseConvDims{3,(3, 3, 1),3,32,(1, 1, 1),(1, 1, 1, 1, 0, 0),(1, 1, 1),false}) at C:\Users\CCL\.julia\packages\NNlib\FAI3o\src\conv.jl:53
 [7] #conv!#48(::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::typeof(conv!), ::CuArray{Float32,4,Nothing}, ::CuArray{Float32,4,Nothing}, ::Array{Float32,4}, ::DenseConvDims{2,(3, 3),3,32,(1, 1),(1, 1, 1, 1),(1, 1),false}) at C:\Users\CCL\.julia\packages\NNlib\FAI3o\src\conv.jl:70
 [8] conv! at C:\Users\CCL\.julia\packages\NNlib\FAI3o\src\conv.jl:70 [inlined]
 [9] #conv#89(::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::typeof(NNlib.conv), ::CuArray{Float32,4,Nothing}, ::Array{Float32,4}, ::DenseConvDims{2,(3, 3),3,32,(1, 1),(1, 1, 1, 1),(1, 1),false}) at C:\Users\CCL\.julia\packages\NNlib\FAI3o\src\conv.jl:116
 [10] conv(::CuArray{Float32,4,Nothing}, ::Array{Float32,4}, ::DenseConvDims{2,(3, 3),3,32,(1, 1),(1, 1, 1, 1),(1, 1),false}) at C:\Users\CCL\.julia\packages\NNlib\FAI3o\src\conv.jl:114
 [11] #adjoint#1833 at C:\Users\CCL\.julia\packages\Zygote\YeCEW\src\lib\nnlib.jl:26 [inlined]
 [12] adjoint at .\none:0 [inlined]
 [13] _pullback at C:\Users\CCL\.julia\packages\ZygoteRules\6nssF\src\adjoint.jl:47 [inlined]
 [14] Conv at C:\Users\CCL\.julia\packages\Flux\8xjH2\src\layers\conv.jl:137 [inlined]
 [15] _pullback(::Zygote.Context, ::Conv{2,4,typeof(identity),Array{Float32,4},Array{Float32,1}}, ::CuArray{Float32,4,Nothing}) at C:\Users\CCL\.julia\packages\Zygote\YeCEW\src\compiler\interface2.jl:0
 [16] applychain at C:\Users\CCL\.julia\packages\Flux\8xjH2\src\layers\basic.jl:36 [inlined]
 [17] _pullback(::Zygote.Context, ::typeof(Flux.applychain), ::Tuple{Conv{2,4,typeof(identity),Array{Float32,4},Array{Float32,1}},BatchNorm{typeof(relu),Array{Float32,1},Array{Float32,1},Float32},Conv{2,4,typeof(identity),Array{Float32,4},Array{Float32,1}},BatchNorm{typeof(relu),Array{Float32,1},Array{Float32,1},Float32}}, ::CuArray{Float32,4,Nothing}) at C:\Users\CCL\.julia\packages\Zygote\YeCEW\src\compiler\interface2.jl:0
 [18] Chain at C:\Users\CCL\.julia\packages\Flux\8xjH2\src\layers\basic.jl:38 [inlined]
 [19] _pullback(::Zygote.Context, ::Chain{Tuple{Conv{2,4,typeof(identity),Array{Float32,4},Array{Float32,1}},BatchNorm{typeof(relu),Array{Float32,1},Array{Float32,1},Float32},Conv{2,4,typeof(identity),Array{Float32,4},Array{Float32,1}},BatchNorm{typeof(relu),Array{Float32,1},Array{Float32,1},Float32}}}, ::CuArray{Float32,4,Nothing}) at C:\Users\CCL\.julia\packages\Zygote\YeCEW\src\compiler\interface2.jl:0
 [20] fcn at C:\Users\CCL\fcn\fcn_flux2.jl:41 [inlined]
 [21] _pullback(::Zygote.Context, ::fcn, ::CuArray{Float32,4,Nothing}) at C:\Users\CCL\.julia\packages\Zygote\YeCEW\src\compiler\interface2.jl:0
 [22] loss at C:\Users\CCL\fcn\main_flux2.jl:47 [inlined]
 [23] _pullback(::Zygote.Context, ::typeof(loss), ::CuArray{Float32,4,Nothing}, ::CuArray{Float32,4,Nothing}) at C:\Users\CCL\.julia\packages\Zygote\YeCEW\src\compiler\interface2.jl:0
 [24] adjoint at C:\Users\CCL\.julia\packages\Zygote\YeCEW\src\lib\lib.jl:168 [inlined]
 [25] _pullback at C:\Users\CCL\.julia\packages\ZygoteRules\6nssF\src\adjoint.jl:47 [inlined]
 [26] #17 at C:\Users\CCL\.julia\packages\Flux\8xjH2\src\optimise\train.jl:89 [inlined]
 [27] _pullback(::Zygote.Context, ::Flux.Optimise.var"#17#25"{typeof(loss),Tuple{CuArray{Float32,4,Nothing},CuArray{Float32,4,Nothing}}}) at C:\Users\CCL\.julia\packages\Zygote\YeCEW\src\compiler\interface2.jl:0
 [28] pullback(::Function, ::Zygote.Params) at C:\Users\CCL\.julia\packages\Zygote\YeCEW\src\compiler\interface.jl:174
 [29] gradient(::Function, ::Zygote.Params) at C:\Users\CCL\.julia\packages\Zygote\YeCEW\src\compiler\interface.jl:54
 [30] macro expansion at C:\Users\CCL\.julia\packages\Flux\8xjH2\src\optimise\train.jl:88 [inlined]
 [31] macro expansion at C:\Users\CCL\.julia\packages\Juno\f8hj2\src\progress.jl:134 [inlined]
 [32] #train!#12(::Flux.Optimise.var"#18#26", ::typeof(Flux.Optimise.train!), ::typeof(loss), ::Zygote.Params, ::Array{Tuple{CuArray{Float32,4,Nothing},CuArray{Float32,4,Nothing}},1}, ::ADAM) at C:\Users\CCL\.julia\packages\Flux\8xjH2\src\optimise\train.jl:81
 [33] train!(::Function, ::Zygote.Params, ::Array{Tuple{CuArray{Float32,4,Nothing},CuArray{Float32,4,Nothing}},1}, ::ADAM) at C:\Users\CCL\.julia\packages\Flux\8xjH2\src\optimise\train.jl:79
 [34] top-level scope at C:\Users\CCL\fcn\main_flux2.jl:84
 [35] include at .\boot.jl:328 [inlined]
 [36] include_relative(::Module, ::String) at .\loading.jl:1105
 [37] include(::Module, ::String) at .\Base.jl:31
 [38] exec_options(::Base.JLOptions) at .\client.jl:287
 [39] _start() at .\client.jl:460
in expression starting at C:\Users\CCL\fcn\main_flux2.jl:74

This is a very different error. If you compare the stack traces they show completely different things.

I think this points to the source of the new error:

 [10] conv(::CuArray{Float32,4,Nothing}, ::Array{Float32,4}, ::DenseConvDims{2,(3, 3),3,32,(1, 1),(1, 1, 

Your model is attempting to do a convolution between an array on the GPU (the CuArray) and one which is still on the CPU (the Array). I think it would have been nicer if this produced a clear error message right away, but I guess the library doesn’t want to be presumptuous and attempts the operation anyway, leading to a low-level error.
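If I’m reading it right, a minimal snippet like this (hypothetical, not your actual model) should trigger the same failure:

```julia
using Flux  # hypothetical minimal reproduction

layer = Conv((3, 3), 3 => 32)         # weights stay on the CPU
x = gpu(rand(Float32, 64, 64, 3, 1))  # input moved to the GPU

layer(x)  # CuArray input vs. CPU weights: falls back to a generic
          # code path that does scalar indexing, hence the error
```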

Looking at this line:

[14] Conv at C:\Users\CCL\.julia\packages\Flux\8xjH2\src\layers\conv.jl:137 [inlined]

I see that the second argument to the conv function is the weights.

It seems like despite the fact that you did model = FCN() |> gpu, the weights of the first conv layer (and possibly others too) were not transferred to the GPU.

The gpu function is only half-magical, so if you have wrapped the Chain or any of its layers in other functions or structs (like the test struct in your MWE), you need to make sure Flux.@functor is declared for each of those structs. See here and here.
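For example (hypothetical struct and field names, assuming Flux 0.10’s @functor macro):

```julia
using Flux

# Hypothetical wrapper struct around Flux Chains.
struct Wrapped
    conv_block
    bottle
end

# Without this declaration, gpu (which uses fmap) cannot see inside
# the struct, so the Chains' weights silently stay on the CPU.
Flux.@functor Wrapped

model = Wrapped(Chain(Conv((3, 3), 3 => 32, relu)),
                Chain(Conv((3, 3), 32 => 32, relu))) |> gpu
```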


Thank you, you were spot on! I passed the model Chains (wrapped in structs and functions, defined in another file) to the GPU and it finally worked! :blush: I definitely need more practice with stack traces!

Great!

I guess Zygote does its fair share of work obfuscating stack traces by making all lines look the same (adjoint and _pullback on almost every line), but once you learn to see through that, it is not too bad.
