Unexpected behavior while creating an array of arrays


#1

Below is a simplified example of a function that creates an array of arrays in a loop. A while back, code similar to this tripped me up because I was unwittingly storing a reference to the same array in each component. That does not seem to be the case with the code below: the component arrays are not the same object, yet they still do not hold the values computed on their own loop iteration.

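using Statistics  # provides mean on Julia 1.x

function fun(Ntrials)
    space = range(0, stop=.9, length=5)  # linspace(0,.9,5) on Julia 0.6 and earlier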
    Nq = size(space,1)-1
    pred = fill(0.0,Nq)
    output = [fill(0.0,Nq) for t = 1:Ntrials]
    for trial = 1:Ntrials
        x = rand(100)
        for q = 1:Nq
            pred[q] = mean(x .<= space[q+1]) - mean(x .<= space[q])
        end
        println("pred     ",pred)
        pred = @. log(max(pred,.10^10))
        output[trial] = pred
        println("output[1] ",output[1])
        println()
    end
    return output
end

x = fun(4)

#print out
pred     [0.25, 0.23, 0.24, 0.22]
output[1] [-1.38629, -1.46968, -1.42712, -1.51413]

pred     [0.17, 0.2, 0.26, 0.28]
output[1] [0.17, 0.2, 0.26, 0.28]

pred     [0.3, 0.15, 0.25, 0.22]
output[1] [0.17, 0.2, 0.26, 0.28]

pred     [0.19, 0.2, 0.26, 0.28]
output[1] [0.17, 0.2, 0.26, 0.28]

dump(x)
#print out

Array{Array{Float64,1}}((4,))
  1: Array{Float64}((4,)) [0.17, 0.2, 0.26, 0.28]
  2: Array{Float64}((4,)) [0.3, 0.15, 0.25, 0.22]
  3: Array{Float64}((4,)) [0.19, 0.2, 0.26, 0.28]
  4: Array{Float64}((4,)) [-1.66073, -1.60944, -1.34707, -1.27297]

However, if I remove pred = @. log(max(pred,.10^10)), it is true that x[1] === x[2]. I can’t imagine this is intended behavior. Does anyone know what is going on?


#2

You are putting the same pred into output.


#3

Yes. However, I was expecting each array in output to have the last value of pred. Why does it not do that?


#4

If you remove the line in question, pred is always the same array (not the same content, because that changes after each iteration, but the same container).
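
A minimal sketch of that case, assuming small hard-coded values in place of rand so the result is reproducible. The function name same_container is hypothetical and does not appear in the thread:

```julia
# Hypothetical helper (not from the thread): the loop without the rebinding line.
function same_container(ntrials)
    pred = fill(0.0, 2)
    output = [fill(0.0, 2) for _ in 1:ntrials]
    for trial in 1:ntrials
        pred[1] = trial          # in-place write: same array, new contents
        pred[2] = 10 * trial
        output[trial] = pred     # stores a reference to pred, not a copy
    end
    return output
end

out = same_container(3)
println(out)                 # every entry shows the values from the last iteration
println(out[1] === out[2])   # true: both entries are the same array object
```

Because every slot of output refers to the one array that pred names, each in-place write is visible through all of them.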


#5

See, e.g.,
http://www.johnmyleswhite.com/notebook/2014/09/06/values-vs-bindings-the-map-is-not-the-territory/

This should probably be a FAQ


#6

Thanks for the link. I re-read it, but this particular case still does not make sense. Sorry if I am being dense. It just strikes me as odd that the first loop iteration prints output[1] [-1.38629, -1.46968, -1.42712, -1.51413] and every iteration thereafter prints output[1] [0.17, 0.2, 0.26, 0.28]. I figured each array in output would be the same. Specifically, in my example I expected each array to be Array{Float64}((4,)) [-1.66073, -1.60944, -1.34707, -1.27297].

When I set output[trial] = @. log(max(pred,.10^10)), that seems to produce different arrays with the correct calculations, as expected.
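
That variant can be sketched as follows, again with hard-coded values instead of rand. The name fresh_arrays is hypothetical, and 1e-10 is used as an approximate stand-in for the .10^10 floor:

```julia
# Hypothetical variant: the broadcast result is stored directly, so pred
# itself is never rebound and never stored in output.
function fresh_arrays(ntrials)
    pred = fill(0.0, 2)                      # scratch buffer, reused every iteration
    output = Vector{Vector{Float64}}(undef, ntrials)
    for trial in 1:ntrials
        pred[1] = trial / 10
        pred[2] = trial / 5
        output[trial] = log.(max.(pred, 1e-10))  # broadcasting allocates a new array
    end
    return output
end

out = fresh_arrays(3)
println(out[1] === out[2])            # false: distinct arrays
println(out[1] == log.([0.1, 0.2]))   # true: iteration 1 keeps its own values
```

Here pred stays a scratch buffer that is never stored, so later in-place writes cannot reach anything already in output. If no transformation were needed, storing copy(pred) should have the same effect.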


#7

pred is just a name (binding). You are effectively making an array that looks like

[pred, pred, pred]

so each entry refers to the same object.
(Disclaimer: I haven’t read the code in detail.)


#8

Yes. What you have in your example makes sense. However, what I have in my example is something more like [pred1,pred2,…].

I think I figured out the issue: pred = @. log(max(pred,.10^10)) rebinds pred to a newly allocated array on each iteration, so output ends up holding distinct arrays rather than [pred,pred,…]. Sorry for the noise!
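
The printout in #1 appears to follow from one more step: after the rebinding, pred names the array that was just stored in output[trial], so the in-place writes pred[q] = ... on the next iteration overwrite that stored array with raw values. A deterministic sketch of this, using a hypothetical function rebinding_trace and made-up values:

```julia
# Hypothetical, deterministic stand-in for the loop in #1 (no rand).
function rebinding_trace(ntrials)
    pred = fill(0.0, 2)
    output = [fill(0.0, 2) for _ in 1:ntrials]
    for trial in 1:ntrials
        # In-place writes into whatever array pred currently names. From the
        # second iteration on, that is the array stored in output[trial - 1].
        pred[1] = trial
        pred[2] = 10 * trial
        pred = log.(pred)       # rebinding: pred now names a brand-new array
        output[trial] = pred    # stores that new array
    end
    return output
end

out = rebinding_trace(3)
println(out[1])                        # [2.0, 20.0]: raw values from iteration 2
println(out[2])                        # [3.0, 30.0]: raw values from iteration 3
println(out[3] == log.([3.0, 30.0]))   # true: only the last entry keeps its logs
println(out[1] === out[2])             # false: each entry is a distinct array
```

This mirrors the dump in #1, where entries 1 to 3 hold the raw values from the following iteration and only entry 4 holds logs.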


#9

Yes, exactly.