I have a nested for loop, which should result in a vector of vectors. Unfortunately, I’m only getting the end result, rather than each iteration. I’m probably missing something obvious about nested for loops; minimum (not actually) working example below.
using DataFrames
# Replicate Minimum Working Data
#generate DataFrame names
dfnames = ["a", "b", "c", "d", "e", "f", "g", "h", "max", "years"]
#generate column Years data
years = collect(1:10)
#generate Vector DataFrames
dfvector = []
for i in years
    dfvector = push!(dfvector, DataFrame(hcat(rand(10,9), years), dfnames))
end
#Identify the maximum value for the first dataframe
maxvaluesyr1 = zeros(length(years))
for i in years
    maxvaluesyr1[i] = maximum(dfvector[1][dfvector[1].years .== i,:].max)
end
#Identify the maximum value for the second dataframe
maxvaluesyr2 = zeros(length(years))
for i in years
    maxvaluesyr2[i] = maximum(dfvector[2][dfvector[2].years .== i,:].max)
end
The above is what I would like, but I need to loop the “for i in years” for loop over each dataframe. I tried the below example, but I kept receiving the last j loop and not the first 9. Any help is incredibly appreciated and just let me know if I can clarify further!
# Attempt 1: push! after the inner loop
maxvalues = zeros(length(years))
maxvaluesvector = []
for j=1:length(dfvector)
    for i in years
        maxvalues[i] = maximum(dfvector[j][dfvector[j].years .== i,:].max)
    end
    maxvaluesvector = push!(maxvaluesvector, maxvalues)
end
# Attempt 2: push! inside the inner loop
maxvalues = zeros(length(years))
maxvaluesvector = []
for j=1:length(dfvector)
    for i in years
        maxvalues[i] = maximum(dfvector[j][dfvector[j].years .== i,:].max)
        maxvaluesvector = push!(maxvaluesvector, maxvalues)
    end
end
# Attempt 3: preallocate with repeat, assign inside the inner loop
maxvalues = zeros(length(years))
maxvaluesvector = repeat([maxvalues], length(dfvector))
for j=1:length(dfvector)
    for i in years
        maxvalues[i] = maximum(dfvector[j][dfvector[j].years .== i,:].max)
        maxvaluesvector[j] = maxvalues
    end
end
# Attempt 4: preallocate with repeat, assign after the inner loop
maxvalues = zeros(length(years))
maxvaluesvector = repeat([maxvalues], length(dfvector))
for j=1:length(dfvector)
    for i in years
        maxvalues[i] = maximum(dfvector[j][dfvector[j].years .== i,:].max)
    end
    maxvaluesvector[j] = maxvalues
end
The line maxvaluesvector = push!(maxvaluesvector, maxvalues) appends this vector maxvalues to maxvaluesvector (by the way, the assignment is not necessary: you can write push!(v, ...) instead of v = push!(v, ...)).
The problem is that this always appends the same vector (the same container). You need to allocate a new vector for each j.
This is probably the issue. push! is pushing the same vector to maxvaluesvector every time, so every time you edit it you’re editing all the entries simultaneously. Here’s a simple example:
julia> results = []
Any[]
julia> x = [0]
1-element Vector{Int64}:
 0
julia> for i in 1:3
           x[1] = i
           push!(results, x) # Pushing the *same* vector every time
       end
julia> results
3-element Vector{Any}:
 [3]
 [3]
 [3]
If you want your results to be different vectors, then you need to make that explicit. One easy way in this case is to copy when you push!:
julia> results = []
Any[]
julia> for i in 1:3
           x[1] = i
           push!(results, copy(x))
       end
julia> results
3-element Vector{Any}:
 [1]
 [2]
 [3]
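The repeat([maxvalues], length(dfvector)) attempts in the question seem to run into the same thing: as far as I can tell, repeat copies the references, not the inner vectors, so every slot points at one array. A small sketch of that, with made-up names (template, shared, independent) that are not from the original code:

```julia
# `template`, `shared` and `independent` are hypothetical names for this sketch.
template = zeros(2)

# repeat builds a new outer vector, but each slot refers to the same inner vector.
shared = repeat([template], 3)
shared[1][1] = 5.0          # changes what every slot sees

# A comprehension calls zeros(2) once per slot, so the inner vectors are distinct.
independent = [zeros(2) for _ in 1:3]
independent[1][1] = 5.0     # only the first inner vector changes
```

After running this, shared[2] and shared[3] also start with 5.0, while independent[2] and independent[3] are still all zeros.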
Thanks @sijo, and @rdeits. I understand overwriting the initial vector is the expected behaviour and I expected that too. I didn’t expect it wouldn’t append the vector for each iteration of the outer loop, which could be thought of as each total inner loop.
I thought nested for loops followed logic like:
Iterate over each inner loop value [i] and store the results e.g. in a vector for the first outer loop value [j]
push! would store that first total inner loop given the first outer loop value [j]
The outer loop value [j] changes to [j] + 1 for example and the inner loop repeats and overwrites the initial storage (i.e. the first vector).
push! then appends the overwritten vector to the outer loop vector, which becomes a vector of vectors.
It’s clear the logic is false, but can you help me understand where? If my thoughts aren’t clear above, just let me know how I can clarify!
The problem is that the overwritten vector you are saving is always the same object, not a new one. You have two solutions:
Make the line maxvalues = zeros(length(years)) the first line of the outer loop, so a new object is created each time.
Change the last line of the outer loop to push!(maxvaluesvector, copy(maxvalues)) so you save a copy of the vector.
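Here is a rough sketch of both options. To keep it runnable without DataFrames, I am assuming a stand-in where each "dataframe" is a NamedTuple with only a years and a max column; the function names maxes_fresh and maxes_copy are invented for this example.

```julia
# Stand-in data (an assumption for this sketch): each entry mimics one DataFrame
# with only its `years` and `max` columns, so no packages are needed.
years = collect(1:10)
data = [(years = years, max = rand(length(years))) for _ in 1:3]

# Solution 1: allocate a fresh vector at the top of the outer loop.
function maxes_fresh(data, years)
    out = Vector{Vector{Float64}}()
    for d in data
        maxvalues = zeros(length(years))      # new object for every dataset
        for i in years
            maxvalues[i] = maximum(d.max[d.years .== i])
        end
        push!(out, maxvalues)
    end
    return out
end

# Solution 2: reuse one buffer, but push a copy of it.
function maxes_copy(data, years)
    out = Vector{Vector{Float64}}()
    maxvalues = zeros(length(years))          # single buffer, overwritten each time
    for d in data
        for i in years
            maxvalues[i] = maximum(d.max[d.years .== i])
        end
        push!(out, copy(maxvalues))           # snapshot of the current contents
    end
    return out
end

a = maxes_fresh(data, years)
b = maxes_copy(data, years)
```

Both should give the same vector of vectors, one distinct inner vector per dataset. With the real data, d.max[d.years .== i] would be replaced by the DataFrame indexing from the question.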
If you do neither, all positions of maxvaluesvector refer to the same object, which keeps being changed until the last iteration. You can check this by making a further change (setting the first element to zero, for example) to one of the vectors inside maxvaluesvector: the change shows up in all inner vectors instead of just that position.
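That check can be sketched with a small toy loop (the names aliased, all_same_object and first_elements are made up for the example):

```julia
# Toy version of the buggy pattern: one vector, pushed three times.
maxvalues = zeros(3)
aliased = []
for j in 1:3
    maxvalues .= j              # overwrite the contents in place
    push!(aliased, maxvalues)   # stores a reference, not a snapshot
end

# Change a single position through the first entry only.
aliased[1][1] = 0.0

# === tests object identity: every entry is the very same vector.
all_same_object = all(v -> v === maxvalues, aliased)

# The change made through aliased[1] is visible through every entry.
first_elements = [v[1] for v in aliased]
```

Here all_same_object is true and first_elements is [0.0, 0.0, 0.0], even though only aliased[1] was modified.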