I would like to implement one of my Python models in Julia, but I have been stuck for hours on a basic iteration problem.
Basically, I just want to iterate over each row of my DataFrame:
# Step 1: declaration of endogenous variables
columnnames = ["A","B"]
T = 100
columns = [Symbol(col) => zeros(T) for col in columnnames]
y = DataFrame(columns...)
# I am launching my iteration
for t in 1:T
    if t == 0
        # Step 2: Initial values are assigned
        y[1] = 1
    else
        # Step 3: equations
        y[t] = y[t-1] + 1
    end
end
No matter how hard I search through the different tutorials, I can’t find a way to do this simple thing in Julia.
In particular, even the first step, replacing the first row with a value, doesn’t work:
y[:1,:] = 1.0
ERROR: MethodError: no method matching setindex!(::DataFrame, ::Float64, ::Int64, ::UnitRange{Int64})
Do you have any suggestions, please?
This seems to be very easy, but your problem description is overly complicated.
Even though I think the answer you are looking for is very simple (I would give it if I were sure which one it is), I think it is best if you first go through this
and ask again.
Or just skip the Python and Country Code stuff and simply ask what you want to do as a first step in Julia. We can go step by step until you are on the right track…
Well done.
Here is a slightly more Julia-style version of the iteration (only changing column “A”). It is still unclear which column you want to change:
using DataFrames
t = 100
y = DataFrame("A"=>ones(t),"B"=>zeros(t))
for t in 2:t
    y[t,1] = y[t-1,1] + 1
end
I am not aiming for high efficiency here, just a tutorial style that is easy to follow.
Note: by convention, names starting with an uppercase letter are used for types; variable names should start with a lowercase letter.
using DataFrames
n = 100
y = DataFrame("A"=>zeros(n),"B"=>zeros(n))
for t in 1:n
    if t == 1
        y[t,1:end] .= 1
    else
        y[t,1:end] .= ( y[t-1,1:end] .+ 1 )
    end
end
Which gives the error:
ERROR: ArgumentError: broadcasting over `DataFrameRow`s is reserved
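Converting the previous row to a Vector first avoids the error:
using DataFrames
n = 100
y = DataFrame("A"=>zeros(n),"B"=>zeros(n))
for t in 1:n
    if t == 1
        y[t,1:end] .= 1
    else
        # Vector(...) turns the DataFrameRow into a plain vector that can be broadcast over
        y[t,1:end] .= ( Vector(y[t-1,1:end]) .+ 1 )
    end
end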
But I am not happy with this code. Depending on your real goal, it is probably better to process each column separately, as the columns seem to be independent of each other (but as I said, a truly performant implementation requires knowing the complete problem).
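As a rough illustration of the column-by-column idea, here is a minimal sketch. It assumes the same two zero-filled columns "A" and "B" as above and that every column follows the same recurrence; both are assumptions made for the example, not something known about the real model.

```julia
using DataFrames

n = 100
y = DataFrame("A" => zeros(n), "B" => zeros(n))

# eachcol yields the column vectors themselves, so writing into `col`
# updates the DataFrame in place.
for col in eachcol(y)
    col[1] = 1                  # initial value
    for t in 2:n
        col[t] = col[t-1] + 1   # same recurrence as in the row-wise loop
    end
end
```

Working on plain column vectors sidesteps DataFrameRow broadcasting entirely.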
This is indeed a sub-optimal scenario, but your code looks good.
The reason you have to convert to a Vector is that a DataFrameRow is designed to have an API very similar to that of a NamedTuple. NamedTuples currently do not support this kind of broadcasting, and we want to match whatever behavior is eventually decided for them.
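The NamedTuple behavior mentioned above can be seen in plain Julia without DataFrames. The sketch below uses a made-up NamedTuple `row` as a stand-in for one table row; the variable names are only illustrative.

```julia
# A NamedTuple standing in for one row of the table (values are made up).
row = (A = 1.0, B = 2.0)

# Broadcasting directly over a NamedTuple is reserved and throws an ArgumentError.
reserved = try
    row .+ 1
    false
catch err
    err isa ArgumentError
end

# Two ways around it: collect the values into a Vector,
# or use map, which keeps the field names.
as_vector = collect(row) .+ 1
as_tuple = map(x -> x + 1, row)
```

The `collect` route mirrors the `Vector(...)` conversion used for the DataFrameRow, while `map` is an alternative when the field names should be preserved.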
I would encourage you to post a more complete description of what you’re actually trying to achieve in order to avoid the danger of causing an XY problem.
In particular, it feels to me like a DataFrame isn’t necessarily the right data structure for your use case. Just because something was done in pandas doesn’t mean it has to be a DataFrame in Julia! You might be better off with a simple Array{Float64, 2}, or maybe a NamedArray, or one of the many other low- or zero-cost abstractions the Julia language offers to organise your data and algorithm.
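For comparison, a minimal sketch of the same row-by-row recurrence on a plain matrix might look like this. The sizes and the recurrence are taken from the earlier examples; treating column 1 as "A" and column 2 as "B" is an assumption made only for illustration.

```julia
n = 100
ncols = 2                 # column 1 plays the role of "A", column 2 of "B"
y = zeros(n, ncols)       # a Matrix{Float64}, i.e. Array{Float64, 2}

y[1, :] .= 1              # initial values for every column
for t in 2:n
    # @views avoids copying the previous row before broadcasting
    @views y[t, :] .= y[t-1, :] .+ 1
end
```

Rows of a matrix broadcast like any other array, so no conversion step is needed here.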
using NamedArrays
columnsnames = ["A","B"]
c = length(columnsnames)
n = 100
years = zeros(n)
start_date = 2020
years[1] = start_date
for t in 2:n
    years[t] = years[t-1] + 1
end
y = NamedArray((zeros(n,c)), (years, columnsnames))
for t in 1:n
    if t == 1
        y[t,1:end] .= 1
    else
        y[t,1:end] .= y[t-1,1:end] .+ 1
    end
end
println(y)