How to save results for each outcome?

I am trying to save results for each outcome.
For example,

list = [:A, :B]
for i in 1:length(list)
    result$i = lm((@eval @formula($i ~ x1 + x2)), data)
end

So that I can save the results for each outcome like result1, result2, … ,
then export each result to csv file.
Any ideas? Thank you :slight_smile:

In Julia, you can’t assign into a data structure at an index and have both the structure and the missing slot created on the fly if they don’t exist. That’s what `result$i` is attempting in your example, which looks like R syntax to me, I think?
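(As an aside: if you do want something like `result1`, `result2` that you can look up by outcome name, a `Dict` keyed by symbol is the closest Julia idiom. A minimal sketch, where the string is just a stand-in for the fitted model object:)

```julia
# No result$i in Julia, but a Dict lets you store and retrieve
# one result per outcome name:
results = Dict{Symbol, Any}()
for sym in [:A, :B]
    results[sym] = "fit for $sym"   # stand-in for lm(...)
end
results[:A]   # look up the result for outcome :A
```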

You can either make an empty vector first, and then push! into it:

v = []
for i in 1:10
    push!(v, rand())
end

Above I used an untyped vector, a `Vector{Any}`. That’s usually bad for performance if your loop runs many times, so in this example it would be better to explicitly make an empty vector of floats:

v = Float64[]
for i in 1:10
    push!(v, rand())
end

But for this you need to know the element type in advance, which is sometimes not easy. For example, I don’t know what the result type of your `lm` call would be. So my preferred solution is usually `map`, which iterates over one or more iterables and automatically collects the result of calling a function on each element:

With your example:

list = [:A, :B]
result = map(1:length(list)) do i
    lm((@eval @formula($i ~ x1 + x2)), data)
end

The `do i` means we’re passing a function to `map` as its first argument, before `1:length(list)`. That function is called once for every element of `1:length(list)`, with the current element bound to the argument `i` inside it. It takes a moment to wrap your head around if you haven’t seen it before, but it’s a really convenient syntax, used widely in Julia wherever a function is passed as the first argument.
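A small sketch of the equivalence, using a squaring function instead of the `lm` call:

```julia
# map with an anonymous function...
a = map(x -> x^2, 1:3)

# ...is equivalent to the do-block form, which just moves that
# function out of the argument list:
b = map(1:3) do x
    x^2
end
# both give [1, 4, 9]
```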


I see you’re still using @eval to construct multiple regressions rather than terms… :wink:

I think the main issue that you will have to deal with is how to turn your regression result into a tabular format, given that you want to write out the results to csv. I assume at a minimum you’d want your table to have (1) dependent variable and (2) estimated coefficients on your covariates. In that case you can do:

julia> using DataFrames

julia> list = [:A, :B];

julia> results = DataFrame("y" => list, (["x1", "x2"] .=>  [Vector{Float64}(undef, length(list)) for _ ∈ 1:2])...)
2Γ—3 DataFrame
 Row β”‚ y       x1            x2           
     β”‚ Symbol  Float64       Float64      
─────┼────────────────────────────────────
   1 β”‚ A       1.53e-322     5.77183e-312
   2 β”‚ B       6.95011e-310  0.0

You can then go through and fill your dataframe in a loop (this is pseudocode but should be more or less correct):

julia> for depvar ∈ list
           ols_result = lm(term(depvar) ~ term(:x1) + term(:x2), data)
           results[results.y .== depvar, :x1] = coef(ols_result)[1]
           results[results.y .== depvar, :x2] = coef(ols_result)[2]
       end

@nilshg
Wow this is very helpful. Thank you!
I won’t use @eval anymore :sweat_smile:
I just have a quick question.
When I run this part,

results = DataFrame( ...

I get an error saying no method matching DataFrame.
Also, could you explain what the underscore means here? for _ in 1:2
Thank you!

Thanks for your kind explanation!
For the example part of my code,
should I replace `$i` with `list[i]`?

Ah I see, you wanted the symbols from list there. In that case, instead of interpolating i, you would interpolate list[i], therefore the expression must be $(list[i]) instead of $i. But you can have it even easier if you map over the symbols directly.

list = [:A, :B]
result = map(list) do sym
    lm((@eval @formula($sym ~ x1 + x2)), data)
end
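The same interpolation distinction shows up with plain quoted expressions in base Julia, independent of `@formula`; a small sketch:

```julia
list = [:A, :B]
i = 2

# $i splices the integer value of i into the expression:
ex1 = :($i + 1)          # :(2 + 1)

# $(list[i]) splices the symbol stored at list[i] instead:
ex2 = :($(list[i]) + 1)  # :(B + 1)
```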

Thank you very much! :smile:

Sure - my example above was incomplete, as you need using DataFrames. If you don’t want to use DataFrames, you could also construct something similar based on NamedTuples, which can also be written to CSV.
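(For reference: a vector of NamedTuples is already a table in the Tables.jl sense, so `CSV.write("results.csv", rows)` would accept it directly. A sketch with made-up coefficient values:)

```julia
# Each NamedTuple is one row; all rows share the same field names,
# which become the CSV column headers. Numbers here are made up:
rows = [(y = :A, x1 = 0.5, x2 = -1.5),
        (y = :B, x1 = 2.5, x2 = 3.5)]
rows[1].y   # fields are accessed by name
```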

I also realise I missed out the all important CSV.write("results.csv", results) as the final step after the loop.

Finally, on the `_`: that’s just a placeholder for the variable that holds the value of the current iteration (here, either 1 or 2). People tend to use `_` to signal that the value is actually not used, i.e. you could use any valid identifier here and it wouldn’t change the output of the list comprehension (we’re creating a fresh `Vector{Float64}(undef, length(list))` in every iteration step).
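To see that the name really doesn’t matter, compare:

```julia
# `_` is a throwaway name; any identifier behaves the same here.
# Each iteration step creates a fresh uninitialized vector:
v1 = [Vector{Float64}(undef, 2) for _ in 1:2]
v2 = [Vector{Float64}(undef, 2) for unused in 1:2]
# v1 and v2 both hold two 2-element Float64 vectors
```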


Thank you for your clarification!
The first part of your code works perfectly now.
The second part,

for depvar ∈ list
    ols_result = lm(term(depvar) ~ term(:x1) + term(:x2), data)
    results[results.y .== depvar, :x1] = coef(ols_result)[1]
    results[results.y .== depvar, :x2] = coef(ols_result)[2]
end

I get this error.

ERROR: MethodError: no method matching setindex!(::DataFrame, ::Float64, ::BitArray{1}, ::Symbol)

What could be the problem?
Thank you!

Sorry, my fault for writing pseudocode: the DataFrame indexing in the loop selects a one-element array on the left-hand side, so the assignment needs to be broadcasted.
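The same thing happens with a plain `Vector`, independent of DataFrames; a minimal sketch:

```julia
# Indexing with a Bool mask selects a slice, so assigning a scalar
# with plain `=` is rejected; broadcasting with `.=` fills the
# selected element(s) instead:
x = [1.0, 2.0, 3.0]
mask = x .== 2.0
x[mask] .= 99.0
# x is now [1.0, 99.0, 3.0]
```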

Here’s a full MWE (note the .= in the loop):

julia> using CSV, DataFrames, GLM

julia> data = DataFrame(x1 = rand(500), x2 = rand(500));

julia> data[!, :A] = 5 .+ 0.5*data.x1 - 1.5*data.x2 .+ randn.();

julia> data[!, :B] = 1 .+ 2.5*data.x1 + 3.5*data.x2 .+ randn.();

julia> list = [:A, :B];

julia> results = DataFrame("y" => list, (["x1", "x2"] .=>  [Vector{Float64}(undef, length(list)) for _ ∈ 1:2])...);

julia> for depvar ∈ list
           ols_result = lm(term(depvar) ~ term(:x1) + term(:x2), data)
           results[results.y .== depvar, :x1] .= coef(ols_result)[1]
           results[results.y .== depvar, :x2] .= coef(ols_result)[2]
       end

julia> results
2Γ—3 DataFrame
 Row β”‚ y       x1        x2       
     β”‚ Symbol  Float64   Float64  
─────┼────────────────────────────
   1 β”‚ A       5.17577   0.495342
   2 β”‚ B       0.975703  2.6233

Oh! I just needed a dot (.) before the equal sign (=). Now it works!
Thank you so much for your time and explanation.
I learned a lot ! :smile:
