Looping over variables names properly

roi.holtzman · May 4, 2022, 8:28am

I have a few arrays, say x, y, z, and I want to take all of their means and put them into new variables as x_mean = mean(x).
Is there a way to do this in a for loop? Something like

for a in ["x", "y", "z"]
    $a_mean = mean($a)
end

GunnarFarneback · May 4, 2022, 8:40am

You can do it with @eval:

julia> using Statistics: mean

julia> x, y, z = rand(3), rand(3), rand(3);

julia> for a in [:x, :y, :z]
           @eval $(Symbol(a, "_mean")) = mean($a)
       end

julia> x_mean
0.7142450574816145

Most of the time this is not a great idea though and you would be better off organizing your data differently, but it depends on how you are using it.

baggepinnen · May 4, 2022, 9:03am

You could build something like a Dict that maps variable names to their mean

julia> d = Dict(name=>mean(val) for (name, val) in pairs((; x,y,z)))
Dict{Symbol, Float64} with 3 entries:
  :y => 0.189552
  :z => 0.744915
  :x => 0.551346

rocco_sprmnt21 · May 5, 2022, 4:26pm

This looks like exactly what you would do with packages like DataFrames or InMemoryDatasets

julia> using DataFrames, Statistics, Tables

julia> df = DataFrame(x=rand(-50:50,10), y=rand(-5:5,10), z=rand(-10:10,10))
10×3 DataFrame
 Row │ x      y      z     
     │ Int64  Int64  Int64
─────┼─────────────────────
   1 │   -35     -3      6
   2 │    41      1     -6
   3 │    34     -3     -7
   4 │    13      2      9
   5 │   -48      5      2
   6 │    43     -2      4
   7 │    38      3      4
   8 │   -12      0     -9
   9 │   -45      3     -1
  10 │     4     -1      8

julia> Tables.columntable(combine(df,Cols(:).=>mean))
(x_mean = [3.3], y_mean = [0.5], z_mean = [1.0])


julia> copy(combine(df,Cols(:).=>mean)[1,:])
(x_mean = 3.3, y_mean = 0.5, z_mean = 1.0)

Henrique_Becker · May 5, 2022, 7:54pm

Using @eval like this is almost never a good idea, and it only works as intended in global scope. In a local scope, such as a function body, this code will start creating global variables.
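
A minimal sketch of the scoping issue described above. The function names `local_means` and `global_means` and the variable names are hypothetical, chosen only for illustration. The point is that `@eval` evaluates in the module's global scope, so it cannot see local variables, and any assignment it performs creates a global.

```julia
using Statistics: mean

# Hypothetical function, for illustration only.
function local_means()
    u = [1.0, 2.0, 3.0]   # local to the function
    for a in [:u]
        # @eval evaluates in the module's global scope, so the local `u` is invisible here
        @eval $(Symbol(a, "_mean")) = mean($a)
    end
end

# Fails with an UndefVarError, because there is no global `u`
err = try
    local_means()
    nothing
catch e
    e
end

# With global data the loop runs, but it defines a new *global* from inside a function
data_u = [1.0, 2.0, 3.0]
function global_means()
    for a in [:data_u]
        @eval $(Symbol(a, "_mean")) = mean($a)
    end
end
global_means()
data_u_mean   # now a global variable in the enclosing module
```

So the pattern either fails outright or silently adds globals, depending on where the data lives.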

rafael.guerra · May 5, 2022, 8:50pm

One could also use named tuples:

x, y, z = rand(3), rand(3), rand(3)
dmean = map(mean, (;x, y, z))

# and then access the results with:
dmean.x, dmean.y, dmean.z

NB:
Learned this from the excellent book in progress: Julia for Data Analysis, by Bogumił Kamiński

roi.holtzman · June 30, 2022, 7:10pm

So what would be a good idea?

DNF · June 30, 2022, 7:35pm

Isn’t this one of those cases where the right answer is: “Don’t do it.” Or maybe “Don’t do this, but if you must know here’s a way”? (Personally, I’m a big fan of telling people their question is wrong, rather than providing an answer.)

Use a Dict or maybe a Vector. Don’t dynamically create variable names; it makes your code unreadable.

roi.holtzman · June 30, 2022, 7:47pm

But actually, naming the variables does make my code readable, in my opinion.

I mean, I want to save some moments of some variables, x, y, z.
I have a vector named x of size, say, 4, to keep all its moments.
I have the same for y and z.
Now, I want to take the new observables I measure stored in x_tmp, y_tmp, z_tmp, and put them into x, y, z.

I thought about doing something like this.
Do you think it is more readable to type the entire thing?

x = zeros(4)
y = zeros(4)
z = zeros(4)

x_tmp = 1
y_tmp = 2
z_tmp = 3

for a in [:x, :y, :z]
  for i = 1:4  # one entry per moment, matching zeros(4)
    eval(a)[i] += eval(Symbol(a, '_', "tmp"))^i
  end
end

I must say that I do think I have an issue with the scope of variables, and therefore the way I am doing it now is not good. But I would like to avoid typing the same thing over and over.

Perhaps the best way is to dedicate a function for it.
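
One way the dedicated-function idea could look, as a sketch only. The helper name `add_moments!` is hypothetical and not from the thread. It adds the powers of one new observation to an accumulator vector in place, so the named variables stay as they are and no `eval` is needed.

```julia
# Hypothetical helper: add obs^1, obs^2, ... to the entries of acc, in place.
function add_moments!(acc::Vector, obs::Number)
    for i in eachindex(acc)
        acc[i] += obs^i
    end
    return acc
end

x, y, z = zeros(4), zeros(4), zeros(4)
x_tmp, y_tmp, z_tmp = 1, 2, 3

# Pair each accumulator with its new observation; the names stay visible at the call site
for (acc, obs) in zip((x, y, z), (x_tmp, y_tmp, z_tmp))
    add_moments!(acc, obs)
end
```

Because the function only mutates the vector it receives, it behaves the same in global scope and inside other functions.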

GunnarFarneback · June 30, 2022, 8:12pm

I thought I kinda said that with

Most of the time this is not a great idea though and you would be better off organizing your data differently, but it depends on how you are using it.

below the example.

DNF · June 30, 2022, 8:36pm

When you write your code like this, you are implying that the individual variable names aren’t that important, you just loop over a list of the names. But there are already data structures for this. You can do

X = [zeros(4) for _ in 1:3]
tmp = [1, 2, 3]
for i in eachindex(X, tmp)
    X[i] .+= tmp[i]
end

or

X = Dict(:x => zeros(4), :y => zeros(4), :z => zeros(4))
tmp = Dict(:x => 1, :y => 2, :z => 3)
for (key, val) in tmp
    X[key] .+= val
end

DNF · June 30, 2022, 8:39pm

Yeah, I must have missed it. I was perhaps expecting a big red “DON’T”.

roi.holtzman · June 30, 2022, 8:52pm

Actually, the names do have a meaning. Therefore, it is actually much more readable to use the names.

In your way of writing, the meaning goes away, and one cannot know what the 1st, 2nd, and 3rd elements are.

DNF · June 30, 2022, 9:04pm

Well, the meaning is also lost (or at least obscured) when you are using eval:

eval(a)[i] += eval(Symbol(a, '_', "tmp"))^i

There’s no clear ordering with just names, “x”, “y” and “z” (you may perhaps infer it from the alphabet), so the ordering is much clearer if you use an array, because then you have indices 1, 2, and 3.

If the names have significant meaning, you can keep them using a Dict. I don’t see how there is more ordering in variable names than in the same names used as keys in a Dict.

Jeff_Emanuel · June 30, 2022, 9:42pm

^^this^^, and structs encapsulating related derived values. Organize your data. Don’t pollute your namespace. struct fields have names, so that naming can be preserved.

roi.holtzman · July 1, 2022, 7:55am

First, let me take a more concrete example: say the first array is position, x, the second is velocity v, and the third is acceleration a. Now, these have a clear meaning.

From what I understand, you suggest I would keep my arrays in a dict.
Is there a better way to initialize it than this? (and later on to fill it?)
I mean using a for loop or something.
(Consider the case I have O(10) of these quantities).

d = Dict()
d["x"] = zeros(4)
d["v"] = zeros(4)
d["a"] = zeros(4)

nilshg · July 1, 2022, 8:02am

julia> Dict(["x", "v", "a"] .=> [zeros(4) for _ ∈ 1:3])
Dict{String, Vector{Float64}} with 3 entries:
  "v" => [0.0, 0.0, 0.0, 0.0]
  "x" => [0.0, 0.0, 0.0, 0.0]
  "a" => [0.0, 0.0, 0.0, 0.0]

This will give you a correctly typed dictionary (yours is Dict{Any, Any})
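
A small sketch of the typing difference mentioned above. The variable names `untyped` and `typed` are hypothetical. An empty `Dict()` has no information about its contents, so it falls back to `Any` for keys and values, while a dictionary built from typed pairs (or declared with explicit type parameters) keeps concrete types.

```julia
# Empty constructor: key and value types default to Any
untyped = Dict()
untyped["x"] = zeros(4)

# Explicit type parameters: the element types are fixed up front
typed = Dict{String, Vector{Float64}}()
typed["x"] = zeros(4)

typeof(untyped)   # Dict{Any, Any}
typeof(typed)     # Dict{String, Vector{Float64}}
```

Concrete key and value types generally let the compiler produce faster code for whatever reads from the dictionary.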

GunnarFarneback · July 1, 2022, 8:50am

That construction looks somewhat contrived. I would write it as

Dict(key => zeros(4) for key in ["x", "v", "a"])

nilshg · July 1, 2022, 8:55am

Ah yes, much nicer - not necessary to construct an array on the right hand side.

DNF · July 1, 2022, 9:55am

Thanks for making it more concrete.

If you are bundling together position, velocity and acceleration, presumably for some body, then I think you should consider creating a type (possibly mutable, depending on need):

struct Particle{T}  # T is a type parameter, so the definition runs as written
    x::T  # position
    v::T  # velocity
    a::T  # acceleration
end

with whichever type T is useful for you (apparently a length-4 vector?) Then you can make arrays of Particles, if you need that.
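
A sketch of how such a type might be used, assuming `T` is a length-4 `Vector{Float64}`. The parametric form `Particle{T}` and the convenience constructor `Particle(n)` are assumptions added for illustration. Looping over the fields uses `fieldnames` and `getfield`, so no `eval` is involved.

```julia
struct Particle{T}
    x::T   # position
    v::T   # velocity
    a::T   # acceleration
end

# Hypothetical convenience constructor: n zero-initialized entries per field
Particle(n::Integer) = Particle(zeros(n), zeros(n), zeros(n))

p = Particle(4)
p.x[1] += 1.0   # the struct is immutable, but the vectors it holds can be mutated

# Apply the same update to every field by name
for name in fieldnames(typeof(p))
    field = getfield(p, name)
    field .+= 1
end

# An array of particles, if more than one body is needed
particles = [Particle(4) for _ in 1:3]
```

The field names keep their meaning (`p.x`, `p.v`, `p.a`), while the loop over `fieldnames` avoids repeating the same line three times.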

What does O(10) mean? Perhaps even some more concrete background would be useful.

If you were to use Dict, I think using symbols is perhaps more appropriate: d[:x] = zeros(4)
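
A sketch of initializing and then filling a symbol-keyed Dict in a loop, for the case of many quantities. The list `quantities` and the measurement values in `obs` are hypothetical. The fill step follows the moment-accumulation example from earlier in the thread.

```julia
quantities = [:x, :v, :a]   # extend this list as more quantities are added

# One zero-initialized moment vector per quantity
d = Dict(name => zeros(4) for name in quantities)

# Hypothetical new measurements, one scalar per quantity
obs = Dict(:x => 1.0, :v => 2.0, :a => 3.0)

# Accumulate the first four powers of each measurement
for (name, val) in obs
    moments = d[name]
    for i in eachindex(moments)
        moments[i] += val^i
    end
end

d[:v]   # access by name
```

Adding another quantity then only means adding one symbol to `quantities` and one entry to `obs`.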
