Looping over variable names properly

I have a few arrays, say x, y, z, and I want to take all of their means and put them into new variables as x_mean = mean(x).
Is there a way to do this in a for loop? Something like

for a in ["x", "y", "z"]
    $a_mean = mean($a)
end

You can do it with @eval:

julia> using Statistics: mean

julia> x, y, z = rand(3), rand(3), rand(3);

julia> for a in [:x, :y, :z]
           @eval $(Symbol(a, "_mean")) = mean($a)
       end

julia> x_mean
0.7142450574816145

Most of the time this is not a great idea though and you would be better off organizing your data differently, but it depends on how you are using it.


You could build something like a Dict that maps variable names to their mean

julia> d = Dict(name=>mean(val) for (name, val) in pairs((; x,y,z)))
Dict{Symbol, Float64} with 3 entries:
  :y => 0.189552
  :z => 0.744915
  :x => 0.551346

This looks like exactly the kind of thing you would do with packages like DataFrames or InMemoryDatasets:

julia> df=DataFrame(x=rand(-50:50,10),y=rand(-5:5,10),z=rand(-10:10,10))
10×3 DataFrame
 Row │ x      y      z     
     │ Int64  Int64  Int64
─────┼─────────────────────
   1 │   -35     -3      6
   2 │    41      1     -6
   3 │    34     -3     -7
   4 │    13      2      9
   5 │   -48      5      2
   6 │    43     -2      4
   7 │    38      3      4
   8 │   -12      0     -9
   9 │   -45      3     -1
  10 │     4     -1      8

julia> Tables.columntable(combine(df,Cols(:).=>mean))
(x_mean = [3.3], y_mean = [0.5], z_mean = [1.0])


julia> copy(combine(df,Cols(:).=>mean)[1,:])
(x_mean = 3.3, y_mean = 0.5, z_mean = 1.0)


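Using @eval like this is almost never a good idea, and it will only work in global scope. In local scope (inside a function, for example), this code will start creating global variables, and it will read global variables rather than the local ones.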
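To make the scoping problem concrete, here is a minimal sketch. The function name `means_via_eval` and the values are made up for illustration; the point is only that `@eval` evaluates in the module's global scope, so it never sees the local `x`.

```julia
using Statistics: mean

x = [10.0, 20.0]              # a global x

function means_via_eval()
    x = [1.0, 2.0, 3.0]       # a local x, invisible to @eval
    for a in [:x]
        # This expands to `x_mean = mean(x)` evaluated at global scope.
        @eval $(Symbol(a, "_mean")) = mean($a)
    end
    return nothing
end

means_via_eval()

x_mean   # 15.0: the mean of the global x, stored in a new global variable
```

The local `x` (mean 2.0) is ignored entirely, and if there were no global `x` the call would throw an `UndefVarError`.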


One could also use named tuples:

x, y, z = rand(3), rand(3), rand(3)
dmean = map(mean, (;x, y, z))

# and then access the results with:
dmean.x, dmean.y, dmean.z

NB:
Learned this from the excellent book in progress: Julia for Data Analysis, by Bogumił Kamiński


So what would be a good idea?

Isn’t this one of those cases where the right answer is: “Don’t do it.” Or maybe “Don’t do this, but if you must know here’s a way”? (Personally, I’m a big fan of telling people their question is wrong, rather than providing an answer :laughing: )

Use a Dict or maybe a Vector. Don’t dynamically create variable names, it makes your code unreadable.


But actually, naming the variables does make my code readable, in my opinion.

I mean, I want to save some moments of some variables, x, y, z.
I have a vector named x of size, say, 4, to keep all its moments.
I have the same for y and z.
Now, I want to take the new observables I measure stored in x_tmp, y_tmp, z_tmp, and put them into x, y, z.

I thought about doing something like this.
Do you think it is more readable to type the entire thing?

x = zeros(4)
y = zeros(4)
z = zeros(4)

x_tmp = 1
y_tmp = 2
z_tmp = 3

for a in [:x, :y, :z]
  for i = 1:4
    eval(a)[i] += eval(Symbol(a, '_', "tmp"))^i
  end
end

I must say that I do think I have an issue with the scope of variables, and therefore the way I am doing it now is not good. But I would like to save some on typing the same thing over and over.

Perhaps the best way is to dedicate a function for it.
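A rough sketch of what such a function could look like, assuming each vector stores the running sums of the powers 1 through 4 of the measured value. The name `add_moments!` is hypothetical, not something from a package.

```julia
# Add value^i to the i-th entry of `moments`, for every index i.
function add_moments!(moments::AbstractVector, value)
    for i in eachindex(moments)
        moments[i] += value^i
    end
    return moments
end

x = zeros(4)
x_tmp = 2
add_moments!(x, x_tmp)   # x is now [2.0, 4.0, 8.0, 16.0]
```

With this, the update is a single call per quantity and no eval is needed, although the calls for x, y, and z are still written out one by one.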

I thought I kinda said that with

Most of the time this is not a great idea though and you would be better off organizing your data differently, but it depends on how you are using it.

below the example.


When you write your code like this, you are implying that the individual variable names aren’t that important, you just loop over a list of the names. But there are already data structures for this. You can do

X = [zeros(4) for _ in 1:3]
tmp = [1, 2, 3]
for i in eachindex(X, tmp)
    X[i] .+= tmp[i]
end

or

X = Dict(:x => zeros(4), :y => zeros(4), :z => zeros(4))
tmp = Dict(:x => 1, :y => 2, :z => 3)
for (key, val) in tmp
    X[key] .+= val
end

Yeah, I must have missed it. I was perhaps expecting a big red “DON’T”.

Actually, the names do have a meaning. Therefore, it is actually much more readable to use the names.

In your way of writing, the meaning goes away, and one cannot tell what the 1st, 2nd, and 3rd elements are.

Well, the meaning is also lost (or at least obscured) when you are using eval:

eval(a)[i] += eval(Symbol(a, '_', "tmp"))^i

There’s no clear ordering with just names, “x”, “y” and “z” (you may perhaps infer it from the alphabet), so the ordering is much clearer if you use an array, because then you have indices 1, 2, and 3.

If the names have significant meaning, you can keep them using a Dict. I don’t see how there is more ordering in variable names than in the same names used as keys in a Dict.


^^this^^, and structs encapsulating related derived values. Organize your data. Don’t pollute your namespace. struct fields have names, so that naming can be preserved.
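As a sketch of the struct approach, assuming the x, y, z moment vectors from earlier in the thread: the type `Moments` and the function `add_observation!` are hypothetical names chosen for this example. The field names are kept, and `fieldnames` lets you loop over them without eval.

```julia
struct Moments
    x::Vector{Float64}
    y::Vector{Float64}
    z::Vector{Float64}
end

# Convenience constructor: n zero-initialized moments per quantity.
Moments(n::Integer) = Moments(zeros(n), zeros(n), zeros(n))

# Add obs.name^i to the i-th moment of every field of `m`.
function add_observation!(m::Moments, obs::NamedTuple)
    for name in fieldnames(Moments)
        acc = getfield(m, name)   # the vector stored in this field
        val = obs[name]           # the new measurement with the same name
        for i in eachindex(acc)
            acc[i] += val^i
        end
    end
    return m
end

m = Moments(4)
add_observation!(m, (x = 1, y = 2, z = 3))
m.y   # [2.0, 4.0, 8.0, 16.0]
```

The struct itself is immutable here, but the vectors it holds can still be updated in place.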


First, let me take a more concrete example: say the first array is position, x, the second is velocity v, and the third is acceleration a. Now, these have a clear meaning.

From what I understand, you suggest I would keep my arrays in a dict.
Is there a better way to initialize it than this? (and later on to fill it?)
I mean using a for loop or something.
(Consider the case I have O(10) of these quantities).

d = Dict()
d["x"] = zeros(4)
d["v"] = zeros(4)
d["a"] = zeros(4)

julia> Dict(["x", "v", "a"] .=> [zeros(4) for _ ∈ 1:3])
Dict{String, Vector{Float64}} with 3 entries:
  "v" => [0.0, 0.0, 0.0, 0.0]
  "x" => [0.0, 0.0, 0.0, 0.0]
  "a" => [0.0, 0.0, 0.0, 0.0]

This will give you a concretely typed dictionary (yours is Dict{Any, Any}).
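A quick way to see the difference is to compare the types directly. The variable names `untyped` and `typed` are just for this illustration.

```julia
untyped = Dict()                 # no type information available
untyped["x"] = zeros(4)

typed = Dict(["x", "v", "a"] .=> [zeros(4) for _ in 1:3])

typeof(untyped)   # Dict{Any, Any}
typeof(typed)     # Dict{String, Vector{Float64}}
```

With the concrete type, the compiler knows that every value is a `Vector{Float64}`, which usually helps performance in code that reads from the dictionary.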


That construction looks somewhat contrived. I would write it as

Dict(key => zeros(4) for key in ["x", "v", "a"])

Ah yes, much nicer - not necessary to construct an array on the right hand side.

Thanks for making it more concrete.

If you are bundling together position, velocity and acceleration, presumably for some body, then I think you should consider creating a type (possibly mutable, depending on need):

struct Particle{T}
    x::T
    v::T
    a::T
end

with whichever type T is useful for you (apparently a length-4 vector?) Then you can make arrays of Particles, if you need that.

What does O(10) mean? Perhaps even some more concrete background would be useful.

If you were to use Dict, I think using symbols is perhaps more appropriate: d[:x] = zeros(4)
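Putting the Dict suggestions together, a sketch could look like the following. The names `quantities`, `moments`, and `measured` and the measurement values are assumptions for illustration; the update rule (add value^i to the i-th moment) is taken from the eval loop posted earlier.

```julia
quantities = (:x, :v, :a)

# One zero-initialized vector of 4 moments per quantity.
moments = Dict(name => zeros(4) for name in quantities)

# New measurements, keyed by the same symbols.
measured = Dict(:x => 1.0, :v => 2.0, :a => 3.0)

for (name, value) in measured
    for i in eachindex(moments[name])
        moments[name][i] += value^i
    end
end

moments[:v]   # [2.0, 4.0, 8.0, 16.0]
```

Adding another quantity then only means adding one symbol to `quantities` and one entry to `measured`.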
