# How to calculate the mean of values that share the same value in another column

Hi all, I have an issue that I can't figure out how to solve:

I have a df that stores two columns that look like this:

``````
df.scores1 = [[1,2,2,3,5,6,1,2,9,2,1,6,4,2]]

df.normalized_len = [[0,0,0.1,0.1,0.2,0.3,0.4,0.5,0.5,0.6,0.7,0.8,0.9,1]]
``````

And this goes on for many rows, all of them having the same length.

What I am trying to do is get the mean values of `df.scores1` that have the same value of `df.normalized_len`, so the result should look like this:

`df.mean_val_norm = [[1.5,2.5,5,6,1,5.5,2,1,6,4,2]] `

Any help is welcome!

Thanks a lot,
Juan

I don't understand the question - could you explain how the values in your desired output `mean_val_norm` are derived?

Following the DataFrames.jl documentation, you can also build a DataFrame and group it by these values:

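``````
using DataFrames, Statistics  # Statistics provides `mean`

x = [1,2,2,3,5,6,1,2,9,2,1,6,4,2]
y = [0,0,0.1,0.1,0.2,0.3,0.4,0.5,0.5,0.6,0.7,0.8,0.9,1]
df = DataFrame(scores1=x, normalized_len=y)
# Group rows by normalized_len, then average scores1 within each group
gb = groupby(df, :normalized_len)
println(combine(gb, :scores1 => mean))
``````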

Output:

``````
11×2 DataFrame
 Row │ normalized_len  scores1_mean
     │ Float64         Float64
─────┼──────────────────────────────
   1 │            0.0           1.5
   2 │            0.1           2.5
   3 │            0.2           5.0
   4 │            0.3           6.0
   5 │            0.4           1.0
   6 │            0.5           5.5
   7 │            0.6           2.0
   8 │            0.7           1.0
   9 │            0.8           6.0
  10 │            0.9           4.0
  11 │            1.0           2.0
``````

Sorry, I just realized my post was kinda vague.

I want to get the mean values of the items in `df.scores1` that share the same value in `df.normalized_len`. So all the values that have a `normalized_len` of 0 (the first two) would get summed and divided by two (because only two values have that normalized length), and so on. Is it clearer now?
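To make that arithmetic concrete, here is a minimal sketch in plain Julia (no packages) that sums and counts the scores for each distinct `normalized_len` value and then divides. The function name `group_means` is made up for illustration, and the data is the single example row from this thread.

```julia
# Hypothetical helper (not from any package): mean of `scores` for each
# distinct value in `lens`, returned in order of first appearance.
function group_means(scores, lens)
    order = unique(lens)                      # distinct values, first-seen order
    sums = Dict(k => 0.0 for k in order)      # running sum per value
    counts = Dict(k => 0 for k in order)      # number of items per value
    for (s, k) in zip(scores, lens)
        sums[k] += s
        counts[k] += 1
    end
    return [sums[k] / counts[k] for k in order]
end

scores1 = [1,2,2,3,5,6,1,2,9,2,1,6,4,2]
normalized_len = [0,0,0.1,0.1,0.2,0.3,0.4,0.5,0.5,0.6,0.7,0.8,0.9,1]

result = group_means(scores1, normalized_len)
println(result)  # [1.5, 2.5, 5.0, 6.0, 1.0, 5.5, 2.0, 1.0, 6.0, 4.0, 2.0]
```

The first entry is (1 + 2) / 2 = 1.5 for `normalized_len == 0`, and the sixth is (2 + 9) / 2 = 5.5 for `normalized_len == 0.5`.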

Hi, someone posted the correct answer but then deleted it! Just in case anyone has the same issue, this was it:

``````
using Statistics  # provides `mean`
m = [mean(x[findall(==(u), y)]) for u in unique(y)]
``````
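In case the one-liner is hard to read, here is a sketch that breaks it into its parts, using the `x` and `y` vectors from earlier in the thread. The intermediate names `groups` and `idx` are only for illustration.

```julia
using Statistics  # standard library; provides `mean`

x = [1,2,2,3,5,6,1,2,9,2,1,6,4,2]                        # scores
y = [0,0,0.1,0.1,0.2,0.3,0.4,0.5,0.5,0.6,0.7,0.8,0.9,1]  # group labels

groups = unique(y)           # distinct labels, in order of first appearance
idx = findall(==(0.5), y)    # positions where y equals 0.5
println(idx)                 # [8, 9]
println(x[idx])              # [2, 9]

# For every distinct label, select the matching scores and average them
m = [mean(x[findall(==(u), y)]) for u in groups]
println(m)                   # [1.5, 2.5, 5.0, 6.0, 1.0, 5.5, 2.0, 1.0, 6.0, 4.0, 2.0]
```

The exact comparison `==(u)` is safe here because every `u` is taken from `y` itself. Note that `y` is scanned once per distinct label, which is fine for short vectors like these.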

In the end I used `@rtransform` like this and it worked great:

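``````
using DataFramesMeta, Statistics  # @rtransform comes from DataFramesMeta

# Row-wise: within each row, average :sum_total over each distinct :norm_length
df_t = @rtransform df_t1 :mean_pos_rep = begin
    [mean(:sum_total[findall(==(u), :norm_length)]) for u in unique(:norm_length)]
end;
``````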
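For anyone without DataFramesMeta, roughly the same row-wise computation can probably be written with plain DataFrames.jl using `transform` and `ByRow`. This is only a sketch: the helper name `row_group_means` and the two-row toy data are made up for illustration, and the column names are the ones from the snippet above.

```julia
using DataFrames, Statistics

# Hypothetical helper: per-row group means, same logic as the comprehension above
row_group_means(scores, lens) =
    [mean(scores[findall(==(u), lens)]) for u in unique(lens)]

# Toy data (assumed): every cell holds a whole vector, as in the original question
df_t1 = DataFrame(
    sum_total   = [[1, 2, 3], [4, 5, 6]],
    norm_length = [[0.0, 0.0, 1.0], [0.0, 1.0, 1.0]],
)

# ByRow applies the helper to each row's pair of vectors
df_t = transform(df_t1,
    [:sum_total, :norm_length] => ByRow(row_group_means) => :mean_pos_rep)

println(df_t.mean_pos_rep)  # [[1.5, 3.0], [4.0, 5.5]]
```

`ByRow` is what makes the function see one row's vectors at a time, which is the same role the `r` in `@rtransform` plays.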

Cheers,
Juan

But isn't the correct answer the one proposed by @JorizovdZ?

This isn't bad either, but maybe it's not the first thing that comes to mind.

It works too, but the result that I intended was the one that got deleted. But, as the answer is still posted, I'll mark it as the correct solution.