I’m looking to add a column to my DataFrame combo2 that holds the rank of DK_points within each Pos (position) group, using the following code:
sort!(combo2, [:Pos, :DK_points], rev=[true, true])
dk_rank = []
for pos in groupby(combo2, :Pos)
append!(dk_rank, 1:size(pos, 1))
end
insertcols!(combo2, 6, DK_rank=dk_rank)
However, one particular value for Pos, “QB”, gives unusual results with the ranking starting at 14 rather than the desired 1.
by(combo2, :Pos, y->DataFrame(minRank = minimum(y[:,:DK_rank])))
│ Row │ Pos    │ minRank │
│     │ String │ Int64   │
├─────┼────────┼─────────┤
│ 1   │ QB     │ 14      │
│ 2   │ RB     │ 1       │
│ 3   │ WR     │ 1       │
│ 4   │ TE     │ 1       │
│ 5   │ DST    │ 1       │
I’m very confused about why this would happen.
I’m also wondering whether there is a better way to add rankings for several columns in a data set, computed within each group of another column.
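First, here is an easier way to do it:
by(combo2, :Pos, y -> (DK_rank=axes(y, 1),))
if your data frame is already sorted, or
by(combo2, :Pos, y -> (DK_rank=sortperm(y, :DK_points, rev=true),))
if it is not sorted yet.
The trailing comma makes each return value a NamedTuple, so the new column is named DK_rank rather than x1.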
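One caveat about the sortperm variant: sortperm returns the permutation that would sort the rows, which equals the rank only when the rows are already in order. On unsorted data, the rank is the inverse of that permutation (invperm). Below is a minimal sketch of the idea using only Base Julia. The vectors pos and points are made-up stand-ins for the :Pos and :DK_points columns, and group_ranks is a hypothetical helper, not part of DataFrames.

```julia
# Hypothetical sample data standing in for combo2's :Pos and :DK_points columns.
pos    = ["QB", "RB", "QB", "WR", "RB", "QB"]
points = [11.0, 22.0, 25.1, 9.3, 14.2, 18.5]

# Rank within each group (1 = highest score); the input need not be sorted.
function group_ranks(groups, values)
    ranks = zeros(Int, length(values))
    for g in unique(groups)
        idx = findall(==(g), groups)              # rows belonging to group g
        order = sortperm(values[idx], rev=true)   # permutation that sorts descending
        ranks[idx] = invperm(order)               # inverse permutation = rank of each row
    end
    return ranks
end

dk_rank = group_ranks(pos, points)
# dk_rank == [3, 1, 1, 1, 2, 2]
```

For the three QB rows here, sortperm alone gives [2, 3, 1], while the ranks are [3, 1, 2], so the two differ as soon as the data is not already sorted.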
Now the problem you encounter is most likely related to sorting (but I am not sure). Can you please give the result of the following operation (or share your source data frame):
by(combo2, :Pos, y->DataFrame(len=nrow(y), rankrange=extrema(y[:,:DK_rank]), isok=issorted(y[:,:DK_rank])))
Additionally - I have just checked your code on some random data and I could not reproduce the problem.
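│ Row │ Pos    │ len   │ rankrange │ isok  │
│     │ String │ Int64 │ Tuple…    │ Bool  │
├─────┼────────┼───────┼───────────┼───────┤
│ 1   │ QB     │ 39    │ (14, 52)  │ true  │
│ 2   │ RB     │ 53    │ (1, 56)   │ false │
│ 3   │ WR     │ 56    │ (1, 39)   │ false │
│ 4   │ TE     │ 52    │ (1, 53)   │ false │
│ 5   │ DST    │ 33    │ (1, 33)   │ true  │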
Appears that the append order is out of whack. But not really sure why.
Can you send me the data privately if you cannot share it openly? (As I said, I have checked your code on random data and it works.) Also, please confirm whether the two methods I suggested work correctly on your data.
If I check for “sortedness” just on DK_points, it seems sorting indeed didn’t work correctly.
sort!(combo2, [:Pos, :DK_points], rev=[true, true])
by(combo2, :Pos, y->DataFrame(len=nrow(y), isok=issorted(y[:,:DK_points])))
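│ Row │ Pos    │ len   │ isok  │
│     │ String │ Int64 │ Bool  │
├─────┼────────┼───────┼───────┤
│ 1   │ QB     │ 39    │ false │
│ 2   │ RB     │ 53    │ false │
│ 3   │ WR     │ 56    │ false │
│ 4   │ TE     │ 52    │ false │
│ 5   │ DST    │ 33    │ false │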
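A possible explanation for every group reporting false: issorted checks for ascending order by default, so a column sorted with rev=true fails the check even when the sort worked. A small sketch with made-up scores (the vector desc is hypothetical, not taken from combo2):

```julia
# Hypothetical scores, already in descending order.
desc = [25.1, 18.5, 11.0]

ascending_ok  = issorted(desc)            # false: the default test is for ascending order
descending_ok = issorted(desc, rev=true)  # true: tests for descending order instead
```

Under that assumption, passing rev=true to issorted in the check above would be the way to confirm whether DK_points is in descending order within each group.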
With both of the methods you provided, if I insert the rank column back into the data frame using
m = by(combo2, :Pos, y -> (DK_rank=sortperm(y, :DK_points, rev=true)))[:, :x1]
insertcols!(combo2, 6, DK_rank=m)
I get the same result as before.
Please make sure that your combo2 data frame is sorted. In your post, the :Pos column does not seem to be sorted in reverse order. The order should be:
5×2 DataFrame
│ Row │ Pos    │ minRank │
│     │ String │ Int64   │
├─────┼────────┼─────────┤
│ 1   │ WR     │ 1       │
│ 2   │ TE     │ 1       │
│ 3   │ RB     │ 1       │
│ 4   │ QB     │ 1       │
│ 5   │ DST    │ 1       │