I’m looking to add a column to my DataFrame combo2 that holds the rank of DK_points within each Pos (position) group, using the following code:
sort!(combo2, [:Pos, :DK_points], rev=[true, true])
dk_rank = []
for pos in groupby(combo2, :Pos)
append!(dk_rank, 1:size(pos, 1))
end
insertcols!(combo2, 6, DK_rank=dk_rank)
However, one particular value for Pos, “QB”, gives unusual results with the ranking starting at 14 rather than the desired 1.
by(combo2, :Pos, y->DataFrame(minRank = minimum(y[:,:DK_rank])))
│ Row │ Pos    │ minRank │
│     │ String │ Int64   │
├─────┼────────┼─────────┤
│ 1   │ QB     │ 14      │
│ 2   │ RB     │ 1       │
│ 3   │ WR     │ 1       │
│ 4   │ TE     │ 1       │
│ 5   │ DST    │ 1       │
I’m very confused about why this would happen.
I’m also wondering whether there is a better way to add rankings for multiple columns in a data set, computed within each group of another column.
First, here is an easier way to do it:
by(combo2, :Pos, y -> (DK_rank=axes(y, 1),))
if your data frame is already sorted, or
by(combo2, :Pos, y -> (DK_rank=sortperm(y, :DK_points, rev=true),))
if it is not sorted yet.
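One caveat on the second form: sortperm returns the permutation that would sort the rows, which is not the same thing as each row's rank. The two coincide only when the data is already sorted. A minimal base-Julia sketch of the difference, using a made-up points vector rather than the real combo2 data:

```julia
# Made-up scores for three rows (not taken from combo2).
points = [12.0, 30.5, 25.0]

# Permutation that sorts descending: row 2 comes first, then row 3, then row 1.
perm = sortperm(points, rev=true)

# Inverting the permutation gives each row's rank:
# row 1 is 3rd, row 2 is 1st, row 3 is 2nd.
ranks = invperm(perm)
```

If ranks are wanted for unsorted data, wrapping the sortperm call in invperm is one way to get them.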
Now the problem you encounter is most likely related to sorting (but I am not sure). Can you please give the result of the following operation (or share your source data frame):
by(combo2, :Pos, y->DataFrame(len=nrow(y), rankrange=extrema(y[:,:DK_rank]), isok=issorted(y[:,:DK_rank])))
Additionally - I have just checked your code on some random data and I could not reproduce the problem.
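│ Row │ Pos    │ len   │ rankrange │ isok  │
│     │ String │ Int64 │ Tuple…    │ Bool  │
├─────┼────────┼───────┼───────────┼───────┤
│ 1   │ QB     │ 39    │ (14, 52)  │ true  │
│ 2   │ RB     │ 53    │ (1, 56)   │ false │
│ 3   │ WR     │ 56    │ (1, 39)   │ false │
│ 4   │ TE     │ 52    │ (1, 53)   │ false │
│ 5   │ DST    │ 33    │ (1, 33)   │ true  │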
It appears that the append order is out of whack, but I’m not really sure why.
bkamins, August 19, 2019, 10:00am (#5)
Can you send me the data privately if you cannot share it openly? (As I have said, I have checked your code on random data and it is OK.) Also, please confirm whether the two methods I suggested work correctly on your data.
If I check for “sortedness” just on DK_points, it seems the sorting indeed didn’t work correctly.
sort!(combo2, [:Pos, :DK_points], rev=[true, true])
by(combo2, :Pos, y->DataFrame(len=nrow(y), isok=issorted(y[:,:DK_points])))
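│ Row │ Pos    │ len   │ isok  │
│     │ String │ Int64 │ Bool  │
├─────┼────────┼───────┼───────┤
│ 1   │ QB     │ 39    │ false │
│ 2   │ RB     │ 53    │ false │
│ 3   │ WR     │ 56    │ false │
│ 4   │ TE     │ 52    │ false │
│ 5   │ DST    │ 33    │ false │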
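When reading a check like this, it helps to remember that issorted tests for ascending order by default. A column that was correctly sorted with rev=true will still report false unless rev=true is also passed to issorted. A minimal base-Julia sketch with made-up scores:

```julia
# Made-up scores, already in descending order (not taken from combo2).
scores = [30.5, 25.0, 18.5, 12.0]

# The default check is for ascending order, so this is false.
ascending_ok = issorted(scores)

# Passing rev=true checks for descending order, matching a sort done with rev=true.
descending_ok = issorted(scores, rev=true)
```

So the false values in the isok column do not by themselves show that the sort failed.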
With both of the methods you provided, if I insert the rank column back into the data frame using
m = by(combo2, :Pos, y -> (DK_rank=sortperm(y, :DK_points, rev=true)))[:, :x1]
insertcols!(combo2, 6, DK_rank=m)
I get the same result as before.
Please make sure that your combo2 data frame is sorted. In your post, the :Pos column does not seem to be sorted in reverse order. The order should be like:
5×2 DataFrame
│ Row │ Pos    │ minRank │
│     │ String │ Int64   │
├─────┼────────┼─────────┤
│ 1   │ WR     │ 1       │
│ 2   │ TE     │ 1       │
│ 3   │ RB     │ 1       │
│ 4   │ QB     │ 1       │
│ 5   │ DST    │ 1       │
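If relying on the row order of combo2 proves fragile, one alternative is to compute each group's ranks and write them back by row index, so the result depends neither on how the data is sorted nor on the order in which groups are visited. The sketch below uses plain vectors in base Julia; rank_within_groups and the sample data are hypothetical and not part of DataFrames.

```julia
# Hypothetical stand-in data; the real combo2 columns are not shown in the thread.
pos    = ["QB", "WR", "QB", "WR", "QB"]
points = [12.0, 30.5, 25.0, 8.0, 18.5]

# Rank values within each group (1 = highest value), writing results back
# by row index so the original row order is left untouched.
function rank_within_groups(groups, values)
    ranks = zeros(Int, length(values))
    for g in unique(groups)
        idx = findall(==(g), groups)             # rows belonging to this group
        order = sortperm(values[idx], rev=true)  # positions that sort the group descending
        ranks[idx[order]] = 1:length(idx)        # best row gets 1, next gets 2, ...
    end
    return ranks
end

dk_rank = rank_within_groups(pos, points)
```

Because sortperm is stable, tied values receive consecutive ranks in their original row order. The resulting vector has one entry per row, so it could be attached to the data frame as a new column without sorting first.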