Convert DataFrame column containing strings "A+", "A", "A-", ..., "D+", "D", "D-" to integers 12, 11, 10, ..., 3, 2, 1

I have a DataFrame where one of the columns is Grades. It contains strings for scores on the scale A+ to D-. How do I convert these to integers on a 12-point scale (“A+” = 12, …, “D-” = 1)?

Could you create a Dict{String, Int8} for the mapping?
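
A minimal sketch of that suggestion, using only Base Julia. The names grade_to_int and grades are made up for illustration and are not from the original post; the vector stands in for the Grades column.

```julia
# Explicit mapping from letter grade to the 12-point integer scale.
# `grade_to_int` is a hypothetical name chosen for this sketch.
grade_to_int = Dict{String, Int8}(
    "A+" => 12, "A" => 11, "A-" => 10,
    "B+" => 9,  "B" => 8,  "B-" => 7,
    "C+" => 6,  "C" => 5,  "C-" => 4,
    "D+" => 3,  "D" => 2,  "D-" => 1,
)

# Example input standing in for the Grades column.
grades = ["A+", "B", "D-"]

# Look up each grade; a string that is not a key would throw a KeyError.
grade_nums = [grade_to_int[g] for g in grades]
```

Writing the pairs out by hand is longer than generating them, but it makes the scale easy to check at a glance.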

julia> using DataFramesMeta; 

julia> grades_df = DataFrame(grade = ["A+", "B", "B-"]);

julia> grades_mapping = Dict("A+" => 12, "A" => 11, "A-" => 10, "B+" => 9, "B" => 8, "B-" => 7);

julia> @rtransform grades_df :grade_num = grades_mapping[:grade]
3×2 DataFrame
 Row │ grade   grade_num
     │ String  Int64
─────┼───────────────────
   1 │ A+             12
   2 │ B               8
   3 │ B-              7
julia> Dict(g => n for (n, g) in enumerate(hcat("DCBA"...) .* ["-", "", "+"]))
Dict{String, Int64} with 12 entries:
  "A"  => 11
  "C"  => 5
  "C-" => 4
  "D"  => 2
  "D+" => 3
  "B-" => 7
  "D-" => 1
  "B"  => 8
  "C+" => 6
  "B+" => 9
  "A+" => 12
  "A-" => 10

I get an error “ERROR: LoadError: UndefVarError: @rtransform not defined”

Make sure you are using DataFramesMeta.

I am indeed, but I still get the error. Could it be that I am using an old version of Julia? v1.6.28

1.6.28 is not a Julia version (latest patch release in the 1.6 series is 1.6.7).

The relevant question is what version of DataFramesMeta you’re using.

You can also work without DataFramesMeta and just do

grades_df.grade_num = [grades_mapping[x] for x in grades_df.grade]
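
Note that indexing with grades_mapping[x] throws a KeyError if the column contains a string that is not in the dictionary. A sketch of one way around that, assuming you would rather get missing for unknown grades (that choice and the sample data, including the "F", are assumptions for illustration):

```julia
# Partial mapping, as in the earlier reply.
grades_mapping = Dict("A+" => 12, "A" => 11, "A-" => 10,
                      "B+" => 9,  "B" => 8,  "B-" => 7)

# Hypothetical input containing a grade that is not in the mapping.
grades = ["A+", "B", "F"]

# `get` returns the third argument when the key is absent,
# so unknown grades become `missing` instead of raising an error.
grade_nums = [get(grades_mapping, g, missing) for g in grades]
```

The same comprehension works on a DataFrame column in place of the plain vector.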

Please run ] up DataFramesMeta to get the latest version.
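
If you prefer the function form over the Pkg REPL mode, a rough equivalent is sketched below. It assumes DataFramesMeta is already installed in the active environment; the status call prints the installed version, which answers the version question raised above.

```julia
using Pkg

# Same effect as `] up DataFramesMeta` in the Pkg REPL.
Pkg.update("DataFramesMeta")

# Print the installed version of the package in the active environment.
Pkg.status("DataFramesMeta")
```

You will need to restart Julia (or start a fresh session) before the updated package is loaded.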


Another variation to build the dictionary and broadcast it:

using DataFrames
df = DataFrame(grade = ["A+", "B", "B-", "C", "D-"])
grades_dic = Dict(vec(['A' 'B' 'C' 'D'] .* ["+", "", "-"]) .=> 12:-1:1)
df.grade_num = getindex.((grades_dic,), df.grade)
df