Hello all,
I’m a bit new to the Julia programming language and haven’t been able to find an answer that solves my problem. I have the following dataframe:
10×5 DataFrame
 Row │ DATE        TOPIC_I    TOPIC_J   JOINT_PROB       DOC_COUNT
     │ Date        String15   String15  Any              Any
─────┼────────────────────────────────────────────────────────────
   1 │ 2000-09-01  TOPIC_153  TOPIC_87  0.03806723138    979
   2 │ 2000-09-01  TOPIC_81   TOPIC_87  0.01825187194    979
   3 │ 2000-09-01  TOPIC_249  TOPIC_87  0.01616933848    979
   4 │ 2000-09-01  TOPIC_124  TOPIC_87  0.006607188145   979
   5 │ 2000-09-01  TOPIC_140  TOPIC_87  0.0008916937195  979
   6 │ 2000-09-01  TOPIC_101  TOPIC_87  0.001341542903   979
   7 │ 2000-09-01  TOPIC_89   TOPIC_87  0.07842244991    979
   8 │ 2000-09-01  TOPIC_233  TOPIC_87  0.01956784903    979
   9 │ 2000-09-01  TOPIC_144  TOPIC_87  0.01501348474    979
  10 │ 2000-09-01  TOPIC_201  TOPIC_87  0.007407990334   979
I am trying to group by the DATE and TOPIC_I columns, sum the JOINT_PROB values within each group, and take the average of the DOC_COUNT values within each group. I have implemented the code below:
# Convert the joint probabilities column and document count column to the correct types.
stuff = [typeof(x) for x in probabilities_data[!, :JOINT_PROB]]
println(unique(stuff))
probabilities_data[!, :JOINT_PROB] = [typeof(x) == String ? tryparse(Float64,x) : x for x in probabilities_data[!, :JOINT_PROB]]
stuff = [typeof(x) for x in probabilities_data[!, :JOINT_PROB]]
println(unique(stuff))
p_i_group = groupby(probabilities_data, [:DATE, :TOPIC_I])
pi_df = combine(p_i_group, :TOPIC_J => sum => :PROB_I)
I continue to get the following error, which comes from the last line of the code above:
TaskFailedException:
MethodError: no method matching +(::String15, ::String15)
As far as I can tell my syntax is correct. Can someone help me find what I am missing?
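For reference, here is a minimal sketch of the aggregation described above, assuming the intent is to sum JOINT_PROB and average DOC_COUNT. One thing that may be worth checking: the `combine` call in the post passes `:TOPIC_J`, a String15 column, to `sum`, which would be consistent with the `+(::String15, ::String15)` error. The sample rows, the `to_float` helper, and the output column name `MEAN_DOC_COUNT` are made up for illustration and are not part of the original code.

```julia
using DataFrames
using Dates
using Statistics: mean

# Hypothetical stand-in for probabilities_data (illustrative values only).
probabilities_data = DataFrame(
    DATE = fill(Date(2000, 9, 1), 3),
    TOPIC_I = ["TOPIC_153", "TOPIC_81", "TOPIC_153"],
    TOPIC_J = ["TOPIC_87", "TOPIC_87", "TOPIC_89"],
    JOINT_PROB = Any["0.25", 0.5, 0.125],   # mixed strings and numbers, eltype Any
    DOC_COUNT = Any[979, 979, 979],
)

# Hypothetical helper: parse strings, convert other numbers to Float64.
to_float(x) = x isa AbstractString ? parse(Float64, x) : Float64(x)

# Replace the Any-typed columns with concretely typed Float64 columns.
probabilities_data.JOINT_PROB = to_float.(probabilities_data.JOINT_PROB)
probabilities_data.DOC_COUNT = to_float.(probabilities_data.DOC_COUNT)

# Group by date and topic, then aggregate the numeric columns
# (note :JOINT_PROB, not :TOPIC_J, is passed to sum).
p_i_group = groupby(probabilities_data, [:DATE, :TOPIC_I])
pi_df = combine(
    p_i_group,
    :JOINT_PROB => sum => :PROB_I,
    :DOC_COUNT => mean => :MEAN_DOC_COUNT,
)
println(pi_df)
```

Using `parse` instead of `tryparse` makes a malformed string raise an error immediately rather than leaving a `nothing` in the column, which would otherwise fail later inside `sum`.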