How to group by multiple data types?

Your JOINT_PROB and DOC_COUNT columns both have eltype Any, which signals a risk that they contain something other than numbers. If they contain only numbers, the following will work:

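using DataFrames, Statistics  # Statistics provides mean
# Group by date and topic, then sum the probabilities and average the counts per group
p_i_group = groupby(probabilities_data, [:DATE, :TOPIC_I])
pi_df = combine(p_i_group, :JOINT_PROB => sum => :PROB_I, :DOC_COUNT => mean => :MEAN_COUNT)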

The easiest way to check whether a column contains only numbers is to run e.g. float.(probabilities_data.JOINT_PROB). If this throws an error, you have some bad data in that column.
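
If the conversion does fail, it can help to see which entries are the problem. Here is a minimal sketch using only Base Julia; the vector joint_prob and its values are made up for illustration and stand in for probabilities_data.JOINT_PROB:

```julia
# Hypothetical column with eltype Any, standing in for probabilities_data.JOINT_PROB
joint_prob = Any[0.1, 0.25, "0.3", 2, "n/a"]

# Indices of the entries that are not numbers
bad_rows = findall(x -> !(x isa Number), joint_prob)

# Print each offending value together with its type
for i in bad_rows
    println("row ", i, ": ", repr(joint_prob[i]), " (", typeof(joint_prob[i]), ")")
end

# Once the bad entries are left out, float.() succeeds and narrows the eltype
good_rows = setdiff(eachindex(joint_prob), bad_rows)
clean = float.(joint_prob[good_rows])
```

Note that this check also flags missing values, since missing is not a Number, so you may want to handle those separately.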
