Help improving the speed of a DataFrames operation

The code lives inside a function, say main, which I have not included here.
The df_flows variable is defined inside that function.
Rather than wrapping each step in its own function, I access the columns directly, e.g. df_flows.rp. Since df_flows is a local variable, I think a separate function is not necessary here, right? I was just trying to keep the example readable.

About the use case: it is a left join followed by a group-by. I will try to explain what I want below:

Explanation in join terms:

  • leftjoin(df_cons, df_flows, on = [:rp, :asset => :to])
  • Compute the intersection of the time_block of left and right
  • Multiply the resulting value by the flow column
  • Group by df_cons’ index and sum flow
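
The join steps above can be sketched roughly as follows. This is only an illustration under assumptions: the toy data, the index column, the renamed flow_block column, and the weight/total names are all made up, flow is taken to be a plain Float64 column, and "the resulting value" is read as the length of the time-block intersection.

```julia
using DataFrames

# Hypothetical toy data; the column names follow the post, the values are invented.
df_cons = DataFrame(
    index = 1:4,
    rp = [1, 1, 2, 2],
    asset = ["a", "b", "a", "b"],
    time_block = [1:3, 1:6, 1:4, 1:4],
)
df_flows = DataFrame(
    rp = [1, 1, 1, 2],
    to = ["a", "a", "b", "a"],
    time_block = [1:2, 3:4, 1:6, 3:6],
    flow = [10.0, 20.0, 5.0, 1.0],
)

# Rename the right-hand time_block so it does not clash with the left one.
flows = rename(df_flows, :time_block => :flow_block)

# Left join on rp and asset == to; unmatched df_cons rows get missing values.
joined = leftjoin(df_cons, flows; on = [:rp => :rp, :asset => :to])

# Assumption: the weight is the length of the intersection times the flow.
joined.weight = [
    ismissing(fb) ? 0.0 : length(intersect(tb, fb)) * f
    for (tb, fb, f) in zip(joined.time_block, joined.flow_block, joined.flow)
]

# Sum per df_cons row; sort because the join does not promise the left row order.
result = sort!(combine(groupby(joined, :index), :weight => sum => :total), :index)
# result.total == [40.0, 30.0, 2.0, 0.0]
```

The join materialises one row per matching (constraint, flow) pair, so the intermediate table can get large when many flows share the same rp and to.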

Per row explanation:

  • For each row of df_cons
  • Select/filter df_flows by matching rp = row.rp and to = row.asset
  • Compute the intersection of the time blocks
  • Multiply the resulting value by the flow column
  • Sum flow and return
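
The per-row steps above could also be written without a join, along these lines. Again a sketch under assumptions: the toy data and the helper names build_lookup and row_totals are hypothetical, flow is assumed to be Float64, and the intersection is reduced to its length. The filter step is replaced by a dictionary built once from (rp, to) to row numbers of df_flows, and the column vectors are passed into functions so that the loops run on concretely typed arrays.

```julia
using DataFrames

# Hypothetical toy data; the column names follow the post, the values are invented.
df_cons = DataFrame(
    index = 1:4,
    rp = [1, 1, 2, 2],
    asset = ["a", "b", "a", "b"],
    time_block = [1:3, 1:6, 1:4, 1:4],
)
df_flows = DataFrame(
    rp = [1, 1, 1, 2],
    to = ["a", "a", "b", "a"],
    time_block = [1:2, 3:4, 1:6, 3:6],
    flow = [10.0, 20.0, 5.0, 1.0],
)

# Map each (rp, to) pair to the row numbers of df_flows that carry it.
function build_lookup(rp, to)
    lookup = Dict{Tuple{eltype(rp),eltype(to)},Vector{Int}}()
    for i in eachindex(rp, to)
        push!(get!(() -> Int[], lookup, (rp[i], to[i])), i)
    end
    return lookup
end

# For each df_cons row, sum length(intersection) * flow over the matching flows.
function row_totals(cons_rp, cons_asset, cons_block, lookup, flow_block, flow)
    totals = zeros(Float64, length(cons_rp))  # assumes flow is Float64
    for i in eachindex(cons_rp)
        rows = get(lookup, (cons_rp[i], cons_asset[i]), nothing)
        rows === nothing && continue  # no flow enters this asset in this rp
        for j in rows
            totals[i] += length(intersect(cons_block[i], flow_block[j])) * flow[j]
        end
    end
    return totals
end

lookup = build_lookup(df_flows.rp, df_flows.to)
totals = row_totals(
    df_cons.rp, df_cons.asset, df_cons.time_block,
    lookup, df_flows.time_block, df_flows.flow,
)
# totals == [40.0, 30.0, 2.0, 0.0]
```

If flow holds JuMP variables rather than numbers, the accumulator would need to be an expression type instead of Float64; that case is not covered by this sketch.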

My current solution avoids DataFrames entirely and uses dictionaries to store the indices of the non-zero flows. It is slow, but around 10x faster than this version. The full context is in the thread “Speeding up JuMP model creation with sets that depend on other indexes - #6 by slwu89”.