Using a DataFrame to calculate another column in a separate DataFrame

Thanks for the reply. Sorry, I think I explained my issue poorly. I need to call num_invoices as the function inside transform!, so the two hours comes from roughly 0.007 seconds per call times the 1.25 million rows in the DataFrame that transform! updates (about 8,750 seconds, or a little over two hours). I agree that your implementation is cleaner and more concise, but I am struggling to find a way to speed this up beyond that…