Just to pile on to my previous comment: making DataFrame operations nearly as performant as operations on Vectors or Arrays would, I think, be a dream outcome. For example, I am working on an optimization problem that involves simulating panels of data. There is a nontrivial cost to doing these calculations when the simulated data is in a DataFrame, versus the approach where I keep everything in arrays like I’m a MATLAB-loser. Of course it is much nicer to write the DataFrames-style code, which is a huge plus, but when the performance gap is hit millions of times it really starts to add up.

It is all still faster than MATLAB anyway, so why am I even complaining?