I’ve often had not-so-large DataFrames loaded, and varinfo() performed fine. But once a DataFrame exceeds a certain size, say beyond 100 MB, varinfo() becomes very slow. (I didn’t check the exact threshold, but the one loaded now is about 165 MB.)
I’m wondering, without getting too technical: how does varinfo() handle this? All one really wants is the size of each variable, so should it take that much longer just to enumerate the variable list?
If I’m an ignorant fool in this matter, please explain.
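For reference, the situation is roughly like this (the data below is just a stand-in for my real DataFrame, and the timings will obviously differ per machine):

```julia
using DataFrames, InteractiveUtils

# Illustrative stand-in: a couple of million rows with a string column
df = DataFrame(id = 1:2_000_000,
               label = ["item_$i" for i in 1:2_000_000])

@time varinfo()   # noticeably slow once a DataFrame like this is loaded
```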
Are you sure it is not a precompilation issue? varinfo most probably uses something very similar to summarysize, which is a non-trivial function (it has to traverse the whole object). You can check the code here.
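A quick sketch to tell the two explanations apart (the DataFrame below is just a stand-in for yours, and the summarysize link is only relevant if varinfo really does work that way):

```julia
using DataFrames, InteractiveUtils

# Stand-in for the large DataFrame from the question
df = DataFrame(x = ["row_$i" for i in 1:2_000_000])

# Separate compilation latency from the real cost: the first call pays
# for compiling varinfo and what it calls, the second call does not.
@time varinfo();
@time varinfo();   # still slow on the second run? then it is not compilation

# If it is a summarysize-style traversal, timing it directly on the big
# variable should account for most of that time.
@time Base.summarysize(df)
```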
Then it’s calculating the variable size that is slow. A DataFrame can be a complex structure, so if you have a lot of nested, variably sized values, e.g. vectors of strings, it can be a pain. I’m speculating here …
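For example, something along these lines should show the difference (numbers are illustrative, not measured): a column of plain numbers is one contiguous block, while a string column is a separate heap object per row that the size calculation has to visit one by one.

```julia
using DataFrames

n = 1_000_000
flat   = DataFrame(x = rand(n))                  # one contiguous block of Float64
nested = DataFrame(x = ["row_$i" for i in 1:n])  # a separate String object per row

# summarysize must walk every String individually, so the second
# measurement is expected to be much slower than the first.
@time Base.summarysize(flat)
@time Base.summarysize(nested)
```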