Benchmarking / profiling cache use

Thank you! Google completely failed me there.

This tool is absolutely fantastic; just kindly asking the kernel is obviously much better than a Rube Goldberg construction that passes compiled Julia code into command-line tools.

At some point there should be a sticky post / wiki describing these tricks. Carnaval’s IACA.jl also looks extremely interesting.