Sure. I think there were a couple things that helped here.
- We used “heavy” packages in MCMCChains, such as DataFrames. DataFrames was used to report parameter statistics and to do some intermediate numerical calculations, but it has a really high spin-up time, and we were paying that cost at a point where it wasn't helping. Since it was mostly being used for display purposes, we removed the DataFrames backend and just wrote our own print/display functionality. You can still use some DataFrames functionality, but it’s lazily loaded with Requires.
- One of the fixes was a weird one, which I think was mostly due to type inference problems. You can see the discussion here, but the fix amounted to basically re-configuring how MCMCChains handled chain concatenation.
- I had extended some functions from Base in a way that caused a bunch of precompilation difficulties. One change in particular, which cut the MCMCChains load time from ~20 seconds to ~6 seconds, was to change
```julia
Base.convert(::Type{T}, cs::Array{ChainDataFrame,1}) where T<:Array
```
to
```julia
Base.convert(::Type{Array}, cs::Array{C,1}) where C<:ChainDataFrame
```
This one was very hard to find; I just commented out files one at a time until I saw big load-time improvements, and then I started commenting out specific functions in the files responsible for the slowdown. Impractical for very large projects, but I think MCMCChains is small enough that I can do this sort of thing.
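To make the difference between the two `convert` signatures a bit more concrete, here is a minimal sketch. It does not touch `Base.convert` or the real `ChainDataFrame`; `ToyFrame`, `broad`, and `narrow` are hypothetical stand-ins that only mirror the shape of the two signatures. My reading (an assumption, not something I have profiled) is that the original method applied to every target type `T<:Array`, while the new one applies only when the target is exactly `Array`, which gives the compiler far fewer cases to reason about.

```julia
# Hypothetical stand-in for MCMCChains' ChainDataFrame (not the real type).
struct ToyFrame
    name::Symbol
end

# Hypothetical functions mirroring the two signatures.
# Using our own functions avoids extending Base.convert in this sketch.

# Old shape: matches ANY target type that is a subtype of Array.
broad(::Type{T}, cs::Array{ToyFrame,1}) where {T<:Array} = :broad

# New shape: matches only the target type `Array` itself.
narrow(::Type{Array}, cs::Array{C,1}) where {C<:ToyFrame} = :narrow

cs = [ToyFrame(:a), ToyFrame(:b)]

# The broad method is applicable for many different target types...
broad_hits = [applicable(broad, T, cs) for T in (Array, Vector{Int}, Matrix{Float64})]

# ...while the narrow method is applicable for exactly one of them.
narrow_hits = [applicable(narrow, T, cs) for T in (Array, Vector{Int}, Matrix{Float64})]

println("broad:  ", broad_hits)
println("narrow: ", narrow_hits)
```

Running `applicable` over a few candidate types like this is also a cheap way to check how wide a method you have added to a Base function really is, before resorting to commenting out files.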
@devmotion do you have any other comments that might be helpful?