I’m looking for some background statistics on Julia and its ecosystem for a talk I’m preparing. In particular, I’d be interested in how the number of downloads has evolved and how much code is shared across the ecosystem (I want to demonstrate The Unreasonable Effectiveness of Multiple Dispatch).
Do we have a repository where such information is gathered?
I appreciate all pointers to useful charts, plots, etc.!
Download statistics are generally a pretty bad measure of code use (e.g., 95% might just come from CI / cron jobs). Currently there isn’t a good place to track that, but it might change in the future (e.g., GitHub registry statistics, which are in beta, Pkg handling, etc.). One measure you could look at is reverse dependencies. You could get those with DependenciesParser.jl, and maybe use LightGraphs.jl for visualizations. The core ecosystem probably revolves around StatsBase.jl and StatsModels.jl, with GLM.jl providing the most common vanilla regression analysis functionality. JuliaStats has a few other packages. The machine learning ecosystem is a bit different, if you are interested in that.
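To make "reverse dependencies" concrete, here is a minimal, language-agnostic sketch written in Python. It does not use DependenciesParser.jl or any registry API; the `forward_deps` map and the `MyApp` package are made up for illustration. The idea is just to invert a "package depends on" map into a "package is depended on by" map and rank packages by how many direct dependents they have.

```python
from collections import defaultdict

# Hypothetical forward-dependency map: package -> packages it depends on.
# The entries are illustrative only, not real registry data.
forward_deps = {
    "GLM": ["StatsBase", "StatsModels"],
    "StatsModels": ["StatsBase"],
    "StatsBase": [],
    "MyApp": ["GLM", "StatsBase"],
}


def reverse_dependencies(forward):
    """Invert a package -> dependencies map into package -> dependents."""
    reverse = defaultdict(set)
    for pkg, deps in forward.items():
        reverse[pkg]  # touch the key so packages without dependents still appear
        for dep in deps:
            reverse[dep].add(pkg)
    return {pkg: sorted(users) for pkg, users in reverse.items()}


reverse_deps = reverse_dependencies(forward_deps)

# Rank by number of direct dependents (ties broken alphabetically).
ranking = sorted(reverse_deps, key=lambda p: (-len(reverse_deps[p]), p))

for pkg in ranking:
    print(pkg, len(reverse_deps[pkg]), reverse_deps[pkg])
```

With real data, the forward map would come from the package registry rather than a hand-written dictionary, and only direct (not transitive) dependents are counted here.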
Following @Nosferican’s suggestion I tried to extract some ecosystem statistics/graphs using DependenciesParser.jl. You can find the code here. I obtained the following plots:
Forward dependencies (the number of packages a given package depends on)
The packages in the inner circle are “core” in that they define essential data types or functions, while the packages in the outer shell are big “application packages”. In other words, this suggests good reuse of existing implementations of data types.
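One crude way to express the core vs. application distinction numerically is to compare, for each package, how many packages it depends on with how many packages depend on it. The sketch below is in Python with entirely made-up package names, and the labelling rule (more dependents than dependencies means "core") is my own assumption for illustration, not the criterion used to draw the plots.

```python
# Hypothetical forward-dependency map (package -> direct dependencies).
# All package names are invented for this example.
forward_deps = {
    "DataTypesCore": [],
    "PlotRecipes": ["DataTypesCore"],
    "ModelFit": ["DataTypesCore"],
    "BigApp": ["DataTypesCore", "PlotRecipes", "ModelFit"],
}


def degree_table(forward):
    """Return package -> (number of dependencies, number of dependents)."""
    n_forward = {pkg: len(deps) for pkg, deps in forward.items()}
    n_reverse = {pkg: 0 for pkg in forward}
    for deps in forward.values():
        for dep in deps:
            n_reverse[dep] = n_reverse.get(dep, 0) + 1
    return {pkg: (n_forward.get(pkg, 0), n_reverse[pkg]) for pkg in n_reverse}


def label(n_forward, n_reverse):
    """Assumed rule of thumb: compare dependents against dependencies."""
    if n_reverse > n_forward:
        return "core"
    if n_forward > n_reverse:
        return "application"
    return "intermediate"


degrees = degree_table(forward_deps)
labels = {pkg: label(*counts) for pkg, counts in degrees.items()}

for pkg, (n_fwd, n_rev) in degrees.items():
    print(f"{pkg}: depends on {n_fwd}, used by {n_rev} -> {labels[pkg]}")
```

A package with many dependents and few dependencies would sit near the centre of such a plot, and one with many dependencies and few dependents near the edge.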