I’m looking for some background statistics on Julia and its ecosystem for a talk I’m preparing. In particular, I’d be interested in how the number of downloads has evolved and in the amount of code sharing in the ecosystem (I want to demonstrate The Unreasonable Effectiveness of Multiple Dispatch).
Do we have a repository where such information is gathered?
I appreciate all pointers to useful charts, plots, etc.!
Download statistics are generally a pretty bad measure of code use (e.g., 95% might just come from CI / cron jobs). Currently there isn’t a good place to track that, but it might change in the future (e.g., GitHub registry statistics, which are in beta, Pkg handling, etc.). One measure you could look at is reverse dependencies, i.e., how many packages depend on a given package. You could get those with DependenciesParser.jl, and maybe use LightGraphs.jl for visualizations. The core statistics ecosystem probably revolves around StatsBase.jl and StatsModels.jl, with GLM.jl providing the most common vanilla regression analysis functionality. JuliaStats has a few other packages. The machine learning ecosystem is a bit different, if you are interested in that.
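To make the reverse-dependency idea concrete, here is a minimal sketch in Python, used only to keep the example self-contained. It does not use DependenciesParser.jl or its API. The package names and the `forward_deps` map are made-up toy data, and `reverse_dependencies` is a hypothetical helper name.

```python
from collections import defaultdict

# Toy forward-dependency map (hypothetical packages, not real registry data):
# each package maps to the packages it depends on directly.
forward_deps = {
    "CoreTypes": [],
    "ModelsBase": ["CoreTypes"],
    "Regression": ["CoreTypes", "ModelsBase"],
    "MyApp": ["Regression"],
}

def reverse_dependencies(forward):
    """Invert a forward-dependency map: package -> set of direct dependents."""
    reverse = defaultdict(set)
    for pkg, deps in forward.items():
        reverse[pkg]  # touch the key so packages without dependents still appear
        for dep in deps:
            reverse[dep].add(pkg)
    return dict(reverse)

reverse_deps = reverse_dependencies(forward_deps)

# List packages from most to least depended upon (ties broken by name).
for pkg in sorted(reverse_deps, key=lambda p: (-len(reverse_deps[p]), p)):
    print(pkg, len(reverse_deps[pkg]))
```

This only counts direct dependents; counting transitive dependents would require walking the inverted map as a graph.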
I’d be very interested in seeing what you come up with, if you can share.
@viralbshah might have some numbers like that.
Viral already kindly sent me some data on Slack. Once I have given @Nosferican’s suggestion a try, I’ll report back on what I gathered.
Some statistics can be found in Julia Computing’s regular January newsletter:
There, one also finds the following graph, showing the evolution of GitHub stars over time:
Following @Nosferican’s suggestion I tried to extract some ecosystem statistics/graphs using DependenciesParser.jl. You can find the code here. I obtained the following plots:
Forward dependencies (the number of packages a given package depends on)
Reverse dependencies (the number of packages that depend on a given package)
Reverse dependencies graph
The packages in the inner circle are “core” in that they define essential data types or functions, while the packages in the outer shell are big “application packages”. If you will, this indicates good reuse of existing implementations of data types.
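The core-versus-application distinction can also be approximated numerically by comparing each package's forward and reverse dependency counts. The following is a rough sketch in Python under stated assumptions: the package names and data are hypothetical, `degree_table` and `classify` are illustrative helpers, and the labelling rule is a crude heuristic invented for this example. It is not how the plots above were produced.

```python
# Toy forward-dependency map (hypothetical packages, not real registry data).
forward_deps = {
    "CoreTypes": [],
    "CoreMath": [],
    "Models": ["CoreTypes", "CoreMath"],
    "Plots2D": ["CoreTypes"],
    "BigApp": ["CoreTypes", "CoreMath", "Models", "Plots2D"],
}

def degree_table(forward):
    """Return {package: (n_forward, n_reverse)} using direct dependencies only."""
    n_reverse = {pkg: 0 for pkg in forward}
    for deps in forward.values():
        for dep in deps:
            n_reverse[dep] = n_reverse.get(dep, 0) + 1
    return {pkg: (len(forward.get(pkg, [])), n_reverse[pkg]) for pkg in n_reverse}

def classify(n_forward, n_reverse):
    """Crude labelling rule (an assumption made for this sketch)."""
    if n_reverse > n_forward:
        return "core"          # more packages depend on it than it depends on
    if n_forward > n_reverse:
        return "application"   # mostly a consumer of other packages
    return "intermediate"

table = degree_table(forward_deps)
labels = {pkg: classify(*counts) for pkg, counts in table.items()}

for pkg, (n_fwd, n_rev) in sorted(table.items()):
    print(f"{pkg:10s} forward={n_fwd} reverse={n_rev} -> {labels[pkg]}")
```

In this toy data, the packages with no dependencies of their own but several dependents come out as "core", while the package that pulls in everything else comes out as "application".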