I was wondering if there is a way to programmatically access JuliaHub packages (not the computation platform). I want to query package names, stars, install stats, etc.
Either an API or a regularly updated exported dataset would do. I want to query the ecosystem, and I thought that would be much cleaner than trying to scrape GitHub somehow.
I couldn’t find anything relevant in the docs for JuliaHub or JuliaHub.jl.
mbauman
December 12, 2024, 9:08pm
With respect to package usage, JuliaHub is a consumer of the community statistics, available here:
After some years of getting the package server architecture in place and working (mostly) reliably, @staticfloat and I finally had some time to work on collecting logs into a data warehouse (we’re using Snowflake) and designing a set of queries over the logs that we can run and publish regularly. The current public aggregated stats are available with the prefix
https://julialang-logs.s3.amazonaws.com/public_outputs/current/
followed by a rollup name and the suffix .csv.gz indicating that all t…
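For anyone who wants to try this from a script, here is a minimal Python sketch of the URL scheme described in the quoted post: prefix, then a rollup name, then `.csv.gz`. The function names are mine, and the rollup name in the usage comment is an assumption; check the linked post for the actual list of rollups and their columns.

```python
import csv
import gzip
import io
from urllib.request import urlopen

# Prefix quoted in the post above.
STATS_PREFIX = "https://julialang-logs.s3.amazonaws.com/public_outputs/current/"


def rollup_url(rollup_name: str) -> str:
    """Build the URL for one rollup: prefix + rollup name + '.csv.gz'."""
    return f"{STATS_PREFIX}{rollup_name}.csv.gz"


def parse_rollup(gz_bytes: bytes) -> list[dict[str, str]]:
    """Decompress a gzipped CSV payload and return its rows as dicts."""
    text = gzip.decompress(gz_bytes).decode("utf-8")
    return list(csv.DictReader(io.StringIO(text)))


def fetch_rollup(rollup_name: str) -> list[dict[str, str]]:
    """Download and parse one rollup (needs network access)."""
    with urlopen(rollup_url(rollup_name)) as response:
        return parse_rollup(response.read())


# Usage (the rollup name here is an assumption, not taken from the post):
# rows = fetch_rollup("package_requests")
```

Keeping the parsing separate from the download makes it easy to test against a small local file before pulling the full dataset. All values come back as strings, so counts need converting before you aggregate them.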
That doesn’t include GitHub stargazer counts, however. I don’t believe we have a public API for our cache of it, but GitHub does:
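A minimal Python sketch of reading a star count from GitHub's REST API, which returns a `stargazers_count` field in the repository object at `https://api.github.com/repos/{owner}/{repo}`. The helper names and the `User-Agent` string are my own choices. Unauthenticated requests are rate limited, so querying many packages would most likely need a token, which this sketch leaves out.

```python
import json
from urllib.request import Request, urlopen


def repo_api_url(owner: str, repo: str) -> str:
    """Build the REST API URL for a single repository."""
    return f"https://api.github.com/repos/{owner}/{repo}"


def stargazers_from_payload(payload: str) -> int:
    """Extract the star count from the JSON body of a repository response."""
    return int(json.loads(payload)["stargazers_count"])


def fetch_stargazers(owner: str, repo: str) -> int:
    """Fetch the current star count for owner/repo (needs network access)."""
    request = Request(
        repo_api_url(owner, repo),
        headers={
            "Accept": "application/vnd.github+json",
            "User-Agent": "stargazer-sketch",  # arbitrary identifier for this sketch
        },
    )
    with urlopen(request) as response:
        return stargazers_from_payload(response.read().decode("utf-8"))


# Usage:
# print(fetch_stargazers("JuliaLang", "julia"))
```

The owner and repository name for each package would have to come from somewhere else, for example the `repo` field recorded for the package in the registry.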
Oh, thanks. I couldn’t find that post with Discourse search.
EDIT: sorry @mbauman, how about things like dependencies, repo URL, etc.?
Does JuliaHub parse these directly from the General registry (JuliaRegistries/General, the official registry of general Julia packages)?
And does it compute other things, like “dependents”, or pull in the README itself?
I assume JuliaHub-packages is not open source?