Hi all,
This is not intended to be a knock on the excellent Distributions.jl package - we’ve used it happily for several years in LightGraphs and it’s working wonderfully.
We currently have an initiative to make LightGraphs lighter weight in terms of its dependencies, though, and we’re running into a bit of an issue. We use Distributions for exactly one thing: a random number generator using binomial distributions for one of our random graph generators. The problem is that this single function adds at least 12 new dependencies to our dependency tree, including some heavy hitters. Again, we’re trying to lighten up LightGraphs right now, and to have this number of dependencies for a single function seems like a bit of an overkill.
So - is there any other way to get a binomial random number generator that would allow us to achieve our weight-loss goal?
3 Likes
The code for that in Distributions is exactly
rand(d::Binomial) = convert(Int, StatsFuns.RFunctions.binomrand(d.n, d.p))
I guess you could simply copy that? StatsFuns itself requires only Rmath and SpecialFunctions.
Right. Rmath requires BinDeps and a few other things, though, and I’d really like to avoid BinDeps if at all possible since it introduces cross-platform fragility and has a lot of overhead. (Again, not knocking it, just trying to put the “Light” back in “LightGraphs”.)
If you want to go really lightweight and you know that your sample sizes will be sufficiently large, you can use Base.randn
, since the binomial distribution converges to a standard normal as n increases (see Central Limit Theorem).
1 Like
We’re using it exactly once per graph, in order to get a single value that represents the number of edges to generate (m = |V|^2
, and I don’t know about p
) so I’m not sure randn
will work – but I will defer to stats experts on this.
Check out https://github.com/JuliaStats/Distributions.jl/blob/master/src/samplers/binomial.jl. There are some samplers written in Julia. We don’t use them by default but I think they work.
1 Like
It’s weird that there isn’t a pure-Julia version of Distributions.jl - writing efficient samplers seems to be something Julia is uniquely suited for.
3 Likes
Yeah, but people want to “get everything done” first.
So, here’s what I wound up doing:
# taken from http://stackoverflow.com/questions/23561551/a-efficient-binomial-random-number-generator-code-in-java
function _getBinomial(n::Integer, p::Real, seed::Integer=-1)
rng = getRNG(seed)
log_q = log(1.0 - p)
x = 0
sum = 0.0
while true
sum += log(rand(rng)) / (n - x)
sum < log_q && break
x += 1
end
return x
end
Passes all tests so far.
You probably need to have an attribution in your license for this. Make sure you add that.
Just wanted to say that the fact that LightGraphs has so many dependencies was a main reason for me not to use it and instead roll my own, so I very much appreciate this direction.
1 Like
@tkoolen - thanks. Take a look at https://github.com/JuliaGraphs/LightGraphs.jl/issues/557#issuecomment-292806096 for the current status. We will be down to 4 direct dependencies, with 8 total. I think that’s as good as we’re going to get.
2 Likes