Why `sprand` does not accept a Distribution arg?

Hi there,
I wonder why this cannot work out (What is the available method currently then?).

julia> import SparseArrays.sprand as sprand

julia> import Distributions.Uniform as UD

julia> sprand(5, 6, .7)
5×6 SparseArrays.SparseMatrixCSC{Float64, Int64} with 23 stored entries:
 0.213111  0.611876  0.592097   ⋅         ⋅         ⋅ 
 0.50981   0.326668  0.792674  0.227358  0.322273  0.616509
 0.9323    0.136838  0.56782   0.843999  0.718162  0.475489
 0.222931  0.582418   ⋅        0.412739   ⋅        0.942747
  ⋅        0.290387   ⋅        0.876276  0.449718  0.408883

julia> rand(UD(-5, 4), 5, 6)
5×6 Matrix{Float64}:
 -3.24126  -0.896462  -2.59361    0.11748   0.917439  -0.742394
  3.61575   3.95655   -4.05713   -2.69737  -0.846111  -3.27312
 -2.39694  -0.194496   0.946839  -3.9091    2.11152   -4.85219
  2.77711   0.618885  -1.53174   -4.30585   2.2522     1.27031
 -1.65799  -1.4638    -4.91182   -1.24282   2.79811    3.06926

julia> sprand(UD(-5, 4), 5, 6, .7)
ERROR: MethodError: no method matching sprand(::Distributions.Uniform{Float64}, ::Int64, ::Int64, ::Float64)
The function `sprand` exists, but no method is defined for this combination of argument types.

Closest candidates are:
  sprand(::Type{T}, ::Integer, ::Integer, ::AbstractFloat) where T
   @ SparseArrays K:\julia-1.11.4\share\julia\stdlib\v1.11\SparseArrays\src\sparsematrix.jl:2048
  sprand(::Random.AbstractRNG, ::Integer, ::Integer, ::AbstractFloat)
   @ SparseArrays K:\julia-1.11.4\share\julia\stdlib\v1.11\SparseArrays\src\sparsematrix.jl:2042
  sprand(::Random.AbstractRNG, ::Integer, ::Integer, ::AbstractFloat, ::Function, ::Type{T}) where T
   @ SparseArrays K:\julia-1.11.4\share\julia\stdlib\v1.11\SparseArrays\src\sparsematrix.jl:2027
  ...

Stacktrace:
 [1] top-level scope
   @ REPL[6]:1

Because Distributions.jl didn’t implement it I guess. And in general you may need two sets of parameters, one for deciding which coefficients are nonzero and one for the distribution of these nonzero coefficients.

I guess you need to initialize a matrix first and then transform only its nonzeros:

julia> using SparseArrays, Distributions, Random

julia> rng = Xoshiro(1)
Xoshiro(0xfff0241072ddab67, 0xc53bc12f4c3f0b4e, 0x56d451780b2dd4ba, 0x50a4aa153d208dd8, 0x3649a58b3b63d5db)

julia> M = sprand(rng, Bool, 10, 10, 0.5)
10×10 SparseMatrixCSC{Bool, Int64} with 47 stored entries:
 1  ⋅  ⋅  1  ⋅  ⋅  ⋅  ⋅  1  1
 1  1  ⋅  1  1  1  ⋅  ⋅  1  1
 ⋅  1  1  ⋅  ⋅  ⋅  1  1  1  ⋅
 ⋅  1  ⋅  ⋅  1  1  1  1  1  ⋅
 ⋅  ⋅  1  1  1  ⋅  ⋅  ⋅  1  ⋅
 1  ⋅  ⋅  ⋅  1  ⋅  ⋅  ⋅  ⋅  ⋅
 ⋅  1  1  1  1  ⋅  1  1  1  1
 ⋅  1  ⋅  1  ⋅  ⋅  1  ⋅  ⋅  ⋅
 ⋅  1  ⋅  1  1  ⋅  1  ⋅  1  ⋅
 1  1  ⋅  ⋅  ⋅  ⋅  ⋅  ⋅  1  ⋅

julia> N = similar(M, Float64)
10×10 SparseMatrixCSC{Float64, Int64} with 47 stored entries:
 1.00388e218    ⋅             ⋅            2.7451e-57     ⋅             ⋅         ⋅          ⋅         2.47e-322  0.0
 6.8074e199    1.38998e218    ⋅            9.02134e217   1.53991e-71   1.8e-322   ⋅          ⋅         2.5e-322   2.87e-322
  ⋅            7.52133e130   2.00183e-23    ⋅             ⋅             ⋅        1.93e-322  2.27e-322  2.57e-322   ⋅ 
  ⋅            2.84185e-52    ⋅             ⋅            3.88592e-109  1.9e-322  2.0e-322   2.3e-322   2.67e-322   ⋅ 
  ⋅             ⋅            1.42534e-23   7.53433e130   4.86184e-42    ⋅         ⋅          ⋅         2.7e-322    ⋅ 
 1.44062e-176   ⋅             ⋅             ⋅            2.36953e-177   ⋅         ⋅          ⋅          ⋅          ⋅ 
  ⋅            7.55274e130   1.44062e-176  1.44062e-176  2.21129e-312   ⋅        2.08e-322  2.37e-322  2.8e-322   0.0
  ⋅            1.44062e-176   ⋅            2.7451e-57     ⋅             ⋅        2.1e-322    ⋅          ⋅          ⋅ 
  ⋅            1.37144e-71    ⋅            9.02134e217   1.73e-322      ⋅        2.17e-322   ⋅         2.9e-322    ⋅ 
 1.44062e-176  1.00838e-176   ⋅             ⋅             ⋅             ⋅         ⋅          ⋅         2.96e-322   ⋅ 

julia> rand!(rng, Uniform(2, 3), nonzeros(N));

julia> N
10×10 SparseMatrixCSC{Float64, Int64} with 47 stored entries:
 2.62901   ⋅        ⋅       2.64196   ⋅        ⋅        ⋅        ⋅       2.3656   2.12425
 2.15288  2.85686   ⋅       2.1277   2.08739  2.56568   ⋅        ⋅       2.55904  2.90843
  ⋅       2.09181  2.169     ⋅        ⋅        ⋅       2.0948   2.95284  2.56943   ⋅ 
  ⋅       2.22459   ⋅        ⋅       2.42544  2.49985  2.74462  2.75591  2.68652   ⋅ 
  ⋅        ⋅       2.55349  2.84995  2.06013   ⋅        ⋅        ⋅       2.82558   ⋅ 
 2.57669   ⋅        ⋅        ⋅       2.76191   ⋅        ⋅        ⋅        ⋅        ⋅ 
  ⋅       2.83915  2.69511  2.3484   2.27129   ⋅       2.90834  2.75925  2.60227  2.67956
  ⋅       2.90274   ⋅       2.24145   ⋅        ⋅       2.96495   ⋅        ⋅        ⋅ 
  ⋅       2.73817   ⋅       2.17797  2.00311   ⋅       2.65701   ⋅       2.17602   ⋅ 
 2.74252  2.19169   ⋅        ⋅        ⋅        ⋅        ⋅        ⋅       2.53072   ⋅ 
2 Likes

Thank you very much.
Your code is a bit advanced for me, maybe I will go with this

import SparseArrays
import Distributions.Uniform as UDistri
import Distributions.Bernoulli as BDistri
function sp_rand(D, I, J, p)
    Bmat, Mmat = rand(BDistri(p), I, J), rand(D, I, J)
    return SparseArrays.sparse(Mmat .* Bmat)
end

If this is the case, I’ll suggest them to add this method. Since this is common in research work :slight_smile:

Here it is with some comments:

# import necessary packages
using SparseArrays, Distributions, Random
# initialize a random number generator with a seed so that
# results are reproducible (not mandatory but good practice)
rng = Xoshiro(1) 
# generate a Boolean "'mask" matrix to decide where the nonzeros are
M = sprand(rng, Bool, 10, 10, 0.5)
# allocate a matrix of the same size and nonzero structure
# but with a floating-point element type
N = similar(M, Float64)
# modify the non-zero coefficients of N in-place
# with the distribution of your choice
UD = Uniform(2, 3)
rand!(rng, UD, nonzeros(N));
# voila!

You can but it will be much less efficient in high dimension since you generate dense matrices before converting them to sparse format.

If you want the devs to be aware of that, the best way is to open an issue on the Distributions.jl repo.

2 Likes

I’m clearer now. But this is my first time learning about rand!, I’m still a bit confused.
If I want to in-place substitute the second row of a matrix, what should I do?

julia> b = 9.9 * ones(3, 3)
3×3 Matrix{Float64}:
 9.9  9.9  9.9
 9.9  9.9  9.9
 9.9  9.9  9.9

julia> rand!(b[2, :])
3-element Vector{Float64}:
 0.3852825154281183
 0.7043852897670511
 0.9401658209886775

julia> b   # This is unaltered
3×3 Matrix{Float64}:
 9.9  9.9  9.9
 9.9  9.9  9.9
 9.9  9.9  9.9

As a comparison, rand!(b) works. :thinking:

1 Like

This creates a copy of the values. You need to use a view instead. Try:

julia> rand!(view(b, 2, :))

2 Likes

I guess that the Bool and similar lines can be skipped,
whereby we end up with this (can we?)

function sp_rand(Distri, I, J, p)
    M = sprand(I, J, p)
    rand!(Distri, nonzeros(M))
    M
end

will the “Bool-precursor” (your version) be faster?

For the first time I know the usage of view (LOL).
The functionalities in Julia is something like:
if you don’t have an opportunity to use it, you’ll never have interests to learn it.

You can use the rfn argument of sprand for this:

julia> using Distributions, SparseArrays

julia> sprand(5,6,0.7, n -> rand(Uniform(-5,4),n))
5×6 SparseMatrixCSC{Float64, Int64} with 22 stored entries:
 -0.0667353  -0.758496   2.84818   ⋅       -0.0465101  -3.6376
   ⋅           ⋅         0.42936  2.03778   1.16782      ⋅ 
  0.732217    1.00664   -3.02505  2.40859  -4.59048     2.14958
   ⋅          2.42693    1.88046   ⋅        2.13845      ⋅ 
  2.06068    -4.15294   -3.18514   ⋅       -3.74181    -4.91287

Distributions.jl could make this easier by adding a method:

SparseArrays.sprand(dist::Sampleable, m::Integer, n::Integer, density::AbstractFloat) = sprand(m, n, density, n -> rand(dist, n))

I see that you already opened an issue (May I suggest adding the sprand(::Distributions.Distribution, ...) method, like rand() does? · Issue #1955 · JuliaStats/Distributions.jl · GitHub)

3 Likes

Yeah they might be superfluous indeed, I just wanted to show the conceptual difference between deciding the sparsity pattern (which can be done with Bool values) and drawing the actual coefficients.
I also didn’t know about rfn, which sounds like the right solution.