## Problem

Does Julia have an efficient way of **constructing a huge sparse matrix from a given list of entries u,v,w**, some of which can have the

**same locations**

`u,v`

, and in that case, their weights `w`

must be **summed**. Thus

`u,v,w`

are input vectors and I wish to create a sparse matrix that has value `w[i]`

at position `u[i],v[i]`

. For instance, the Mathematica code```
n=10^6; m=500*n;
u=RandomInteger[{1,n},m];
v=RandomInteger[{1,n},m];
w=RandomInteger[{-9, 9}, m]; AbsoluteTiming[
SetSystemOptions["SparseArrayOptions"->{"TreatRepeatedEntries"->1}];
a= SparseArray[{u,v}\[Transpose] -> w, {n,n}]; ]
```

requires 135sec and 60GB of RAM. The equivalent Python code

```
import scipy.sparse as sp
import numpy as np
import time
def ti(): return time.perf_counter()
n=10**6; m=500*n;
u=np.random.randint(0,n,size=m);
v=np.random.randint(0,n,size=m);
w=np.random.randint(-9,10,size=m); t0=ti();
a=sp.csr_matrix((w,(u,v)),shape=(n,n),dtype=int); t1=ti(); print(t1-t0)
```

needs 36sec and 20GB, but doesnâ€™t support (2). The equivalent Julia code

```
using SparseArrays;
m=n=10^6; r=500*n;
u=rand(1:m,r);
v=rand(1:n,r);
w=rand(-9:9,r);
function f() a=sparse(u,v,w,m,n,+); end;
@time f()
```

needs 155sec and 26GB (the @time macro incorrectly reports that it used only 15GB).

Is there a way to **make this construction more efficient**?

Are there any **Julia packages** that are better at this?

## Background

I have created a package for *linear algebra* and *homological algebra* computations. I did it in Mathematica, since the `SparseArray`

implementation (0) there is

- very efficient (fast and low RAM usage),
- supports exact fractions as matrix entries.
- supports polynomials as matrix entries.

However, it is not

- parallelized
- open source
- GPU or cluster enabled.

I am considering migrating to a different programming language. I have tried Python `SciPy.sparse`

, but it doesnâ€™t satisfy (2). Iâ€™m not sure if C++ libraries support (2). As a test, I tried sorting an array of floats of size 10^9 and compared the performance of Mathematica, Python `numpy`

, and Julia `ThreadsX`

. I was incredibly impressed by the latter (3x faster than numpy, 10x than Mathematica). I am considering migrating to Julia, but I wish to first make sure that my library would perform better than in Mathematica.

P.S. I also asked this question here.