I’m trying to test the supposed boost in performance (as seen from Matlab) when AMD reordering is used before factorization with the `lu`

when solving Ax = b, where A is a real valued n \times n sparse matrix. Here is my attempt, with the same data (A is about 5200 \times 5200) loaded from the Matlab function (original formulation is J\Delta x = -F):

```
using SparseArrays, LinearAlgebra, AMD, BenchmarkTools, MAT
A = matread("test/case2869_J.mat")["J"];
b = -matread("test/case2869_F.mat")["F"];
p = amd(A);
native_sol(A, b) = A \ b
lu_sol(A, b) = lu(A) \ b
function lu_amd_sol(A, b, p)
x = zero.(b)
x[p] = lu(A[p,p]) \ b[p]
return x
end
@info("native"); @btime native_sol($A, $b); @assert(norm(A*native_sol(A,b) - b) <= 1e-10)
@info("lu"); @btime lu_sol($A, $b); @assert(norm(A*lu_sol(A,b) - b) <= 1e-10)
@info("lu amd"); @btime lu_amd_sol($A, $b, $p); @assert(norm(A*lu_amd_sol(A,b,p) - b) <= 1e-10)
```

Here is the result:

```
[ Info: native
13.180 ms (76 allocations: 7.79 MiB)
[ Info: lu
13.296 ms (74 allocations: 7.75 MiB)
[ Info: lu amd
13.655 ms (99 allocations: 8.66 MiB)
```

On Matlab the AMD+LU method is about 2.8 times faster (~4.9ms on average). Not only there is no performance improvement, the built in `\`

operator is even faster than either `lu`

or `lu_amd`

.

The sparsity pattern on J before and after AMD reordering are (as plotted with `spy`

on Matlab, and `Plot.spy`

on Julia gives the same result):

Any idea why this might be? Perhaps I’m doing something wrong?

Thanks in advance!