# Why is ldiv! not updating the first argument when it is written in permuted format?

Hi all,
Is there any reason that `x_perm` is not updated by `ldiv!` in the same way that `x` is?

```julia
using LinearAlgebra, SparseArrays

A = sparse([1.0 0 0; 0 6 0; 1 0 9]);
b = ones(3);
x = zeros(3);
x_perm = zeros(3);
permV = [3, 1, 2]; # permV is a permutation vector
ldiv!(x, lu(A), b); # x is updated to [1.0, 0.16666666666666666, 0.0]
ldiv!(x_perm[permV], lu(A), b); # x_perm is not updated to [0.16666666666666666, 0.0, 1.0]
# x_perm[permV] .= ldiv!(x_perm[permV], lu(A), b); # this works, but I don't want to use it because it takes more allocations and time than ldiv!(x, lu(A), b)
```
`x_perm[permV]` makes a copy. Use `view(x_perm, permV)` instead.

Edit: Doesn't work with `ldiv!`, as Dan points out below.

2 Likes

As @skleinbo said, `x_perm[permV]` makes a copy, so it is no good for mutating `x_perm`. However, the `view(x_perm, permV)` route doesn't work either, because `ldiv!` is optimized for a specific `Vector` type as the destination (understandable from a performance point of view) and doesn't accept an `AbstractVector`, which would include a view.

This isnât a problem, as you can send a regular vector and then use a view of that vector:

```julia
julia> ldiv!(x, lu(A), b);

julia> x_perm = view(x, permV)
3-element view(::Vector{Float64}, [3, 1, 2]) with eltype Float64:
 0.0
 1.0
 0.16666666666666666
```

You might need the inverse permutation to get the right view, e.g. `view(x, invperm(permV))`.
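A small sketch of the difference (the values here are made up for illustration):

```julia
permV = [3, 1, 2]
x = [10.0, 20.0, 30.0]

# view(x, permV) reads x[3], x[1], x[2]:
collect(view(x, permV))          # [30.0, 10.0, 20.0]

# If instead position permV[i] of the result should hold x[i],
# use the inverse permutation:
collect(view(x, invperm(permV))) # [20.0, 30.0, 10.0]
```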

3 Likes

Thanks a lot for your reply.
In my real code, the solution is split by two `ldiv!` calls into two parts, `x_1` and `x_2`, which I need to map into `x` as below:

```julia
x[1] = x_1[2];
x[2] = x_2[2];
x[3] = x_1[1];
x[4] = x_2[1];
```

I tried the code below, but it did not work.

```julia
A = sparse([1.0 2 0 0; 1 6 0 0; 0 0 9 1.6; 0 0 4 1]);
b = ones(4);
x = zeros(4);
x_1 = zeros(2);
x_2 = zeros(2);
permV = [3, 1, 4, 2];
x[permV[1:2]] = view(x_1, [1,2]);
x[permV[3:4]] = view(x_2, [1,2]);
ldiv!(x_1, lu(A[1:2,1:2]), b[1:2]);
ldiv!(x_2, lu(A[3:4,3:4]), b[3:4]);

julia> x
4-element Vector{Float64}:
 0.0
 0.0
 0.0
 0.0
```

`x` is just a regular Array containing plain data. It cannot hold a view into other arrays, let alone two distinct views. What would the data layout of `x` look like?

So the second line above just creates a view into `x_1` and then copies the data from that view into `x`. Mutating `x_1` after that has no effect on `x`.
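A minimal sketch of that behavior:

```julia
x = zeros(4)
x_1 = zeros(2)

x[[3, 1]] = view(x_1, [1, 2])  # copies the current (zero) values into x
x_1 .= [7.0, 8.0]              # mutating x_1 afterwards...
# ...has no effect on x: the copy already happened, so x is still all zeros
```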

1 Like

Thanks for your details. I have two questions, please:
1- Is there a way in Julia to build the mapping below between the three vectors `x`, `x_1`, `x_2`, so that when I change `x_1` and `x_2`, `x` changes accordingly?

```julia
x[1] = x_1[2];
x[2] = x_2[2];
x[3] = x_1[1];
x[4] = x_2[1];
```

2- Can I do something in Julia similar to the C code below?

```c
int *pc, c;
c = 5;
pc = &c;
```

I'm slightly struggling to understand the calculation. Does the following get the result you need?

```julia
A1 = [1.0 2.0; 1.0 6.0]
A2 = [9.0 1.6; 4.0 1.0]
b1 = ones(2)
b2 = ones(2)
x1 = A1 \ b1  # same as ldiv! but returns result as new vector
x2 = A2 \ b2
x = [x1[2], x2[2], x1[1], x2[1]]  # put the elements in x
```

The result BTW is:

```julia
julia> x = [x1[2], x2[2], x1[1], x2[1]]
4-element Vector{Float64}:
  0.0
  1.9230769230769234
  1.0
 -0.23076923076923084
```

As for the other question… it is almost certain you won't need the equivalent in Julia. If you are translating code from C to Julia, it is a good opportunity to clean it up from all the C dangers like pointer aliasing, and to try and adopt new Julian habits.

1 Like

Thank you for your reply.
Actually, I am trying to use `ldiv!` rather than `\` to keep good performance (avoid creating a new vector), because the solve is repeated inside a time-domain for loop.

Sorry for the confusion. Simply put, I want to get rid of the error below, given that 1) I need to keep using `ldiv!` for performance, and 2) the result of `ldiv!` (`x_1`) is a permuted view of another, bigger vector `x`.

```julia
A = sparse([1.0 2 0 0; 1 6 0 0; 0 0 9 1.6; 0 0 4 1]);
b = ones(4);
x = zeros(4);
permV = [3, 1, 4, 2];
x_1 = @view(x[permV[1:2]]);
ldiv!(x_1, lu(A[1:2,1:2]), b[1:2]);
ERROR: MethodError: no method matching ldiv!(::SuiteSparse.UMFPACK.UmfpackLU{Float64, Int64}, ::SubArray{Float64, 1, Vector{Float64}, Tuple{Vector{Int64}}, false})
```

Thanks for your suggestion. Yes, I am still trying to adopt the new Julia habits and find alternatives to `@view` (which maps the locations of two arrays), which looks like it works like pointers in C.

IMHO the best way to optimize code using the "Julia community" is to post something that runs, so the community can benchmark and tinker with it. Perhaps there is another way, not using `ldiv!` or `\`, but it is easier to figure this out when we know what the calculation is supposed to do (for example, include the outer `for` loop).

1 Like

Thank you for your suggestion. Here is my MWE.

```julia
A = sparse([
    0.01 0 0 0 0 0 0 0 0 -0.001 0 0 0 0 0
    0 0.01 0 0 0 0 0 0 0 0 -0.001 0 0 0 0
    0 0 0.01 0 0 0 0 0 0 0 0 -0.001 0 0 0
    0 0 0 1.0 0 0 -1.0 0 0 0 0 0 1.0 0 0
    0 0 0 0 1.0 0 0 -1.0 0 0 0 0 0 1.0 0
    0 0 0 0 0 1.0 0 0 -1.0 0 0 0 0 0 1.0
    0 0 0 -1.0 0 0 1.003 -0.0007 -0.0007 0 0 0 0 0 0
    0 0 0 0 -1.0 0 -0.0007 1.003 -0.0007 0 0 0 0 0 0
    0 0 0 0 0 -1.0 -0.0007 -0.0007 1.00302 0 0 0 0 0 0
    -0.001 0 0 0 0 0 0 0 0 0.004 -0.0007 -0.0007 0 0 0
    0 -0.001 0 0 0 0 0 0 0 -0.0007 0.004 -0.0007 0 0 0
    0 0 -0.001 0 0 0 0 0 0 -0.0007 -0.0007 0.004 0 0 0
    0 0 0 1.0 0 0 0 0 0 0 0 0 0 0 0
    0 0 0 0 1.0 0 0 0 0 0 0 0 0 0 0
    0 0 0 0 0 1.0 0 0 0 0 0 0 0 0 0
]);
b = ones(size(A,1));
x = zeros(length(b));
p = [1,2,3,11,12,10,4,5,6,7,8,9,13,14,15];
q = [1,2,3,11,12,10,13,14,15,7,8,9,4,5,6];
edge = 6;
b_1 = @view( b[p[begin:edge]] );
b_2 = @view( b[p[edge+1:end]] );
x_1 = @view( x[q[begin:edge]] );
x_2 = @view( x[q[edge+1:end]] );
timeSim = 0:0.0001:1;
Record_x = zeros(length(x), length(timeSim));
iter = 0;
for t in timeSim
    global iter += 1;
    b .= b.*(1+t) .+ t;
    A.nzval .= A.nzval.*(1+t) .+ t;
    ldiv!(x_1, lu(A[p,q][begin:edge,begin:edge]), b_1);
    ldiv!(x_2, lu(A[p,q][edge+1:end,edge+1:end]), b_2);
    Record_x[:,iter] .= x;
end
ERROR: MethodError: no method matching ldiv!(::KLU.KLUFactorization{Float64, Int64}, ::SubArray{Float64, 1, Vector{Float64}, Tuple{Vector{Int64}}, false})
```

Could you include an actual for loop to show what needs to happen inside it? If you put a loop where you indicate it now, it would be strange, because there is no update step, just the exact same operation over and over.

I think your minimal example is too minimal.

(Yet, on the other hand, the part with the sparse matrix and `klu` seems superfluous; it isn't relevant to the core of your question. That part could be simplified.)

1 Like

Thanks for your suggestion. Please refer to the updated version based on your comment.

If a primary concern is to avoid allocations in a loop, you could just preallocate a work vector, copy your values in, solve the problem, and copy them back where you want them. A vector solve is O(N^2) if the matrix is factorized (O(N^3) if it isn't), while a vector copy is only O(N). This allows the optimized BLAS library to work on the view-less work vector and only costs a quick copy of the elements to the output vector you want.

To go back to one of your simpler examples:

```julia
A = sparse([1.0 2 0 0; 1 6 0 0; 0 0 9 1.6; 0 0 4 1]);
b = ones(4);
x = zeros(4);
permV = [3, 1, 4, 2];
workvec = similar(x, 2); # work vector with length 2
ldiv!(lu(A[1:2,1:2]), copy!(workvec, @view b[1:2]));
x[@view permV[1:2]] = workvec;
# x[@view permV[1:2]] = ldiv!(lu(A[1:2,1:2]), copy!(workvec, @view b[1:2])); # equivalent to the above
```

On the other hand, BLAS does generally accept strided views (views over regular ranges of indices). Something like `ldiv!(lu(A), @view x[3:-1:1])` would likely perform virtually as well as the view-less version. But from your more detailed example, it looks like that isn't your case. You'll be better off copying to/from a temporary vector.
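As a sketch of the distinction (using a dense matrix here, since it is the sparse UMFPACK solver that rejects views): a contiguous-range view is a strided vector, and the dense LU solver accepts it and solves in place, while a gather view (a vector of indices) is what triggers the `MethodError` above.

```julia
using LinearAlgebra

A = [1.0 2.0; 1.0 6.0]
x = [1.0, 1.0, 99.0]   # the first two entries hold the right-hand side

# A contiguous view is strided; the dense LU solver solves it in place,
# leaving the rest of x untouched.
xv = @view x[1:2]
ldiv!(lu(A), xv)
# x is now [1.0, 0.0, 99.0]
```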

1 Like

One of the main performance concerns is getting out of global scope. That is, do all the calculations within a function and avoid using global variables. It is pretty useless to optimize before doing this.

As an example, which might not be exactly what you want, here is a version of your code inside a function, with some changes I think should help performance (it may not work correctly as-is, though; you should check and/or adapt your code with ideas from this):

```julia
function docalc()
    A = sparse(
        [
            0.01 0 0 0 0 0 0 0 0 -0.001 0 0 0 0 0
            0 0.01 0 0 0 0 0 0 0 0 -0.001 0 0 0 0
            0 0 0.01 0 0 0 0 0 0 0 0 -0.001 0 0 0
            0 0 0 1.0 0 0 -1.0 0 0 0 0 0 1.0 0 0
            0 0 0 0 1.0 0 0 -1.0 0 0 0 0 0 1.0 0
            0 0 0 0 0 1.0 0 0 -1.0 0 0 0 0 0 1.0
            0 0 0 -1.0 0 0 1.003 -0.0007 -0.0007 0 0 0 0 0 0
            0 0 0 0 -1.0 0 -0.0007 1.003 -0.0007 0 0 0 0 0 0
            0 0 0 0 0 -1.0 -0.0007 -0.0007 1.00302 0 0 0 0 0 0
            -0.001 0 0 0 0 0 0 0 0 0.004 -0.0007 -0.0007 0 0 0
            0 -0.001 0 0 0 0 0 0 0 -0.0007 0.004 -0.0007 0 0 0
            0 0 -0.001 0 0 0 0 0 0 -0.0007 -0.0007 0.004 0 0 0
            0 0 0 1.0 0 0 0 0 0 0 0 0 0 0 0
            0 0 0 0 1.0 0 0 0 0 0 0 0 0 0 0
            0 0 0 0 0 1.0 0 0 0 0 0 0 0 0 0
        ],
    )
    L = size(A, 1)
    b = ones(L)
    x = zeros(L)

    p = [1, 2, 3, 11, 12, 10, 4, 5, 6, 7, 8, 9, 13, 14, 15]
    q = [1, 2, 3, 11, 12, 10, 13, 14, 15, 7, 8, 9, 4, 5, 6]
    invq = invperm(q)
    edge = 6

    A1 = A[p, q][begin:edge, begin:edge]
    A2 = A[p, q][edge+1:end, edge+1:end]
    b1 = b[p[begin:edge]]
    b2 = b[p[edge+1:end]]
    x1 = similar(b1)
    x2 = similar(b2)

    timeSim = 0:0.1:1
    # timeSim = 0:0.0001:1
    record = zeros(L, length(timeSim))
    for (i, t) in enumerate(timeSim)
        b1 .= b1 .* (1 + t) .+ t
        b2 .= b2 .* (1 + t) .+ t
        nonzeros(A1) .= nonzeros(A1) .* (1 + t) .+ t
        nonzeros(A2) .= nonzeros(A2) .* (1 + t) .+ t

        ldiv!(x1, lu(A1), b1)
        ldiv!(x2, lu(A2), b2)
        record[invq[begin:edge], i] .= x1
        record[invq[edge+1:end], i] .= x2
    end
    return record
end
```

At one point it threw a singular-matrix error at me, but with a shorter loop the function returned the `record` matrix okay.

2 Likes

Actually, performance is also my critical point.

Thanks for your helpful response and support!

• Is there a way to avoid using the temporary vector, given that `b` should not be changed by `ldiv!`?
• I just want to confirm: pointers do not exist in Julia as they do in C, right? Or at least cannot be used here?

Thanks for your helpful response and support!
Actually, I am changing `b` rather than `b1` and `b2` separately, and the same goes for `A`. Moreover, in each iteration the value of `x` is needed to modify `b`. Is the below the only possible way, given that performance is important?

```julia
function docalc()
    A = sparse(
        [
            0.01 0 0 0 0 0 0 0 0 -0.001 0 0 0 0 0
            0 0.01 0 0 0 0 0 0 0 0 -0.001 0 0 0 0
            0 0 0.01 0 0 0 0 0 0 0 0 -0.001 0 0 0
            0 0 0 1.0 0 0 -1.0 0 0 0 0 0 1.0 0 0
            0 0 0 0 1.0 0 0 -1.0 0 0 0 0 0 1.0 0
            0 0 0 0 0 1.0 0 0 -1.0 0 0 0 0 0 1.0
            0 0 0 -1.0 0 0 1.003 -0.0007 -0.0007 0 0 0 0 0 0
            0 0 0 0 -1.0 0 -0.0007 1.003 -0.0007 0 0 0 0 0 0
            0 0 0 0 0 -1.0 -0.0007 -0.0007 1.00302 0 0 0 0 0 0
            -0.001 0 0 0 0 0 0 0 0 0.004 -0.0007 -0.0007 0 0 0
            0 -0.001 0 0 0 0 0 0 0 -0.0007 0.004 -0.0007 0 0 0
            0 0 -0.001 0 0 0 0 0 0 -0.0007 -0.0007 0.004 0 0 0
            0 0 0 1.0 0 0 0 0 0 0 0 0 0 0 0
            0 0 0 0 1.0 0 0 0 0 0 0 0 0 0 0
            0 0 0 0 0 1.0 0 0 0 0 0 0 0 0 0
        ],
    )
    L = size(A, 1)
    b = ones(L)
    x = zeros(L)
    p = [1, 2, 3, 11, 12, 10, 4, 5, 6, 7, 8, 9, 13, 14, 15]
    q = [1, 2, 3, 11, 12, 10, 13, 14, 15, 7, 8, 9, 4, 5, 6]
    invq = invperm(q)
    edge = 6
    b1 = @view( b[p[begin:edge]] );
    b2 = @view( b[p[edge+1:end]] );
    x1 = similar(b1)
    x2 = similar(b2)
    timeSim = 0:0.1:1
    # timeSim = 0:0.0001:1
    record = zeros(L, length(timeSim))
    for (i, t) in enumerate(timeSim)
        b .= b .* (1 .+ x) .+ t;
        nonzeros(A) .= nonzeros(A) .* (1 + t) .+ t
        ldiv!(x1, lu(A[p,q][begin:edge,begin:edge]), b1)
        ldiv!(x2, lu(A[p,q][edge+1:end, edge+1:end]), b2)
        x[invq[begin:edge]] .= x1
        x[invq[edge+1:end]] .= x2
        record[:, i] .= x
    end
    return record
end
```
1 Like

The suggestions from Dan are spot on. I'll suggest you examine them closely.

There is virtually no intrinsic benefit to doing this as a single operation. You could do it element-by-element as a loop and make it just as fast, too.
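Using the mapping from earlier in the thread, a minimal sketch of such a scatter loop (the `invq` destination indices here are made up for illustration):

```julia
x = zeros(4)
x_1 = [0.5, 1.5]
x_2 = [2.5, 3.5]
invq = [3, 1, 4, 2]  # destination index in x for each solved element

# Scatter element by element; for small sizes a plain loop is as fast
# as any fused broadcast.
for (i, xi) in enumerate(x_1)
    x[invq[i]] = xi
end
for (i, xi) in enumerate(x_2)
    x[invq[2 + i]] = xi
end
# x is now [x_1[2], x_2[2], x_1[1], x_2[1]]
```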

Meanwhile, there are many unnecessary allocations in your code. `lu(A[p,q][begin:edge,begin:edge])` alone uses two more than necessary (better as `lu!(A[@view(p[begin:edge]), @view(q[begin:edge])])`; note that views of range indices of the form `a:c` or `a:b:c` are usually efficient). Your singular quest for specific micro-optimizations is ignoring more glaring issues. Were you to benchmark both your and Dan's code, I think theirs would perform much better (but follow their suggestion to verify it matches your functionality).

• `b` not changing is not helpful here. `ldiv!(LU, b)` can overwrite `b` with the result and requires no other vector, period. `ldiv!(x, LU, b)` is very likely implemented as something like `ldiv!(LU, copy!(x, b))` anyway. In any case, if not everything is strided, then you won't be able to use the optimized linear-algebra libraries, and performance will suffer more than any gain you'd expect to recover.
• Pointers literally exist in Julia (as `Ptr`) but are rarely used (except for interfacing with C libraries). `Ref` is the Julia version of a pointer and is used sometimes, but I don't think it's of any use to you here.
```julia
julia> a = Ref(3)
Base.RefValue{Int64}(3)

julia> b = a
Base.RefValue{Int64}(3)

julia> a[] = 5
5

julia> b[]
5
```

All this said, the expensive part of this whole thing is going to be those `lu` calls, especially if `A` gets bigger. What you've written could be done more optimally (Dan's code is much closer), but ultimately I don't expect it to make a huge difference. Also, at the current size and sparsity of your MWE, I would imagine making `A` a sparse matrix actually results in worse performance than a plain `Matrix`.
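To illustrate the first bullet above, a minimal sketch where the two-argument `ldiv!` reuses `b` as both input and output (matrix values made up for illustration):

```julia
using LinearAlgebra

A = [2.0 0.0; 0.0 4.0]
b = [2.0, 4.0]

F = lu(A)
ldiv!(F, b)   # overwrites b with A \ b; no extra vector is needed
# b is now [1.0, 1.0]
```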

1 Like

Thank you very much for your explanation.

Thanks for this. I tried to make `a` refer to the first and third values of `s`. However, changing `a[][2]` does not change `s[3]`. It seems that `a = Ref([s[1],s[3]])` makes `a` a reference to a new array `[s[1],s[3]]`. Is there a way to create `a` as needed (i.e., referring to the first and third values of `s`)?

```julia
julia> s = [9,7,4]
3-element Vector{Int64}:
 9
 7
 4

julia> a = Ref([s[1],s[3]])
Base.RefValue{Vector{Int64}}([9, 4])

julia> a[][2] = 100
100

julia> s
3-element Vector{Int64}:
 9
 7
 4
```

`Ref` wraps a value. `[s[1],s[3]]` creates a new array, and `Ref` creates a reference to this new array (distinct from any of the old values). If you want to reference an index in an array, pass a second index argument to `Ref`, as in:

```julia
julia> x = [1,2,3]
3-element Vector{Int64}:
 1
 2
 3

julia> b = Ref(x, 3)
Base.RefArray{Int64, Vector{Int64}, Nothing}([1, 2, 3], 3, nothing)

julia> b[] = 8
8

julia> x
3-element Vector{Int64}:
 1
 2
 8
```

While this is technically different from a `view` of a single index, I don't think it differs in most of the ways that matter (except that a `view` creates a subarray out of a larger array, i.e., a view of a single index is a 0-dimensional array, whereas a `Ref` just wraps a value of any type and isn't really considered an array at all, except that it broadcasts like a 0-dim array).

So for your example `a = Ref(...)` above, you just want `a = view(s, [1,3])`. You could jerry-rig this as `a = [Ref(s,1), Ref(s,3)]`, but that is the same thing, just much less ergonomic to work with.
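For instance, the `view` version propagates writes straight back to `s` (a quick sketch):

```julia
s = [9, 7, 4]
a = view(s, [1, 3])

a[2] = 100       # writes through to s[3]; s is now [9, 7, 100]

a .= [500, 200]  # broadcast assignment also updates s in place
# s is now [500, 7, 200]
```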

The reason we almost never use "proper" pointers in Julia (and even `Ref` is only used in certain contexts) is that there are other constructs that allow us to accomplish the same things with similar efficiency. The same goes for other concepts and constructs used in other languages: Julia almost always has a way to accomplish the same goals with similar performance, although the solution may look rather different.

Iâm getting the sense that youâre somewhat new to Julia (please forgive me if Iâm mistaken!). Welcome! Iâll suggest that you spend some time writing Julia code without worrying quite so much about specific optimizations. The questions youâre asking here are very good, but some of them are challenging to answer well because it seems there are still some concepts and behaviors in Julia that you could benefit from understanding a little better. With every new language comes new syntax, new semantics, and new mental models of what does and doesnât work well. When something doesnât perform as well as you think it could/should, post the part of the code youâre working on and ask for advice (the better the MWE, the better people can help you) and youâll likely get multiple people making excellent suggestions. You can learn from their advice (common advice has already been collected in the Performance Tips section of the manual) and youâll quickly learn how to write efficient code from the start (and to predict when and where efficient code is actually important).

1 Like

Thank you very much for your explanation, which is really helpful.
My last question: I want to update the contents of `a` with a simple syntax (not `a[1][], a[2][] = [500,200]`). Could you please guide me on how to write it?

```julia
s = [9,7,4];
a = [Ref(s,1), Ref(s,3)];

julia> a[:][] = [100,200]
ERROR: BoundsError: attempt to access 2-element Vector{Base.RefArray{Int64, Vector{Int64}, Nothing}} at index []
```