# Why is ldiv! not updating the first argument when it is written in permuted format?

Hi all,
Is there any reason that `x_perm` is not updated by `ldiv!` in the same way that `x` is?

```julia
using LinearAlgebra, SparseArrays

A = sparse([1.0 0 0; 0 6 0; 1 0 9]);
b = ones(3);
x = zeros(3);
x_perm = zeros(3);
permV = [3, 1, 2]; # permV is a permutation vector
ldiv!(x, lu(A), b); # x is updated to [1.0, 0.16666666666666666, 0.0]
ldiv!(x_perm[permV], lu(A), b); # x_perm is not updated to [0.16666666666666666, 0.0, 1.0]
# x_perm[permV] .= ldiv!(x_perm[permV], lu(A), b); # this works, but I don't want to use it because it takes more allocations and time than ldiv!(x, lu(A), b)
```
`x_perm[permV]` makes a copy. Use `view(x_perm, permV)` instead.

Edit: Doesn't work with `ldiv!`, as Dan points out below.

2 Likes

As @skleinbo said, `x_perm[permV]` makes a copy, so it is no good for mutating `x_perm`. However, the `view(x_perm, permV)` route doesn't work either, because `ldiv!` is optimized for a specific `Vector` type as the destination (understandable from a performance point of view) and doesn't accept an `AbstractVector`, which would include a view.

This isnât a problem, as you can send a regular vector and then use a view of that vector:

```julia
julia> ldiv!(x, lu(A), b);

julia> x_perm = view(x, permV)
3-element view(::Vector{Float64}, [3, 1, 2]) with eltype Float64:
 0.0
 1.0
 0.16666666666666666
```

You might need the inverse permutation to get the right view, e.g. `view(x, invperm(permV))`.
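A small sketch of the difference (the values here are made up for illustration):

```julia
permV = [3, 1, 2]
x = [10.0, 20.0, 30.0]

# view(x, permV) reads x[3], x[1], x[2]:
collect(view(x, permV))          # [30.0, 10.0, 20.0]

# If instead position permV[i] of the result should hold x[i],
# use the inverse permutation:
collect(view(x, invperm(permV))) # [20.0, 30.0, 10.0]
```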

3 Likes

Thanks a lot for your reply.
In my real code, the solution is split by two `ldiv!` calls into two parts, `x_1` and `x_2`, which I need to map into `x` as below:

```julia
x[1] = x_1[2];
x[2] = x_2[2];
x[3] = x_1[1];
x[4] = x_2[1];
```

I tried the code below, but it did not work.

```julia
A = sparse([1.0 2 0 0; 1 6 0 0; 0 0 9 1.6; 0 0 4 1]);
b = ones(4);
x = zeros(4);
x_1 = zeros(2);
x_2 = zeros(2);
permV = [3, 1, 4, 2];
x[permV[1:2]] = view(x_1, [1,2]);
x[permV[3:4]] = view(x_2, [1,2]);
ldiv!(x_1, lu(A[1:2,1:2]), b[1:2]);
ldiv!(x_2, lu(A[3:4,3:4]), b[3:4]);

julia> x
4-element Vector{Float64}:
 0.0
 0.0
 0.0
 0.0
```

`x` is just a regular Array containing plain data. It cannot hold a view into other arrays, let alone two distinct views. What would the data layout of `x` look like?

So the second line above just creates a view into `x_1` and then copies the data from that view into `x`. Mutating `x_1` after that has no effect on `x`.
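A minimal sketch of that behavior:

```julia
x = zeros(4)
x_1 = zeros(2)

x[[3, 1]] = view(x_1, [1, 2])  # copies the current (zero) values into x
x_1 .= [7.0, 8.0]              # mutating x_1 afterwards...
# ...has no effect on x: the copy already happened, so x is still all zeros
```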

1 Like

Thanks for your details. I have two questions, please:
1- Is there a way in Julia to build the mapping below between the three vectors `x`, `x_1`, `x_2`, so that when I change `x_1` and `x_2`, `x` changes accordingly?

```julia
x[1] = x_1[2];
x[2] = x_2[2];
x[3] = x_1[1];
x[4] = x_2[1];
```

2- Can I do something in Julia similar to the C code below?

```c
int *pc, c;
c = 5;
pc = &c;
```

I'm slightly struggling to understand the calculation. Does the following get the result you need?

```julia
A1 = [1.0 2.0; 1.0 6.0]
A2 = [9.0 1.6; 4.0 1.0]
b1 = ones(2)
b2 = ones(2)
x1 = A1 \ b1  # same as ldiv! but returns result as new vector
x2 = A2 \ b2
x = [x1[2], x2[2], x1[1], x2[1]]  # put the elements in x
```

The result BTW is:

```julia
julia> x = [x1[2], x2[2], x1[1], x2[1]]
4-element Vector{Float64}:
  0.0
  1.9230769230769234
  1.0
 -0.23076923076923084
```

As for the other question… it is almost certain you won't need the equivalent in Julia. If you are translating code from C to Julia, it is a good opportunity to clean it up from all the C dangers like pointer aliasing, and to try and adopt new Julian habits.

1 Like

Thank you for your reply.
Actually, I am trying to use `ldiv!` rather than `\` to keep good performance (avoid creating a new vector), because the solve is repeated inside a time-domain for loop.

Sorry for the confusion. Simply put, I want to get rid of the error below, given that 1) I need to keep using `ldiv!` for performance, and 2) the result of `ldiv!` (`x_1`) is a permuted view of another, bigger vector `x`.

```julia
A = sparse([1.0 2 0 0; 1 6 0 0; 0 0 9 1.6; 0 0 4 1]);
b = ones(4);
x = zeros(4);
permV = [3, 1, 4, 2];
x_1 = @view(x[permV[1:2]]);
ldiv!(x_1, lu(A[1:2,1:2]), b[1:2]);
ERROR: MethodError: no method matching ldiv!(::SuiteSparse.UMFPACK.UmfpackLU{Float64, Int64}, ::SubArray{Float64, 1, Vector{Float64}, Tuple{Vector{Int64}}, false})
```

Thanks for your suggestion. Yes, I am still trying to adopt the new Julia habits and find alternatives to `@view` (which maps the locations of two arrays), which looks like it works like pointers in C.

IMHO the best way to optimize code using the "Julia community" is to post something that runs, so the community can benchmark and tinker with it. Perhaps there is another way, not using `ldiv!` or `\`, but it is easier to figure this out when we know what the calculation is supposed to do (for example, include the outer `for` loop).

1 Like

Thank you for your suggestion. Here is my MWE.

```julia
A = sparse([
    0.01 0 0 0 0 0 0 0 0 -0.001 0 0 0 0 0
    0 0.01 0 0 0 0 0 0 0 0 -0.001 0 0 0 0
    0 0 0.01 0 0 0 0 0 0 0 0 -0.001 0 0 0
    0 0 0 1.0 0 0 -1.0 0 0 0 0 0 1.0 0 0
    0 0 0 0 1.0 0 0 -1.0 0 0 0 0 0 1.0 0
    0 0 0 0 0 1.0 0 0 -1.0 0 0 0 0 0 1.0
    0 0 0 -1.0 0 0 1.003 -0.0007 -0.0007 0 0 0 0 0 0
    0 0 0 0 -1.0 0 -0.0007 1.003 -0.0007 0 0 0 0 0 0
    0 0 0 0 0 -1.0 -0.0007 -0.0007 1.00302 0 0 0 0 0 0
    -0.001 0 0 0 0 0 0 0 0 0.004 -0.0007 -0.0007 0 0 0
    0 -0.001 0 0 0 0 0 0 0 -0.0007 0.004 -0.0007 0 0 0
    0 0 -0.001 0 0 0 0 0 0 -0.0007 -0.0007 0.004 0 0 0
    0 0 0 1.0 0 0 0 0 0 0 0 0 0 0 0
    0 0 0 0 1.0 0 0 0 0 0 0 0 0 0 0
    0 0 0 0 0 1.0 0 0 0 0 0 0 0 0 0
]);
b = ones(size(A,1));
x = zeros(length(b));
p = [1,2,3,11,12,10,4,5,6,7,8,9,13,14,15];
q = [1,2,3,11,12,10,13,14,15,7,8,9,4,5,6];
edge = 6;
b_1 = @view( b[p[begin:edge]] );
b_2 = @view( b[p[edge+1:end]] );
x_1 = @view( x[q[begin:edge]] );
x_2 = @view( x[q[edge+1:end]] );
timeSim = 0:0.0001:1;
Record_x = zeros(length(x), length(timeSim));
iter = 0;
for t in timeSim
    global iter += 1;
    b .= b.*(1+t) .+ t;
    A.nzval .= A.nzval.*(1+t) .+ t;
    ldiv!(x_1, lu(A[p,q][begin:edge,begin:edge]), b_1);
    ldiv!(x_2, lu(A[p,q][edge+1:end,edge+1:end]), b_2);
    Record_x[:,iter] .= x;
end
ERROR: MethodError: no method matching ldiv!(::KLU.KLUFactorization{Float64, Int64}, ::SubArray{Float64, 1, Vector{Float64}, Tuple{Vector{Int64}}, false})
```

Could you include an actual for loop to show what needs to happen inside it? If you put a loop where you indicate it now, it would be strange, because there is no update step, just the exact same operation over and over.

I think your minimal example is too minimal.

(Yet, on the other hand, the part with the sparse matrix and `klu` seems superfluous; it isn't relevant to the core of your question. That part could be simplified.)

1 Like

Thanks for your suggestion. Please refer to the updated version based on your comment.

If a primary concern is to avoid allocations in a loop, you could just preallocate a work vector, copy your values in, solve the problem, and copy them back where you want them. A vector solve is O(N^2) if the matrix is factorized (O(N^3) if it isn't), while a vector copy is only O(N). This allows the optimized BLAS library to work on the view-less work vector and only costs a quick copy of the elements to the output vector you want.

To go back to one of your simpler examples:

```julia
A = sparse([1.0 2 0 0; 1 6 0 0; 0 0 9 1.6; 0 0 4 1]);
b = ones(4);
x = zeros(4);
permV = [3, 1, 4, 2];
workvec = similar(x, 2); # work vector with length 2
ldiv!(lu(A[1:2,1:2]), copy!(workvec, @view b[1:2]));
x[@view permV[1:2]] = workvec;
# x[@view permV[1:2]] = ldiv!(lu(A[1:2,1:2]), copy!(workvec, @view b[1:2])); # equivalent to the above
```

On the other hand, BLAS does generally accept strided views (views over regular ranges of indices). Something like `ldiv!(lu(A), @view x[3:-1:1])` would likely perform virtually as well as the view-less version. But from your more detailed example, it looks like that isn't your case. You'll be better off copying to/from a temporary vector.
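As a sketch of the distinction (using a dense matrix here, since it is the sparse UMFPACK solver that rejects views): a contiguous-range view is a strided vector, and the dense LU solver accepts it and solves in place, while a gather view (a vector of indices) is what triggers the `MethodError` above.

```julia
using LinearAlgebra

A = [1.0 2.0; 1.0 6.0]
x = [1.0, 1.0, 99.0]   # the first two entries hold the right-hand side

# A contiguous view is strided; the dense LU solver solves it in place,
# leaving the rest of x untouched.
xv = @view x[1:2]
ldiv!(lu(A), xv)
# x is now [1.0, 0.0, 99.0]
```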

1 Like

One of the main performance concerns is getting out of global scope. That is, do all the calculations within a function and avoid using global variables. It is pretty useless to optimize before doing this.

As an example, which might not be exactly what you want, here is a version of your code inside a function, with some changes I think should help performance (it may not work correctly as-is, though; you should check and/or adapt your code with ideas from this):

```julia
function docalc()
    A = sparse(
        [
            0.01 0 0 0 0 0 0 0 0 -0.001 0 0 0 0 0
            0 0.01 0 0 0 0 0 0 0 0 -0.001 0 0 0 0
            0 0 0.01 0 0 0 0 0 0 0 0 -0.001 0 0 0
            0 0 0 1.0 0 0 -1.0 0 0 0 0 0 1.0 0 0
            0 0 0 0 1.0 0 0 -1.0 0 0 0 0 0 1.0 0
            0 0 0 0 0 1.0 0 0 -1.0 0 0 0 0 0 1.0
            0 0 0 -1.0 0 0 1.003 -0.0007 -0.0007 0 0 0 0 0 0
            0 0 0 0 -1.0 0 -0.0007 1.003 -0.0007 0 0 0 0 0 0
            0 0 0 0 0 -1.0 -0.0007 -0.0007 1.00302 0 0 0 0 0 0
            -0.001 0 0 0 0 0 0 0 0 0.004 -0.0007 -0.0007 0 0 0
            0 -0.001 0 0 0 0 0 0 0 -0.0007 0.004 -0.0007 0 0 0
            0 0 -0.001 0 0 0 0 0 0 -0.0007 -0.0007 0.004 0 0 0
            0 0 0 1.0 0 0 0 0 0 0 0 0 0 0 0
            0 0 0 0 1.0 0 0 0 0 0 0 0 0 0 0
            0 0 0 0 0 1.0 0 0 0 0 0 0 0 0 0
        ],
    )
    L = size(A, 1)
    b = ones(L)
    x = zeros(L)

    p = [1, 2, 3, 11, 12, 10, 4, 5, 6, 7, 8, 9, 13, 14, 15]
    q = [1, 2, 3, 11, 12, 10, 13, 14, 15, 7, 8, 9, 4, 5, 6]
    invq = invperm(q)
    edge = 6

    A1 = A[p, q][begin:edge, begin:edge]
    A2 = A[p, q][edge+1:end, edge+1:end]
    b1 = b[p[begin:edge]]
    b2 = b[p[edge+1:end]]
    x1 = similar(b1)
    x2 = similar(b2)

    timeSim = 0:0.1:1
    # timeSim = 0:0.0001:1
    record = zeros(L, length(timeSim))
    for (i, t) in enumerate(timeSim)
        b1 .= b1 .* (1 + t) .+ t
        b2 .= b2 .* (1 + t) .+ t
        nonzeros(A1) .= nonzeros(A1) .* (1 + t) .+ t
        nonzeros(A2) .= nonzeros(A2) .* (1 + t) .+ t

        ldiv!(x1, lu(A1), b1)
        ldiv!(x2, lu(A2), b2)
        record[invq[begin:edge], i] .= x1
        record[invq[edge+1:end], i] .= x2
    end
    return record
end
```

At one point it threw a singular-matrix error at me, but with a shorter loop the function returned the `record` matrix okay.

2 Likes

Actually, performance is also my critical point.

Thanks for your helpful response and support!

• Is there a way to avoid using the temporary vector, given that `b` should not be changed by `ldiv!`?
• I just want to confirm: pointers do not exist in Julia as they do in C, right? Or at least cannot be used here?

Thanks for your helpful response and support!
Actually, I am changing `b` rather than `b1` and `b2` separately, and the same goes for `A`. Moreover, in each iteration the value of `x` is needed to modify `b`. Is the below the only possible way, given that performance is important?

```julia
function docalc()
    A = sparse(
        [
            0.01 0 0 0 0 0 0 0 0 -0.001 0 0 0 0 0
            0 0.01 0 0 0 0 0 0 0 0 -0.001 0 0 0 0
            0 0 0.01 0 0 0 0 0 0 0 0 -0.001 0 0 0
            0 0 0 1.0 0 0 -1.0 0 0 0 0 0 1.0 0 0
            0 0 0 0 1.0 0 0 -1.0 0 0 0 0 0 1.0 0
            0 0 0 0 0 1.0 0 0 -1.0 0 0 0 0 0 1.0
            0 0 0 -1.0 0 0 1.003 -0.0007 -0.0007 0 0 0 0 0 0
            0 0 0 0 -1.0 0 -0.0007 1.003 -0.0007 0 0 0 0 0 0
            0 0 0 0 0 -1.0 -0.0007 -0.0007 1.00302 0 0 0 0 0 0
            -0.001 0 0 0 0 0 0 0 0 0.004 -0.0007 -0.0007 0 0 0
            0 -0.001 0 0 0 0 0 0 0 -0.0007 0.004 -0.0007 0 0 0
            0 0 -0.001 0 0 0 0 0 0 -0.0007 -0.0007 0.004 0 0 0
            0 0 0 1.0 0 0 0 0 0 0 0 0 0 0 0
            0 0 0 0 1.0 0 0 0 0 0 0 0 0 0 0
            0 0 0 0 0 1.0 0 0 0 0 0 0 0 0 0
        ],
    )
    L = size(A, 1)
    b = ones(L)
    x = zeros(L)
    p = [1, 2, 3, 11, 12, 10, 4, 5, 6, 7, 8, 9, 13, 14, 15]
    q = [1, 2, 3, 11, 12, 10, 13, 14, 15, 7, 8, 9, 4, 5, 6]
    invq = invperm(q)
    edge = 6
    b1 = @view( b[p[begin:edge]] );
    b2 = @view( b[p[edge+1:end]] );
    x1 = similar(b1)
    x2 = similar(b2)
    timeSim = 0:0.1:1
    # timeSim = 0:0.0001:1
    record = zeros(L, length(timeSim))
    for (i, t) in enumerate(timeSim)
        b .= b .* (1 .+ x) .+ t;
        nonzeros(A) .= nonzeros(A) .* (1 + t) .+ t
        ldiv!(x1, lu(A[p,q][begin:edge,begin:edge]), b1)
        ldiv!(x2, lu(A[p,q][edge+1:end, edge+1:end]), b2)
        x[invq[begin:edge]] .= x1
        x[invq[edge+1:end]] .= x2
        record[:, i] .= x
    end
    return record
end
```
1 Like

The suggestions from Dan are spot on. I'll suggest you examine them closely.

There is virtually no intrinsic benefit to doing this as a single operation. You could do it element-by-element as a loop and make it just as fast, too.
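Using the mapping from earlier in the thread, a minimal sketch of such a scatter loop (the `invq` destination indices here are made up for illustration):

```julia
x = zeros(4)
x_1 = [0.5, 1.5]
x_2 = [2.5, 3.5]
invq = [3, 1, 4, 2]  # destination index in x for each solved element

# Scatter element by element; for small sizes a plain loop is as fast
# as any fused broadcast.
for (i, xi) in enumerate(x_1)
    x[invq[i]] = xi
end
for (i, xi) in enumerate(x_2)
    x[invq[2 + i]] = xi
end
# x is now [x_1[2], x_2[2], x_1[1], x_2[1]]
```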

Meanwhile, there are many unnecessary allocations in your code. `lu(A[p,q][begin:edge,begin:edge])` alone uses two more than necessary (better as `lu!(A[@view(p[begin:edge]), @view(q[begin:edge])])`; note that views of range indices of the form `a:c` or `a:b:c` are usually efficient). Your singular quest for specific micro-optimizations is ignoring more glaring issues. Were you to benchmark both your and Dan's code, I think theirs would perform much better (but follow their suggestion to verify it matches your functionality).

• `b` not changing is not helpful here. `ldiv!(LU, b)` can overwrite `b` with the result and requires no other vector, period. `ldiv!(x, LU, b)` is very likely implemented as something like `ldiv!(LU, copy!(x, b))` anyway. In any case, if not everything is strided, then you won't be able to use the optimized linear-algebra libraries, and performance will suffer more than any gain you'd expect to recover.
• Pointers literally exist in Julia (as `Ptr`) but are rarely used (except for interfacing with C libraries). `Ref` is the Julia version of a pointer and is used sometimes, but I don't think it's of any use to you here.
```julia
julia> a = Ref(3)
Base.RefValue{Int64}(3)

julia> b = a
Base.RefValue{Int64}(3)

julia> a[] = 5
5

julia> b[]
5
```

All this said, the expensive part of this whole thing is going to be those `lu` calls, especially if `A` gets bigger. What you've written could be done more optimally (Dan's code is much closer), but ultimately I don't expect it to make a huge difference. Also, at the current size and sparsity of your MWE, I would imagine making `A` a sparse matrix actually results in worse performance than a plain `Matrix`.
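To illustrate the first bullet above, a minimal sketch where the two-argument `ldiv!` reuses `b` as both input and output (matrix values made up for illustration):

```julia
using LinearAlgebra

A = [2.0 0.0; 0.0 4.0]
b = [2.0, 4.0]

F = lu(A)
ldiv!(F, b)   # overwrites b with A \ b; no extra vector is needed
# b is now [1.0, 1.0]
```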

1 Like

Thank you very much for your explanation.

Thanks for this. I tried to make `a` refer to the first and third values of `s`. However, changing `a[][2]` does not change `s[3]`. It seems that `a = Ref([s[1],s[3]])` makes `a` a reference to a new array `[s[1],s[3]]`. Is there a way to create `a` as needed (i.e., referring to the first and third values of `s`)?

```julia
julia> s = [9,7,4]
3-element Vector{Int64}:
 9
 7
 4

julia> a = Ref([s[1],s[3]])
Base.RefValue{Vector{Int64}}([9, 4])

julia> a[][2] = 100
100

julia> s
3-element Vector{Int64}:
 9
 7
 4
```

`Ref` wraps a value. `[s[1],s[3]]` creates a new array, and `Ref` creates a reference to this new array (distinct from any of the old values). If you want to reference an index in an array, pass a second index argument to `Ref`, as in:

```julia
julia> x = [1,2,3]
3-element Vector{Int64}:
 1
 2
 3

julia> b = Ref(x, 3)
Base.RefArray{Int64, Vector{Int64}, Nothing}([1, 2, 3], 3, nothing)

julia> b[] = 8
8

julia> x
3-element Vector{Int64}:
 1
 2
 8
```

While this is technically different from a `view` of a single index, I don't think it differs in most of the ways that matter (except that a `view` creates a subarray out of a larger array, i.e., a view of a single index is a 0-dimensional array, whereas a `Ref` just wraps a value of any type and isn't really considered an array at all, except that it broadcasts like a 0-dim array).

So for your example `a = Ref(...)` above, you just want `a = view(s, [1,3])`. You could jerry-rig this as `a = [Ref(s,1), Ref(s,3)]`, but that is the same thing, just much less ergonomic to work with.
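For instance, the `view` version propagates writes straight back to `s` (a quick sketch):

```julia
s = [9, 7, 4]
a = view(s, [1, 3])

a[2] = 100       # writes through to s[3]; s is now [9, 7, 100]

a .= [500, 200]  # broadcast assignment also updates s in place
# s is now [500, 7, 200]
```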

The reason we almost never use "proper" pointers in Julia (and even `Ref` is only used in certain contexts) is that there are other constructs that allow us to accomplish the same things with similar efficiency. The same goes for other concepts and constructs used in other languages: Julia almost always has a way to accomplish the same goals with similar performance, although the solution may look rather different.

Iâm getting the sense that youâre somewhat new to Julia (please forgive me if Iâm mistaken!). Welcome! Iâll suggest that you spend some time writing Julia code without worrying quite so much about specific optimizations. The questions youâre asking here are very good, but some of them are challenging to answer well because it seems there are still some concepts and behaviors in Julia that you could benefit from understanding a little better. With every new language comes new syntax, new semantics, and new mental models of what does and doesnât work well. When something doesnât perform as well as you think it could/should, post the part of the code youâre working on and ask for advice (the better the MWE, the better people can help you) and youâll likely get multiple people making excellent suggestions. You can learn from their advice (common advice has already been collected in the Performance Tips section of the manual) and youâll quickly learn how to write efficient code from the start (and to predict when and where efficient code is actually important).

1 Like

Thank you very much for your explanation, which is really helpful.
My last question: I want to update the contents of `a` with a simple syntax (not `a[1][], a[2][] = [500,200]`). Could you please guide me on how to write it?

```julia
s = [9,7,4];
a = [Ref(s,1), Ref(s,3)];

julia> a[:][] = [100,200]
ERROR: BoundsError: attempt to access 2-element Vector{Base.RefArray{Int64, Vector{Int64}, Nothing}} at index []
```