How does broadcast! work ? Tips for memory efficiency while using 'kron'

amitjamadagni · May 17, 2018, 2:33pm

I was trying to use broadcast! in place of broadcast in the following code:

function _tensor_product_(st)
    ten = st[1]
    for s in st[2:end]
        ten = kron(ten, s)
    end
    return ten
end


@noinline function indi(arr, ind1, ind2, no_of_e)
    linit = [speye(2) for e=1:no_of_e]
    op = spzeros(2^no_of_e, 2^no_of_e)
    op1 = spzeros(2^no_of_e, 2^no_of_e)
    for elem in arr[ind1:ind2]
        println(" elem ", elem)
        for e in elem
            linit[e] = SparseMatrixCSC([0. 1.;1. 0.])
        end
        tp = _tensor_product_(linit)
        op += tp
        # op1 = broadcast(+, op1, tp)
        broadcast!(+, op1, op1, tp)
        # resetting the configuration for next iteration
        for i=1:length(linit)
            linit[i] = speye(2)
        end
        println(op1 == op)
        gc()
    end
    return op
end

r = [[1, 3, 6, 4], [2, 4, 7, 5], [6, 8, 1, 9], [7, 9, 2, 10]]
indi(r, 1, 4, 14)

broadcast and broadcast! do not behave the same way here: the former works as I expect, but the latter, which I was trying to use to do the update in place, does not reproduce the former's result.

Also, with the larger input below, memory usage shoots up. One way to tackle this would be to use mmap and write the result to disk as the tensor product is being computed. How can mmap be used in conjunction with sparse matrices, and are there general methods to make the above routine more memory efficient?

# a = [[1, 3, 6, 4], [2, 4, 7, 5], [6, 8, 11, 9], [7, 9, 12, 10], [11, 13, 16, 14], [12, 14, 17, 15], [16, 18, 1, 19], [17, 19, 2, 20]]
# indi(a, 1, 4, 28)

Thanks !
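
For a rough sense of scale on the memory question, here is a back-of-the-envelope sketch. It assumes the default SparseMatrixCSC{Float64,Int64} layout on a 64-bit machine (8 bytes per column pointer, 8 bytes per row index, 8 bytes per stored value) and uses the fact that a Kronecker product of 2x2 identity and [0 1; 1 0] factors is a permutation matrix, so it stores exactly one entry per column. The helper name csc_bytes is hypothetical, not a library function.

```julia
# Approximate storage of an N-by-N SparseMatrixCSC{Float64,Int64}
# holding `stored` entries (hypothetical helper for illustration only).
# colptr: (N + 1) * 8 bytes; rowval + nzval: 16 bytes per stored entry.
csc_bytes(N, stored) = 8 * (N + 1) + 16 * stored

for n in (14, 28)
    N = 2^n
    # One tensor product of identity / X factors has N stored entries.
    println("n = ", n, ": ", csc_bytes(N, N) / 2^20, " MiB per tensor product")
end
```

Under these assumptions a single tensor product takes well under 1 MiB for 14 sites but roughly 6 GiB for 28 sites, and the routine keeps tp, op and op1 alive at the same time, with the accumulated sums holding up to one stored entry per column for each term added.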

kristoffer.carlsson · May 17, 2018, 2:55pm

Probably https://github.com/JuliaLang/julia/issues/21693. Should be fixed on master by https://github.com/JuliaLang/julia/pull/25890.
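
For reference, the behavior the question expects can be sketched with small dense arrays, where writing into a destination that is also an input gives the same result as the allocating call. This is only an illustration of the intended contract of broadcast!, not a reproduction of the sparse case from the post; the array values are made up.

```julia
A = [1.0 2.0; 3.0 4.0]
B = ones(2, 2)

# Allocating version: returns a new array and leaves A untouched.
C = broadcast(+, A, B)

# In-place version: writes the result into the first array argument.
# Here the destination A is also an input, as in broadcast!(+, op1, op1, tp).
broadcast!(+, A, A, B)

println(A == C)
```

In the post, the non-aliased form op1 = broadcast(+, op1, tp) is the one reported to match op, so it is the safer choice on versions without the fix, at the cost of allocating a new matrix on each iteration.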
