Inconsistent behaviour using Zygote.Buffer

I’m working on making a structural analysis code base work smoothly with Automatic Differentiation, specifically Zygote for its generally friendly interface. In short, a major performance bottleneck is the assembly of the global stiffness matrix K for an eventual linear solve to determine structural deformation.

To make use of constant-once-first-assembled sparsity pattern of the matrix, i’m trying to make each iteration simply generate a new non-zero value vector K.nzval, after which an updated sparse matrix can be more efficiently generated by reusing the original K.colptr, K.rowval, etc.

I’m making use of Zygote Buffers to make array mutation possible in the creation of the new nzval vector. However, I’m finding that initiating a new buffer has inconsistent values for Buffer.data, which affects the final outcome. Creating a buffer from a zero’d vector does not have all zeros for Buffer.data; generating a buffer from the original nzval vector also does not maintain the same values.

Am I misusing these buffers in some way? Or are there any suggestions to efficiently reassemble large sparse matrices for automatic differentiation?

Here’s a minimum working example with nonsense values, but the principle is the same:

using SparseArrays, Zygote

#initial sparse matrix
M = sprand(5000, 5000, .01)

#reuse the same sparsity pattern
m, n = size(M)
colp = M.colptr
rowv = M.rowval
nzv = M.nzval

#dummy indices that point to positions in S.nzval that need updating
n_updates = 1000
dim_update = 6
nzindices = [rand(1:length(nzv), dim_update) for _ in 1:n_updates]
updatevals = [rand(dim_update) for _ in 1:n_updates]

#without buffers (CANNOT BE AUTO-DIFFERENTIATED BUT CONSISTENT)
nznew = zero(nzv)
for (index, value) in zip(nzindices, updatevals)
    nznew[index] .+= value
end
newS_nobuffer = SparseMatrixCSC(m, n, colp, rowv, nznew)

#with buffers (CAN BE AD'd, BUT INCONSISTENT)
nzbuff = Zygote.Buffer(zero(nzv))
for (index, value) in zip(nzindices, updatevals)
    nzbuff[index] .+= value
end
newS_buffer = SparseMatrixCSC(m, n, colp, rowv, copy(nzbuff))

#compare the two
norm(newS_nobuffer .- newS_buffer)

# this will give a different value each time
for _ = 1:100
    nzbuff = Zygote.Buffer(zero(nzv))
    for (index, value) in zip(nzindices, updatevals)
        nzbuff[index] .+= value
    end

    newS_buffer = SparseMatrixCSC(m, n, colp, rowv, copy(nzbuff))

    norm(newS_nobuffer .- newS_buffer)
end

FYI,

I decided to approach this the “right” way and just wrote proper adjoint rules to bypass the buffers. However, I’m still curious as to its inconsistent behaviour!

Zygote.Buffer is like similar, it allocates a new uninitialized array with the same type and dimensions as the argument. Once you create the Buffer, all entries must be initialized explicitly, it does not copy data from the argument.