Is there a way to teach Zygote to derive Diagonal * Vector more efficiently?

marius311 · August 29, 2019, 12:22am

I’m still getting my head around reverse-mode diff, but is there a way to teach Zygote to do the derivative of Diagonal * Vector without allocating N^2 memory?

using LinearAlgebra, BenchmarkTools, Zygote

v = rand(4096)
D = Diagonal(v)

@btime gradient(α -> norm((α * D) * v), 1)
# 53.308 ms (32915 allocations: 129.41 MiB)

The 129.41 MiB is basically the size of v*v' (4096^2 Float64 values, i.e. 128 MiB), which appears to get computed into a dense matrix in one of the adjoints. However, if I rewrite the exact same operation slightly differently I can get:

@btime gradient(α -> norm((α * D).diag .* v), 1)
# 871.463 μs (32919 allocations: 1.41 MiB)

So in theory it appears possible. I tried adding something inspired by the adjoint rule for Vector .* Vector,

@adjoint *(x::Diagonal, y::Vector) = x.diag .* y,
  z̄ -> (unbroadcast(x, z̄ .* conj.(y)), unbroadcast(y, z̄ .* conj.(x)))

but this does not work (yields wrong answer, and memory consumption is the same).

Does anyone have any suggestions on whether this is possible (it seems it must be), and if so, how to do it? Many thanks.
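
To make the memory numbers concrete: for z = A * v, the generic matrix-vector adjoint with respect to A is the outer product z̄ * v', which is N×N, whereas a Diagonal only ever needs the N diagonal entries z̄ .* v. The sketch below uses only LinearAlgebra and a random stand-in `z̄` for the incoming cotangent; that Zygote takes the dense route internally is an assumption based on the allocation figure above, not something checked against its source.

```julia
using LinearAlgebra

N = 4096
v = rand(N)
z̄ = rand(N)                  # stand-in for the incoming cotangent (assumption)

G = z̄ * v'                   # dense outer product, N×N Matrix{Float64}
g = z̄ .* v                   # only the diagonal entries, length-N Vector

dense_bytes = sizeof(G)      # N^2 * 8 bytes
diag_bytes  = sizeof(g)      # N * 8 bytes

println(dense_bytes / 2^20, " MiB vs ", diag_bytes / 2^10, " KiB")

# The diagonal of the dense gradient is all a Diagonal can use anyway.
@assert diag(G) ≈ g
```

So the cheap version is not an approximation: it is the same gradient restricted to the entries a Diagonal actually stores.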

tkf · August 29, 2019, 12:48am

That’s probably because z̄ .* conj.(x) creates a matrix? This seems to work:

using Zygote: @adjoint, unbroadcast  # unbroadcast is a Zygote internal, so import it explicitly

@adjoint *(x::Diagonal, y::Vector) = x.diag .* y,
    z̄ -> (Diagonal(unbroadcast(x.diag, z̄ .* conj.(y))), unbroadcast(y, z̄ .* conj.(x.diag)))
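
A quick way to sanity-check these two pullback formulas without involving Zygote is to write them as a plain function and compare against the dense reference. This is only a sketch: `diag_mul_pullback` is a name made up here, and the `unbroadcast` calls are dropped on the assumption that the shapes already match, in which case they should not change anything.

```julia
using LinearAlgebra

# Hypothetical helper (not part of Zygote): forward value and pullback for z = D * y.
function diag_mul_pullback(D::Diagonal, y::AbstractVector)
    z = D.diag .* y
    # Cotangents: one for D (kept as a Diagonal) and one for y, both O(N) in memory.
    back(z̄) = (Diagonal(z̄ .* conj.(y)), z̄ .* conj.(D.diag))
    return z, back
end

D = Diagonal(rand(5))
y = rand(5)
z̄ = rand(5)                       # stand-in cotangent

z, back = diag_mul_pullback(D, y)
D̄, ȳ = back(z̄)

# Dense reference: z̄ * y' for the matrix argument, D' * z̄ for the vector argument.
@assert z ≈ D * y
@assert diag(D̄) ≈ diag(z̄ * y')
@assert ȳ ≈ D' * z̄
```

The dense outer product is only formed here for the comparison, on a 5-element example.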

marius311 · August 29, 2019, 1:04am

Ah, I had messed it up a bit, your solution makes sense, thanks!

Tamas_Papp · August 29, 2019, 6:02am

Please consider contributing this to Zygote.jl.

tkf · August 29, 2019, 6:07am

See:
https://github.com/FluxML/Zygote.jl/issues/316
