I am trying to understand how to use Enzyme. I run the following code:
using Enzyme
abstract type Layer end
mutable struct parameter
    p::Array
end
mutable struct Final<:Layer
    A::parameter
end
v=Float32.(randn((5,5,5,5)))
input=Float32.(randn((5,5,5,5)))
para=parameter(v)
FL1=Final(para)
function Valuation1(z,Ls1::Array{Float32})
    Ls1[1]=sum(z.A.p.*input)
    Ls=Ls1[1]
    return Ls
end
FL2=deepcopy(FL1)
dzdp=deepcopy(FL1)
dzdp.A.p.=Float32(0)
Enzyme.autodiff(Reverse,Valuation1,Active,Duplicated(FL2,dzdp),Duplicated(Vector{Float32}(undef,1),Vector{Float32}(undef,1)))
println("dzdp",dzdp.A.p[1,1,1,1])
function Valuation2(z,Ls1::Array{Float32})
    return sum(z.A.p.*input)
end
FL2=deepcopy(FL1)
dzdp=deepcopy(FL1)
dzdp.A.p.=Float32(0)
Enzyme.autodiff(Reverse,Valuation2,Active,Duplicated(FL2,dzdp),Duplicated(Vector{Float32}(undef,1),Vector{Float32}(undef,1)))
println("dzdp",dzdp.A.p[1,1,1,1])
The first println gives 0. The second one does not. Why does the first println give 0?
Thank you
Julia Version 1.9.2
Commit e4ee485e909 (2023-07-05 09:39 UTC)
Platform Info:
OS: macOS (arm64-apple-darwin22.4.0)
CPU: 12 × Apple M2 Max
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-14.0.6 (ORCJIT, apple-m1)
Threads: 1 on 8 virtual cores
Glad you were able to make it work! As a side note, your structs use abstract types for their fields (e.g. Array instead of Array{T,N} with parametric T and N), which is a common performance pitfall. It may not be relevant for your use case, but if you go further, do check out the “Performance Tips” section of the Julia manual to learn more!
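To make the side note about field types concrete, here is a minimal sketch of how the structs could be parameterized. The names `Parameter` and `FinalLayer` are made up for illustration and are not from the original post; the only point is that every field ends up with a concrete type.

```julia
abstract type Layer end

# Hypothetical replacement for `parameter`: the element type T and the
# dimensionality N become type parameters, so the field type is concrete.
mutable struct Parameter{T,N}
    p::Array{T,N}
end

# Hypothetical replacement for `Final`: parameterized on the concrete
# Parameter type it wraps.
mutable struct FinalLayer{P<:Parameter} <: Layer
    A::P
end

v = randn(Float32, 5, 5, 5, 5)
layer = FinalLayer(Parameter(v))

# Check that the compiler sees concrete field types.
println(isconcretetype(fieldtype(typeof(layer), :A)))
println(isconcretetype(fieldtype(typeof(layer.A), :p)))
```

By contrast, `fieldtype(parameter, :p)` in the original code is the abstract `Array`, so every access to `z.A.p` has to be resolved at run time.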
After setting the shadow buffers to zero, I still get 0 as the answer. I submitted an issue on GitHub. I hope I made it concise enough: I don’t have much experience with Julia or with submitting GitHub issues.
If I understand you correctly, by passing the temporary array as an input and marking it Duplicated, Enzyme now treats it as another output of the model. I would have to reset the temporary array to some arbitrary number at the end of the calculation to cut this link, but it is probably nicer to set the shadow buffer to 0.
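For reference, this is roughly what zeroing the shadow buffer looks like. It is only a sketch: it assumes Enzyme.jl is installed, uses the same `Enzyme.autodiff(Reverse, ...)` call form as the code above, and the names `Parameter`, `FinalLayer`, `valuation`, `tmp` and `dtmp` are invented for the example. No expected output is given, since on the setup reported above the result was still 0 after zeroing.

```julia
using Enzyme  # assumption: the Enzyme.jl package is available

mutable struct Parameter
    p::Array{Float32,4}
end
mutable struct FinalLayer
    A::Parameter
end

const input = randn(Float32, 5, 5, 5, 5)

# Same shape as Valuation1: the result goes through a scratch array first.
function valuation(z, tmp::Vector{Float32})
    tmp[1] = sum(z.A.p .* input)
    return tmp[1]
end

layer  = FinalLayer(Parameter(randn(Float32, 5, 5, 5, 5)))
dlayer = deepcopy(layer)
fill!(dlayer.A.p, 0f0)     # zero the parameter shadow

tmp  = zeros(Float32, 1)   # primal scratch buffer
dtmp = zeros(Float32, 1)   # shadow buffer, zeroed instead of `undef`

Enzyme.autodiff(Reverse, valuation, Active,
                Duplicated(layer, dlayer), Duplicated(tmp, dtmp))
println("dzdp ", dlayer.A.p[1, 1, 1, 1])
```

The difference from the original call is only that `dtmp` starts at zero, so no uninitialized memory can leak into the accumulated gradient through the scratch array.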