Greetings.
I have migrated most of my Machine Learning work from PyTorch to Flux and have not felt the need to look back (especially as I move towards SciML). However, there was a small hiccup I'd like to inquire about:
Spectral normalization (SN) is a technique employed in Machine Learning, specifically in Generative Adversarial Networks, to stabilize the training of the discriminator. As an exercise, I tried to write my own Dense layer with spectral normalization. However, I find myself unable to train the network because my implementation of the spectral norm does not appear to be differentiable with Flux's (Zygote's) gradient function, while ForwardDiff's gradient handles it without issue.
Below is a short snippet that illustrates the problem:
using Flux: gradient
using GenericLinearAlgebra: svdvals
function snorm(X)
    return svdvals(X)[1]
end
X = rand(3, 5)  # Generate a random matrix to analyze.
println(snorm(X))  # This executes without issue.
println(gradient(snorm, X))  # This throws a "Can't differentiate foreigncall expression" error.
The stacktrace is as follows:
ERROR: Can't differentiate foreigncall expression
Stacktrace:
[1] error(s::String)
@ Base ./error.jl:33
[2] Pullback
@ /usr/share/julia/stdlib/v1.7/LinearAlgebra/src/lapack.jl:1667 [inlined]
[3] (::typeof(∂(gesdd!)))(Δ::Tuple{Nothing, Zygote.OneElement{Float64, 1, Tuple{Int64}, Tuple{Base.OneTo{Int64}}}, Nothing})
@ Zygote ~/.julia/packages/Zygote/FPUm3/src/compiler/interface2.jl:0
[4] Pullback
@ /usr/share/julia/stdlib/v1.7/LinearAlgebra/src/svd.jl:211 [inlined]
[5] (::typeof(∂(svdvals!)))(Δ::Zygote.OneElement{Float64, 1, Tuple{Int64}, Tuple{Base.OneTo{Int64}}})
@ Zygote ~/.julia/packages/Zygote/FPUm3/src/compiler/interface2.jl:0
[6] Pullback
@ /usr/share/julia/stdlib/v1.7/LinearAlgebra/src/svd.jl:238 [inlined]
[7] (::typeof(∂(svdvals)))(Δ::Zygote.OneElement{Float64, 1, Tuple{Int64}, Tuple{Base.OneTo{Int64}}})
@ Zygote ~/.julia/packages/Zygote/FPUm3/src/compiler/interface2.jl:0
[8] Pullback
@ ./REPL[10]:2 [inlined]
[9] (::Zygote.var"#57#58"{typeof(∂(snorm))})(Δ::Float64)
@ Zygote ~/.julia/packages/Zygote/FPUm3/src/compiler/interface.jl:41
[10] gradient(f::Function, args::Matrix{Float64})
@ Zygote ~/.julia/packages/Zygote/FPUm3/src/compiler/interface.jl:76
[11] top-level scope
@ REPL[13]:1
[12] top-level scope
@ ~/.julia/packages/CUDA/bki2w/src/initialization.jl:52
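If I am reading the trace correctly, the failure comes from the `gesdd!` LAPACK foreigncall inside `svdvals`, which Zygote cannot differentiate through. My understanding (possibly wrong) is that Zygote does ship an adjoint for `svd` itself, so a variant that reads the largest singular value off `svd` might sidestep the problem; an untested sketch:

```julia
using LinearAlgebra: svd

# Untested idea: route through svd, whose S field holds the singular
# values in descending order, instead of calling svdvals directly.
snorm_via_svd(X) = first(svd(X).S)
```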
However, if the exact same steps are taken using the gradient from ForwardDiff instead, there is no error reported.
using ForwardDiff: gradient
using GenericLinearAlgebra: svdvals
function snorm(X)
    return svdvals(X)[1]
end
X = rand(3, 5)  # Generate a random matrix to analyze.
println(snorm(X))  # This executes without issue.
println(gradient(snorm, X))  # No "Can't differentiate foreigncall expression" error this time.
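For the layer itself, my understanding of the original SN paper (Miyato et al.) is that the spectral norm is estimated with a few power-iteration steps rather than a full SVD, which needs only matrix-vector products and so should be friendlier to Zygote. A rough sketch of what I have in mind (the name `snorm_power` and the fixed iteration count are my own choices, and in a real layer the iteration vector would be kept as a buffer outside the gradient):

```julia
using LinearAlgebra: norm

# Power iteration: v converges to the dominant right singular vector
# of W, and norm(W * v) then approximates the largest singular value.
function snorm_power(W; iters = 10)
    v = randn(size(W, 2))
    for _ in 1:iters
        v = W' * (W * v)   # one step of power iteration on W'W
        v /= norm(v)       # renormalize to avoid over/underflow
    end
    return norm(W * v)
end
```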
My research machine runs Arch Linux with Julia installed from a -bin package on the AUR, and otherwise works flawlessly. The error is reproduced on my personal machine, which runs macOS. Both Julia installations are up to date, and the packages were also updated yesterday.
Any help, whether pointing out a mistake on my part or suggesting an alternative implementation that would work, is appreciated. This is my first post here, so please let me know if I have done anything out of line.