Zygote/Flux not computing gradient with view/slice of tensor properly for MLP

outlace · December 4, 2021, 3:47pm

Well, there is still more fiddling to do that might improve its performance. For example, I wonder whether computing the gradients manually would be faster, since Zygote has a hard time with the matrix subsampling. I am also primarily interested in implementing SLIDE for computer vision applications, so that I can do training and inference on a CPU. I was planning to implement one of the new MLP-based vision networks, such as MLP-Mixer ( https://papers.nips.cc/paper/2021/file/cba0a4ee5ccd02fda0fe3f9a3e7b89fe-Paper.pdf ), using SLIDE to see if that would work.
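To make the "manual gradients" idea concrete, here is a rough sketch of a hand-written gradient for a single linear layer restricted to a subset of active rows. It is only an illustration and not the code from my repo: the function name `sub_layer_grad`, the index vector `idx`, and the squared-error loss are all assumptions chosen to keep the example small.

```julia
# Hand-written forward pass and gradient for a linear layer restricted to
# the active rows `idx`. All names are hypothetical; a squared-error loss
# L = 0.5 * sum((y - target).^2) is assumed purely for illustration.
function sub_layer_grad(W, b, x, target, idx)
    Wa = view(W, idx, :)            # active rows only, no copy
    y = Wa * x .+ view(b, idx)      # outputs of the active neurons
    delta = y .- target             # dL/dy for the assumed loss
    loss = 0.5 * sum(abs2, delta)
    dW = delta * x'                 # gradient for the active rows of W
    db = delta                      # gradient for the active entries of b
    return loss, dW, db
end
```

The returned `dW` has only `length(idx)` rows, so an update would touch just `W[idx, :]` and `b[idx]` rather than the full parameter arrays. Whether this is actually faster than Zygote would have to be benchmarked.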

I uploaded my experiments to this GitHub repo: outlace/SLIDE-Pose-Estimation

So I was able to get Zygote to compute gradients, but Flux's built-in optimizer doesn't seem to work with my SLIDE implementation, so I had to write an ADAM optimizer from scratch. It is in the optim.jl file.
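For anyone who wants to try something similar, a from-scratch ADAM update in plain Julia can be sketched roughly as follows. This is a minimal dense version under the standard ADAM update rule, not a copy of optim.jl: the names `AdamState` and `adam_step!` and the default hyperparameters are assumptions for illustration.

```julia
# Minimal ADAM sketch in plain Julia (no Flux). Names are hypothetical.
mutable struct AdamState
    m::Matrix{Float64}   # running first-moment (mean) estimate
    v::Matrix{Float64}   # running second-moment estimate
    t::Int               # number of steps taken so far
end

# Start with zeroed moment estimates shaped like the parameter matrix.
AdamState(W::Matrix{Float64}) = AdamState(zero(W), zero(W), 0)

# Update W in place from gradient g, using the state s.
function adam_step!(W, g, s::AdamState;
                    lr=1e-3, beta1=0.9, beta2=0.999, epsilon=1e-8)
    s.t += 1
    @. s.m = beta1 * s.m + (1 - beta1) * g
    @. s.v = beta2 * s.v + (1 - beta2) * g^2
    mhat = s.m ./ (1 - beta1^s.t)    # bias-corrected first moment
    vhat = s.v ./ (1 - beta2^s.t)    # bias-corrected second moment
    @. W -= lr * mhat / (sqrt(vhat) + epsilon)
    return W
end
```

Usage would be one `AdamState` per parameter array, calling `adam_step!(W, g, state)` after each gradient computation. A SLIDE-style variant would presumably apply the same update only to the rows that were active in the forward pass, which this dense sketch does not do.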
