 # Sparse Feed Forward NN

Hi! I have a question about (hypothetical) sparse layers in feedforward neural networks:

How would I go about using sparse arrays in place of W(eights) and b(iases) in Flux's Dense layer? What I imagine is that I have sparse inputs and sparse layers, and that I could obtain sparse gradients (either including gradients for zero weights or not) via backpropagation in Julia. I've tried the following (Julia 1.3.1, Flux 0.10.0):

```julia
using Flux
using SparseArrays

struct SprAffine{F,S,T}
    W::S
    b::T
    σ::F
end

SprAffine(in::Integer, out::Integer, σ::Function) =
    SprAffine(sprandn(out, in, 0.5), sprandn(out, 0.5), σ)

(m::SprAffine)(x) = m.σ.(m.W * x .+ m.b)

Flux.@functor SprAffine

m = Chain(SprAffine(4, 3, σ), SprAffine(3, 2, σ), softmax);
loss(x, y) = crossentropy(m(x), y)
```

`loss(sparse(1:4), sparse(1:2))` computes (and returns a `Float64`), but taking the gradient with Zygote gives me an error: `MethodError: no method matching zero(::Type{Tuple{Float64,Zygote.var"#916#back#380"{Zygote.var"#378#379"{Float64}}}})`.
Having downgraded to Flux v0.9.0, which uses Tracker, I do get technically sparse gradients. They fill up with explicit (stored) zeros once tracking starts, but I can call `dropzeros!` by hand on all of them, and they seem to remain sparse (i.e. without explicit zeros) after calling `train!`. Does Tracker really handle sparse weights differently than Zygote, or am I just making some mistake?
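For context, here is a minimal sketch (plain `SparseArrays`, no Flux) of the explicit-zeros issue I mean: a sparse vector can carry stored entries whose value is `0.0`, which `nnz` still counts, and `dropzeros!` strips them in place. The vector below is just an illustrative example, not my actual gradient.

```julia
using SparseArrays

# A sparse vector with three stored entries, one of which is an
# explicit zero -- the kind of entry sparse gradients accumulate.
v = sparsevec([1, 2, 4], [1.0, 0.0, 3.0], 5)
println(nnz(v))   # 3: nnz counts stored entries, zeros included

dropzeros!(v)     # remove explicitly stored zeros in place
println(nnz(v))   # 2: only the truly nonzero entries remain
```

So by "technically sparse" above I mean the gradients stay `SparseMatrixCSC`/`SparseVector`, but without `dropzeros!` their stored pattern can grow dense over time.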
Any explanations or directions would be appreciated! 