Vectorization over subset of indices

erlebach · December 25, 2022, 4:39am

I would like to vectorize some operation such as
a .= b .+ c
but only over a subset of indices. For example, if they are all 4D arrays, I would like to vectorize only over dimensions 1 and 3. Is this possible, perhaps with views or other methods? Thanks!

ChrisRackauckas · December 25, 2022, 5:23am

@views @. a[:,:,:,i] = b[:,:,:,i] + c[:,:,:,i]
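
A minimal sketch of how a one-liner like this might sit inside a loop so that each broadcast covers dimensions 1 and 3 only. It assumes equally sized 4D arrays, with dimensions 2 and 4 iterated explicitly; the array sizes and the loop variables `j` and `l` are illustrative assumptions, not taken from the post.

```julia
# Illustrative sizes; any equally sized 4D arrays would do.
b = rand(2, 3, 4, 5)
c = rand(2, 3, 4, 5)
a = similar(b)  # preallocated output, written in place

# Loop over dimensions 2 and 4; broadcast over dimensions 1 and 3.
for l in axes(a, 4), j in axes(a, 2)
    @views @. a[:, j, :, l] = b[:, j, :, l] + c[:, j, :, l]
end
```

Note that `a` is preallocated and mutated here, so this form is an in-place operation.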

erlebach · December 25, 2022, 5:50am

Yes, but that means you preallocated “a” and are then mutating it, and Zygote will not work. But you did answer the question. I am also looking at TensorCast for ideas. I have a feeling that my problem can be solved at the cost of many allocations, which seems to be a consequence of non-mutation.

ChrisRackauckas · December 25, 2022, 5:53am

Look at using:

erlebach · December 25, 2022, 6:09am

Thanks! Will take a look.

Do you sleep?

ChrisRackauckas · December 25, 2022, 6:12am

No

erlebach · December 25, 2022, 5:45pm

Here is my solution, @ChrisRackauckas, using @tullio, which looks great! I could not find where the differentiation rules are defined in ChainRules, or perhaps they don’t have to be, given that only built-in derivatives are used?

using Zygote  # load before Tullio (see docs)
using Tullio

function test(N, degree)
    x = rand(N)
    y = rand(N)

    NT = NamedTuple{(:x,:y), Tuple{Int64, Int64}}
    I = Vector{NT}()
    for i in 0:degree
        for j in 0:degree
            if i + j < (degree+1)
                push!(I, (x=i, y=j))
            end
        end
    end

    coef = rand(length(I))

    Poly(coef) = @tullio poly[i] := coef[j] * x[i]^I[j].x * y[i]^I[j].y grad=Dual nograd=x nograd=y nograd=I
    return Poly, coef
end

N = 10_000
degree = 20
@time poly, coef = test(N, degree);
@time poly(coef)
loss = x -> sum(poly(x).^2)
@time Zygote.gradient(loss, coef)
println(length(coef))

For N=10_000 and degree 20 (231 terms in the polynomial), the gradient takes 0.02 sec on my MacBook Pro M1. Seems fast, but I do not have a sense of what is possible. The test function takes 0.0002 seconds. My own problem requires degree 4 and about 200 points, which gives a Zygote timing of 0.0001 seconds.
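
For cross-checking, here is a minimal plain-Julia sketch of the sum that the `@tullio` expression above appears to compute, together with the term count. The helper names `exponent_pairs` and `poly_reference` are hypothetical and are not part of Tullio or of the code above; the sketch has not been benchmarked and makes no claim about how well Zygote handles it.

```julia
# Hypothetical helper: all exponent pairs (i, j) with i + j <= degree,
# generated in the same order as the index vector `I` in `test`.
function exponent_pairs(degree)
    return [(x = i, y = j) for i in 0:degree for j in 0:degree if i + j <= degree]
end

# Hypothetical reference: poly[i] = sum_j coef[j] * x[i]^I[j].x * y[i]^I[j].y,
# written as a plain comprehension with no Tullio and no mutation.
function poly_reference(coef, x, y, I)
    return [sum(coef[j] * x[i]^I[j].x * y[i]^I[j].y for j in eachindex(coef))
            for i in eachindex(x)]
end

println(length(exponent_pairs(20)))  # prints 231
```

Comparing `poly_reference` against the Tullio version on a small input is one way to check that the index expression does what is intended.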

Yes, I know I should use the benchmarking tools; however, @time, when run 3-5 times in a row, is more than sufficient to provide an estimate, since I am interested only in orders of magnitude.
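
A minimal sketch of that "run it a few times" approach using only Base Julia. The helper name `best_of` and the workload being timed are hypothetical stand-ins; the idea is to keep the fastest of several runs so that compilation on the first call does not dominate.

```julia
# Hypothetical helper: run `f` several times and keep the fastest run,
# so that the compilation cost of the first call does not dominate.
function best_of(f, runs = 5)
    return minimum(@elapsed(f()) for _ in 1:runs)
end

# Stand-in workload; replace with the call of interest.
t = best_of(() -> sum(abs2, rand(10_000)))
println("fastest of 5 runs: ", t, " s")
```

For anything beyond an order-of-magnitude estimate, `@btime` from BenchmarkTools is the more careful tool.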

Cheers,

ChrisRackauckas · December 25, 2022, 5:46pm

Rules are already defined for Tullio:

github.com/mcabbott/Tullio.jl/blob/master/src/grad/zygote.jl

using .Zygote

Zygote.@adjoint function (ev::Eval)(args...)
    Z = ev.fwd(args...)
    Z, Δ -> begin
        isnothing(ev.rev) && error("no gradient definition here!")
        tuple(nothing, ev.rev(Δ, Z, args...)...)
    end
end

Tullio.promote_storage(::Type{T}, ::Type{F}) where {T, F<:Zygote.Fill} = T
Tullio.promote_storage(::Type{F}, ::Type{T}) where {T, F<:Zygote.Fill} = T

erlebach · December 25, 2022, 7:30pm

After all my changes, I still get the mutation error when taking derivatives with Zygote.
More precisely,

function main(tstate::Lux.Training.TrainState, vjp::Lux.Training.AbstractVJP, data::Tuple, epochs::Int)
    #data = data .|> gpu
    # no batches? 
    for epoch in 1:epochs
        # Try computing gradient of loss
        println(data)
        grads, loss, stats, tstate = Lux.Training.compute_gradients(vjp, loss_function, data, tstate)
        @info epoch=epoch loss=loss
        tstate = Lux.Training.apply_gradients(tstate, grads)
    end
    return tstate
end

I get a mutation error when calling Lux.Training.compute_gradients(vjp, loss_function, data, tstate).
So the question I have is: would you expect this to fail or succeed similarly to a direct call to Zygote.gradient(...)? I ask because the latter call works. I will spend this afternoon on this.

Cheers,
