CatViews with Flux optimize!

lineycroc · January 17, 2022, 3:09pm

Flux’s update!(opt, x, x̄) function errors when x and x̄ are of type CatView with the following message:
TypeError: in typeassert, expected Tuple{CatView{1, Float64}, CatView{1, Float64}, Vector{Float64}}, got a value of type Tuple{Vector{Float64}, Vector{Float64}, Vector{Float64}}

Using the debugger, I think the problem stems from the x̄r = ArrayInterface.restructure(x, x̄) call at the beginning of update!, which re-types x̄ to be a Vector so its type no longer matches the CatView type of x.
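As a Base-only illustration of the kind of type change described above (a sketch, not taken from Flux or ArrayInterface): allocation helpers such as `similar` and `zero` generally return a plain `Array` when given a wrapper array. A `SubArray` is used here as a stand-in for a `CatView`, which is an assumption made so the snippet needs no packages.

```julia
# Stand-in for a CatView: a SubArray is also a wrapper around other storage.
data = [1.0, 2.0, 3.0]
v = view(data, 1:2)

# Both helpers allocate fresh storage and drop the wrapper type.
s = similar(v, Float64)
z = zero(v)

println(typeof(v))   # a SubArray type
println(typeof(s))   # a plain Vector{Float64}
println(typeof(z))   # a plain Vector{Float64}

# A type assertion that expects the wrapper type would therefore fail.
println(z isa typeof(v))   # false
```

If a library asserts that such a freshly allocated array has the same type as the original input, a mismatch of the form shown in the error message is what one would expect to see.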

Is there a way around this error that still lets me pass CatView inputs to update!? I could (as shown in the code below) copy x to a plain Vector and then copy the results back into x afterward, but that is ugly code and a (small, but annoying) waste of memory and compute. Maybe there is a way to overwrite the restructure function so that it does nothing? That sounds like bad practice too, though…

MWE:

import Flux # adam optimizer
using CatViews

x = [randn(2)]
dx = [randn(2)]
opt = Flux.ADAM()

# create CatViews of the variable and gradient 
xCV = CatView([@view x[k][:] for k=1:length(x)]...)
dxCV = CatView([@view dx[k][:] for k=1:length(x)]...)

tmp = copy(xCV) # tmp is a Vector while xCV is a CatView
Flux.Optimise.update!(opt, tmp, dxCV) # this works 

Flux.Optimise.update!(opt, xCV, dxCV) # this does not work

In terms of why I want to use CatView input: the input variables are actually a collection of OffsetArrays and the user can decide if they want to descend with respect to the OffsetArray values and/or other tuning parameters. Using CatViews allows me to update everything in place while letting Flux see the variables as a simple vector where the true structure is much more complicated.

Tomas_Pevny · January 17, 2022, 5:48pm

Can't you update the arrays inside the CatView directly?
I think updating through a CatView would perform very poorly, since a CatView computes the index into the wrapped arrays on every access. For a nice API, you can just overload update!.

lineycroc · January 18, 2022, 4:12pm

I’m not sure if this is what you meant, but
[Flux.Optimise.update!(opt, x[i], dx[i]) for i=1:length(x)]
works (Flux.Optimise.update!(opt, x, dx) does not).

I’ll have to think more about whether that or overloading update! makes more sense long-term. Thanks for the ideas!

Tomas_Pevny · January 18, 2022, 5:04pm

This is sort of what I meant. If you look at how getindex is implemented in CatViews, you will see that going through the CatView would be very wasteful.
I would overload update!, and you do not need to build an array of results; foreach is enough:
foreach(i -> Flux.Optimise.update!(opt, x[i], dx[i]), 1:length(x))
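A minimal, package-free sketch of the per-array pattern suggested here. The function `sgd_step!` is hypothetical and only stands in for `Flux.Optimise.update!`; the point is that each underlying array is updated in place, and `foreach` avoids allocating a result array the way a comprehension would.

```julia
# Hypothetical in-place update rule, standing in for Flux.Optimise.update!.
function sgd_step!(x::AbstractVector, dx::AbstractVector; eta = 0.5)
    x .-= eta .* dx   # broadcasted in-place update, no new array allocated
    return x
end

# A collection of parameter arrays and their matching gradients.
xs  = [[1.0, 2.0], [3.0]]
dxs = [[2.0, 2.0], [2.0]]

# foreach walks both collections pairwise and discards the return values.
foreach(sgd_step!, xs, dxs)

println(xs)   # each inner array has been modified in place
```

Any view built on top of `xs` would see the updated values afterward, because the underlying storage itself was modified.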

darsnack · January 19, 2022, 3:35pm

The method in question is the default for dealing with non-standard array types. I think overloading would be appropriate here:

function Flux.Optimise.update!(opt, x::CatView, dx::CatView)
  # Assumes x[i] and dx[i] are arrays that update! already accepts.
  foreach(i -> Flux.Optimise.update!(opt, x[i], dx[i]), 1:length(x))
  return x
end
