Flux bilinear upsampling

Hi, I can't help with your problem directly, but maybe we can find another solution by separating the CPU and GPU code. I have translated PyTorch's implementation of bilinear upsampling to Julia, and it seems to work more or less, even with fractional upsampling factors. I just don't have experience with Zygote and adjoints. You can have a look at the implementation here and try it out with the following code. The forward pass looks good, but I haven't checked the backward pass.

using FileIO
using ImageView  # ZZZzzzz
using Colors
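using BenchmarkTools
using CUDA  # provides CuArray (CuArrays.jl on older setups)
# upsample_bilinear comes from the implementation linked above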
f = download("https://upload.wikimedia.org/wikipedia/en/e/ed/Nyan_cat_250px_frame.PNG")
nyan = load(f)

imshow(nyan)

nyan_nchw = reshape(reinterpret(UInt8, nyan),1,3,250,250)
nyan_whcn = permutedims(nyan_nchw, [3,4,2,1]) .|> Float32
nyan_gpu = CuArray(nyan_whcn)

nyan_large = upsample_bilinear(nyan_gpu, pi, pi)

nyan_large_cpu = Array(nyan_large)[:,:,:,1]
nyan_colored = view(reinterpret(RGB{Float32}, permutedims(nyan_large_cpu, [3,1,2])),1,:,:)  # what a mess

imshow(nyan_colored/255)

# or plain arrays:
n = 64
s = 4
checkerboard = zeros(Float32, n, n)
for x in 1:2s:n-s
    for y in 1:2s:n-s
        checkerboard[y:y+s-1, x:x+s-1] .= 1
    end
end
imshow(checkerboard)

checkerboard_gpu = CuArray(reshape(checkerboard,n,n,1,1))

res = Array(upsample_bilinear(checkerboard_gpu, 2, 2))

imshow(res[:,:,1,1])
imshow(round.(res[:,:,1,1]))

Maybe you can compare this to your results and have a look at the backward pass.
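For such a comparison, a small CPU reference might help. The following is only a sketch, not the linked implementation: it writes bilinear upsampling as two 1-D interpolation matrices, so the backward pass is just the transposed matrices. The names `interp_matrix` and `apply_per_channel` are made up for this example, the `align_corners` keyword is assumed to mirror the PyTorch option of the same name, and the output size is assumed to be `floor(size * scale)`. I don't know which convention the GPU code uses, so both settings may need to be tried.

```julia
# 1-D linear interpolation matrix of size (n_out, n_in).
# Assumption: `align_corners` mirrors the PyTorch option of the same name.
function interp_matrix(n_in::Int, n_out::Int; align_corners::Bool=true)
    A = zeros(Float32, n_out, n_in)
    for o in 1:n_out
        # 0-based source coordinate for output index o
        src = if align_corners
            n_out > 1 ? (o - 1) * (n_in - 1) / (n_out - 1) : 0.0
        else
            max((o - 0.5) * n_in / n_out - 0.5, 0.0)
        end
        i0 = min(floor(Int, src), n_in - 1)  # left neighbour
        i1 = min(i0 + 1, n_in - 1)           # right neighbour, clamped at the border
        w = Float32(src - i0)                # weight of the right neighbour
        A[o, i0 + 1] += 1 - w
        A[o, i1 + 1] += w
    end
    return A
end

# Computes L * x[:, :, c, n] * R for every channel and batch entry (WHCN layout).
function apply_per_channel(x::AbstractArray{<:Real,4}, L::AbstractMatrix, R::AbstractMatrix)
    _, _, C, N = size(x)
    out = zeros(Float32, size(L, 1), size(R, 2), C, N)
    for n in 1:N, c in 1:C
        out[:, :, c, n] = L * x[:, :, c, n] * R
    end
    return out
end

x  = rand(Float32, 8, 8, 3, 1)
Ah = interp_matrix(size(x, 1), floor(Int, size(x, 1) * 2))
Aw = interp_matrix(size(x, 2), floor(Int, size(x, 2) * 2))

y  = apply_per_channel(x, Ah, Aw')    # forward pass
dy = rand(Float32, size(y)...)        # some incoming gradient
dx = apply_per_channel(dy, Ah', Aw)   # backward pass (transposed operator)

# Dot-product test: <A x, dy> should equal <x, A' dy> up to rounding.
lhs = sum(y .* dy)
rhs = sum(x .* dx)
```

If the conventions match, `maximum(abs.(Array(res) .- y))` for the same input and scale should be close to zero, and the same dot-product test applied to the GPU forward and backward passes would be a quick sanity check of the adjoint.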

EDIT: I just saw this, which also looks interesting!