Data augmentation question (CIFAR10)

Rasmus_Hoier · October 13, 2021, 8:23am

Hi,
I am training a CNN on CIFAR10 (via Flux.DataLoader) and am trying to apply some stochastic data transformations to each batch of data at runtime.
My goal is to apply the following operations to each batch I get from the dataloader.

Pad with zeros (yielding 32x32 → 40x40 images).
Random crop (yielding 40x40 → 32x32 images).
Flip images horizontally with 50% probability.

Is there a best practice for achieving this in Julia?
Augmentor.jl seems like it might be the way to go, but it requires me to convert between data formats and does not seem to have an option for padding.
I found an example on the dev version of Augmentor.jl’s docs (link), which relies on MappedArrays instead of Flux.DataLoader. This seems less readable and, as far as I can tell, the data is not shuffled at every epoch.

Using Julia and Flux has generally been a breeze so far, with custom layers and learning rules being very simple to implement. So I was quite surprised that implementing standard data augmentation seems to take much more effort.

I have tried to implement a minimal working example, shown below. The interesting parts are probably the functions MWE and getdata. I did not find a good way to implement padding (and I need padding before I can randomly crop back to 32x32), so the only transformation applied at the moment is FlipX(0.5).

I guess my questions are:

Am I on the right track with using Augmentor.jl or are there better options?
If Augmentor.jl is the way to go, then how could I implement the padding and random cropping?
Do you have general ideas on how to make things cleaner/faster? For larger networks my current approach slows things down a bit.

using Augmentor, MLDatasets
using Flux, Flux.Optimise
using Flux: onehotbatch, onecold
using Flux.Losses: logitcrossentropy

function getdata(batchsize)
    xtrain, ytrain = MLDatasets.CIFAR10.traindata(Float32)
    xtest, ytest = MLDatasets.CIFAR10.testdata(Float32)

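    # Per-channel mean (m) and standard deviation (s), shaped to broadcast over WHCN arrays.
    # NOTE: the second mean repeats the first; the commonly quoted CIFAR-10 green-channel mean is about 0.4822.
    m = reshape([0.4914009f0 0.4914009f0 0.4465309f0], (1,1,3,1))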
    s = reshape([0.20230277f0 0.19941312f0 0.2009607f0], (1,1,3,1))
    xtrain = (xtrain .- m) ./ s
    xtest = (xtest .- m) ./ s

    # Convert training data to RGB to work with augmentbatch!()
    xtrain = MLDatasets.CIFAR10._colorview(RGB, permutedims(xtrain, (3, 1, 2, 4)))
    ytrain, ytest = Flux.onehotbatch(ytrain, 0:9), Flux.onehotbatch(ytest, 0:9)

    trainloader = Flux.DataLoader((xtrain, ytrain), batchsize=batchsize, shuffle=true, partial=false)
    testloader = Flux.DataLoader((xtest, ytest), batchsize=batchsize, partial=false)

    return (trainloader, testloader)
end

function LeNet5(; imgsize=(28,28,1), nclasses=10) 
    out_conv_size = (imgsize[1]÷4 - 3, imgsize[2]÷4 - 3, 16)
    return Chain(
            Conv((5, 5), imgsize[end]=>6, relu),
            MaxPool((2, 2)),
            Conv((5, 5), 6=>16, relu),
            MaxPool((2, 2)),
            flatten,
            Dense(prod(out_conv_size), 120, relu), 
            Dense(120, 84, relu), 
            Dense(84, nclasses)
          )
end

function loss_and_accuracy(data_loader, net, device)
    acc = 0.0f0; ls = 0.0f0; num = 0
    for (x, y) in data_loader
        x, y = x |> device, y |> device
        pred = net(x)
        ls += logitcrossentropy(pred, y; agg=sum)  # sum over the batch so that ls / num is a per-sample mean
        acc += sum(onecold(cpu(pred)) .== onecold(cpu(y)))
        num +=  size(y, 2)
    end
    return ls / num, acc / num
end

function MWE()
    pl = FlipX(0.5) |> SplitChannels() |> PermuteDims((2, 3, 1))
    device = gpu
    batchsize = 128
    trainloader, testloader = getdata(batchsize)
    opt = ADAM(0.0001)
    net = LeNet5(imgsize=(32, 32, 3), nclasses=10) 
    net = net |> device
    ps = Flux.params(net)
    for epoch=1:5
        for (x, y) in trainloader
            xaug = zeros(Float32, 32, 32, 3, batchsize)
            augmentbatch!(xaug, x, pl)
            xaug, y = xaug |> device, y |> device
            gs = gradient(ps) do
                l = logitcrossentropy(net(xaug), y)
            end
            update!(opt, ps, gs)
        end
        test_loss, test_acc = loss_and_accuracy(testloader, net, device)
        @info """Epoch: $epoch:
        Test:     Acc(θ): $(round(test_acc*100f0, digits=2))%    Loss: $(round(test_loss, digits=6))
        """
    end
end

MWE()

HenriDeh · October 13, 2021, 9:46am

For cropping, I’d use Augmentor.Crop as well. You’d only have to implement the randomization of the indices.
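For illustration, randomizing the crop indices could look roughly like the sketch below. It uses only Base Julia and the Random standard library; `random_crop_ranges` is a hypothetical helper name, not part of Augmentor.jl, and whether the resulting ranges can be handed directly to Augmentor.Crop is an assumption to check against the Augmentor docs.

```julia
using Random

# Hypothetical helper: draw a random crop window of size `cropsize`
# that lies fully inside an image of size `imgsize`.
function random_crop_ranges(rng::AbstractRNG, imgsize::NTuple{2,Int}, cropsize::NTuple{2,Int})
    all(cropsize .<= imgsize) || throw(ArgumentError("crop size must not exceed image size"))
    return ntuple(2) do d
        # The offset is uniform over all positions where the window still fits.
        offset = rand(rng, 0:(imgsize[d] - cropsize[d]))
        (offset + 1):(offset + cropsize[d])
    end
end

rng = MersenneTwister(42)
padded = rand(rng, Float32, 40, 40)                 # stand-in for one padded 40x40 channel
rows, cols = random_crop_ranges(rng, size(padded), (32, 32))
patch = padded[rows, cols]                          # 32x32 crop at a random position
```

Drawing a fresh pair of ranges per image (rather than per batch) gives each image its own displacement.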

I did not know padding was considered an image augmentation. It’s usually done by the Conv layer using the keyword argument pad = 4.

There’s a difference between the two approaches: yours would yield image patches with padded corners, while using the Conv padding creates a “square” of zeros around your cropped patches. I think the latter is the standard approach. But I might be wrong.

CarloLucibello · October 13, 2021, 10:05am

You should try https://github.com/lorenzoh/DataAugmentation.jl

Rasmus_Hoier · October 13, 2021, 11:02am

Thanks for the suggestions :)

@HenriDeh
The motivation for padding at the augmentation stage is that the randomly cropped images will then have size 32x32 (like the original data), but the network will see each image with a slightly different displacement at each epoch.
This is a common and effective augmentation technique, which for example is described at the top of page 8 in this paper.
In PyTorch this is achieved by applying RandomCrop with the keyword padding.
Augmentor does have a function RCropSize, but no option for padding. I guess a slightly clumsy solution could be to add padding to the training data when creating the dataloaders.
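The pad-once-then-crop idea described above can be sketched in plain Julia, without any augmentation package. This is a minimal sketch, not Augmentor's or PyTorch's implementation: `zeropad` and `crop_and_flip` are hypothetical helper names, and the batch is assumed to be a W×H×C×N array (the layout MLDatasets returns for CIFAR10), so flipping along dimension 1 is taken to be the horizontal flip.

```julia
using Random

# Zero-pad the two spatial dimensions of a W×H×C×N batch by `p` pixels on each side.
function zeropad(x::AbstractArray{T,4}, p::Int) where {T}
    w, h, c, n = size(x)
    out = zeros(T, w + 2p, h + 2p, c, n)
    out[p+1:p+w, p+1:p+h, :, :] .= x
    return out
end

# For every image in the padded batch: take a random crop of size `outsize`
# and flip it along the first (width) dimension with probability `pflip`.
function crop_and_flip(rng::AbstractRNG, xpad::AbstractArray{T,4},
                       outsize::NTuple{2,Int}; pflip=0.5) where {T}
    W, H, c, n = size(xpad)
    w, h = outsize
    out = similar(xpad, w, h, c, n)
    for i in 1:n
        ox = rand(rng, 0:W-w)   # random horizontal offset
        oy = rand(rng, 0:H-h)   # random vertical offset
        patch = view(xpad, ox+1:ox+w, oy+1:oy+h, :, i)
        if rand(rng) < pflip
            out[:, :, :, i] .= reverse(patch; dims=1)
        else
            out[:, :, :, i] .= patch
        end
    end
    return out
end

rng = MersenneTwister(0)
x = rand(rng, Float32, 32, 32, 3, 8)     # stand-in for one CIFAR10 batch
xpad = zeropad(x, 4)                     # 40×40×3×8, done once up front
xaug = crop_and_flip(rng, xpad, (32, 32))  # 32×32×3×8, redone for every batch
```

Padding the whole training set once keeps the per-batch work down to a copy of each cropped patch, at the cost of storing the larger padded array.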

@CarloLucibello
Thanks for the suggestion. I initially discounted this package as it looked a bit less stable than Augmentor.jl, and I couldn’t find out how to use it in conjunction with batched data (where Augmentor has Augmentor.augmentbatch!()).
Could you elaborate on why you would recommend this package over Augmentor.jl?
Also all the examples I could find are for individual images. Is there a function for applying the same transform to a batch of images?

CarloLucibello · October 13, 2021, 11:12am

I’ve never used DataAugmentation.jl myself so I can’t give much advice, but it’s part of the larger https://github.com/FluxML/FastAI.jl project, so it’s being actively developed (while Augmentor.jl is essentially in maintenance mode AFAIK) and it is geared towards deep learning needs.

CarloLucibello · October 13, 2021, 11:15am

You could file an issue with DataAugmentation.jl reporting specific needs or lack of documentation. I think it would be useful.

HenriDeh · October 13, 2021, 12:00pm

You’re welcome. In that case you could open an issue in Augmentor.jl to ask for the feature. Or better, if you feel up to it, you could implement it and make a pull request! It’s certainly a good way to get some practice in Julia programming.

Rasmus_Hoier · October 13, 2021, 10:12pm

Thanks.
Reading issues in both repos has given me some ideas on how I could bake transformations into a custom dataloader, which should help make my code cleaner. I will look more into DataAugmentation.jl and try to figure out if it is better for my use case than Augmentor.jl.
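One way such a loader could look, as a minimal sketch: a thin iterator wrapper that applies a function to the features of every `(x, y)` batch coming out of another iterator. `MappedBatches` is a hypothetical name, and the example wraps a plain vector of batches; wrapping a Flux.DataLoader the same way assumes it yields `(x, y)` tuples and reshuffles whenever iteration restarts.

```julia
# Hypothetical wrapper: applies `f` to the features of each (x, y) batch.
struct MappedBatches{F,I}
    f::F        # transformation applied to the features of every batch
    inner::I    # any iterator that yields (x, y) tuples
end

# Length is not forwarded, so declare it unknown; `for` loops and `collect` still work.
Base.IteratorSize(::Type{<:MappedBatches}) = Base.SizeUnknown()

# Handles both the initial call `iterate(m)` and the follow-up `iterate(m, state)`.
function Base.iterate(m::MappedBatches, state...)
    next = iterate(m.inner, state...)
    next === nothing && return nothing
    (x, y), newstate = next
    return (m.f(x), y), newstate
end

# Four dummy batches of three 2×2 single-channel "images" with integer labels.
batches = [(fill(Float32(i), 2, 2, 1, 3), fill(i, 3)) for i in 1:4]
shifted = MappedBatches(x -> x .+ 1, batches)

for (x, y) in shifted
    # x is the transformed feature batch, y is passed through untouched
    println(size(x), " ", y)
end
```

Because the wrapper restarts the inner iterator on every `for` loop, a stochastic `f` draws new random transformations each epoch, which keeps the training loop itself free of augmentation code.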

CarloLucibello · October 17, 2021, 5:51am

Great, let us know the solution you end up with. It would be nice to add augmentation to the model zoo script https://github.com/FluxML/model-zoo/blob/master/vision/vgg_cifar10/vgg_cifar10.jl

DrChainsaw · October 17, 2021, 9:36am

Fwiw, I have used PaddedViews in conjunction with Augmentor’s RCropSize for this type of augmentation.

Iterator for reference:

using Augmentor
import ImageCore  # provides channelview, used in img2arr

struct AugIter{A,B}
    aug::A
    base::B
end

function Base.iterate(itr::AugIter)
    valstate = iterate(itr.base)
    valstate === nothing && return nothing
    val, state = valstate
    buffer = initbuffer(val, itr.aug)
    return featureaug(val, buffer, itr.aug), (buffer, state)
end

initbuffer((x,y)::Tuple, aug) = initbuffer(x, aug)
function initbuffer(val::AbstractArray, aug) 
    img1 = augment(val[:,:,1], aug)
    return similar(img1, size(img1)..., size(val)[end])
end

function Base.iterate(itr::AugIter, (buffer, state))
    valstate = iterate(itr.base, state)
    valstate === nothing && return nothing
    val, state = valstate
    return featureaug(val, buffer, itr.aug), (buffer, state)
end

featureaug((x,y)::Tuple, buffer, aug) = featureaug(x, buffer, aug), y
function featureaug(val, buffer, aug)
    nobs_val = size(val)[end]
    nobs_buf = size(buffer)[end]

    bview = if nobs_val < nobs_buf
        selectdim(buffer, ndims(buffer), 1:nobs_val)
    elseif nobs_val > nobs_buf
        similar(buffer, size(buffer)[1:end-1]..., size(val)[end])
    else
        buffer
    end
    return img2arr(augmentbatch!(CPUThreads(), bview, val, aug))
end

img2arr((x,y)::Tuple) = img2arr(x), y
function img2arr(img::AbstractArray)
    chwn = ImageCore.channelview(img)
    return PermutedDimsArray(chwn, (3,2,1,4))
end

Created like this:

 AugIter(FlipX(0.5) |> RCropSize(32, 32), paddedbatchiter)

Here paddedbatchiter is an iterator returning tuples of batches of a PaddedView of the training data and the corresponding labels (e.g. Flux’s DataLoader).
