ImageFiltering: Optimizing frame-by-frame image illumination flattening

I’m trying to speed up this image illumination flattening routine (i.e. this kind of thing) that highpasses each frame, then rescales the image back to UInt8. I’ve tried adding Threads.@threads before the for loop (with JULIA_NUM_THREADS set to 4), but hit a few segfaults, possibly related to the large size of the vid variable, so I wondered if there’s a smarter way to optimize this.

For instance, can imfilter be used directly on video stacks? Is there a faster imfilter alternative that approximates a Gaussian filter? Or a direct highpass filter (rather than lowpass and divide)?

Unfortunately I don’t think I can use a common background image for all frames as the image moves around.

(I’ve been looking for fast global registration methods too; if I find one, I could use a single background image and do only one lowpass.)

using Images, ImageFiltering, StatsBase, Profile, ProfileView

function flatten(vid)
    gauss_kern = Kernel.gaussian(100)
    for i = 1:size(vid,3)
        img = vid[:,:,i] ./ imfilter(vid[:,:,i],gauss_kern)    #Division by the gaussian blur of itself i.e. highpass
        f = scaleminmax(percentile(img[:], 0.1), percentile(img[:], 99))    #rescaling image to percentile range
        vid[:,:,i] = round.(UInt8,(f.(img)*255))     #conversion back to UInt8
    end
end

vid = rand(UInt8,10,10,10)   #run once before @profile
flatten(vid);

vid = rand(UInt8,1280,1024,50)   #random noise frames for testing
@profile flatten(vid)

ProfileView.view()

While thinking about this, I keep looking at the gradual blur transition of my iPhone’s iOS slide-to-unlock screen and wondering how variable spatial-frequency blur can possibly be optimized that much…

Each vid[:,:,i] is allocating memory. Try using @views and operating on vid in place with ./=.
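A minimal sketch of that pattern (flatten_views! is a hypothetical name, and the actual filtering/rescaling is elided where the comment sits; since vid is UInt8, an intermediate Float64 buffer is needed rather than dividing UInt8s in place):

```julia
# Hypothetical sketch: take views into vid instead of copying each frame,
# and reuse one preallocated Float64 buffer across all frames.
function flatten_views!(vid, buf = Matrix{Float64}(undef, size(vid, 1), size(vid, 2)))
    for i in 1:size(vid, 3)
        frame = @view vid[:, :, i]      # no copy, unlike vid[:,:,i]
        buf .= frame                    # one buffer reused every iteration
        # ... filter / rescale buf in place here ...
        frame .= round.(UInt8, clamp.(buf, 0, 255))   # write back into vid
    end
    return vid
end
```

With the processing step left out, the function is a no-op round trip, which makes it easy to check that the view writes back into vid correctly.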


Yes, Gaussian filters can be approximated by recursive filters, which are faster if the kernel is large. As it happens the approximation also becomes better the larger the standard deviation is.

This is an enormous filter kernel and an ideal case for recursive filtering. Try replacing it with

gauss_kern = KernelFactors.IIRGaussian((100,100))
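For a sense of how close the recursive approximation is, here’s a small comparison sketch (sigma of 10 rather than 100 just to keep the dense filter quick to run; the speed gap only widens as sigma grows):

```julia
using ImageFiltering

img = rand(Float64, 256, 256)

# Dense separable Gaussian: per-pixel cost grows with sigma.
dense = imfilter(img, Kernel.gaussian(10))

# Recursive (IIR) approximation: per-pixel cost is independent of sigma,
# and the approximation improves as sigma grows.
iir = imfilter(img, KernelFactors.IIRGaussian((10.0, 10.0)))
```

Comparing dense and iir elementwise shows the two blurs agree closely away from the borders, where the boundary handling can differ slightly.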

In my terminology subtracting lowpass would leave you with highpass and dividing by lowpass would give you something non-linear, but either way there’s no way to do it faster than going via the lowpass filter.
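To illustrate the distinction with a toy 1d example (a hand-rolled 3-tap moving average with replicated edges stands in for the lowpass; nothing here is ImageFiltering API):

```julia
signal = [1.0, 2.0, 4.0, 8.0, 4.0, 2.0, 1.0]

# Toy lowpass: 3-tap moving average, edges replicated.
lowpass = [(signal[max(i - 1, 1)] + signal[i] + signal[min(i + 1, end)]) / 3
           for i in eachindex(signal)]

sub = signal .- lowpass   # linear highpass: additive residual around zero
div = signal ./ lowpass   # the nonlinear variant: multiplicative residual, ~1 where the signal is smooth
```

Either way, the lowpass result has to be computed first, which is why the recursive filter is the place to spend the effort.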

Unless you have a very simple motion and can detect it from only small parts of the image, you can’t do registration faster than a single recursive filter.


@GunnarFarneback’s answer is spot-on. But there’s one other point that you should make sure you understand:

This creates a dense 2d filter. But Gaussians are separable, so if you’re interested in speed, KernelFactors.gaussian((100,100)) would be approximately 50-fold faster. Of course, even this will be beaten handily by the IIRGaussian method mentioned above. Still, it’s worth understanding this in case you decide you need something that is separable but doesn’t have a pre-made IIR approximation.
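A quick way to convince yourself the two forms compute the same blur (small sigma so the dense version stays cheap):

```julia
using ImageFiltering

img = rand(Float64, 128, 128)

# Dense 2d kernel.
dense = imfilter(img, Kernel.gaussian(5))

# Explicitly separable factors: one 1d pass per dimension.
separable = imfilter(img, KernelFactors.gaussian((5, 5)))
```

The two results should match to within floating-point noise; the difference is purely in how the work is organized.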


I had planned to mention separability but couldn’t measure any significant difference when I tested. In fact, imfilter performs the separability factorization internally by default for 2d filters, so giving the factors explicitly only saves the SVD computation.

Good point, I’d forgotten I’d added that. For much of my image processing I live in the 3d world, where this automagic transformation doesn’t happen.

A further increase in speed can be gained by operating in place. The code below introduces flatten2, which is the same as flatten but operates in place where possible. IIRGaussian kernels are used in both functions.

using Images, ImageFiltering, StatsBase

function flatten(vid)
    gauss_kern = KernelFactors.IIRGaussian((100,100))
    for i = 1:size(vid,3)
        img = vid[:,:,i] ./ imfilter(vid[:,:,i],gauss_kern)    #Division by the gaussian blur of itself i.e. highpass
        f = scaleminmax(percentile(img[:], 0.1), percentile(img[:], 99))    #rescaling image to percentile range
        vid[:,:,i] = round.(UInt8,(f.(img)*255))     #conversion back to UInt8
    end
end


function flatten2(vid)
    gauss_kern = KernelFactors.IIRGaussian((100,100))
    img = Matrix{Float64}(undef, size(vid)[1:2])
    vimg = vec(img)
    for i = 1:size(vid,3)
        vidview = @view vid[:,:,i]
        imfilter!(img,vidview,gauss_kern)
        img .= vidview ./ img    #Division by the gaussian blur of itself i.e. highpass
        f = scaleminmax(percentile(vimg, 0.1), percentile(vimg, 99))    #rescaling image to percentile range
        vid[:,:,i] .= round.(UInt8, f.(img) .* 255)     #conversion back to UInt8
    end
end

vid0 = rand(UInt8,120,100,20)   #random noise frames for testing
flatten(vid0)
flatten2(vid0)

vid = rand(UInt8,1280,1024,50)   #random noise frames for testing
@time flatten(vid) # 5.561567 seconds (2.50 k allocations: 5.067 GiB, 24.31% gc time)
@time flatten2(vid) # 4.123926 seconds (1.66 k allocations: 1.964 GiB, 2.58% gc time)
# both timings after compilation 
 

Thanks so much for the suggestions and the elaboration. I’m going to get serious about using @view, and I wasn’t aware of vec, so I’ll be throwing out [:] wherever possible now!
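For reference, the difference between [:] and vec is copy vs. shared memory:

```julia
img = rand(Float64, 3, 3)

c = img[:]     # allocates a fresh Vector (a copy)
v = vec(img)   # zero-copy reshape: v shares memory with img

v[1] = -1.0    # mutates img too
# img[1,1] is now -1.0, while c[1] still holds the original value
```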

I see a 4.7x speedup compared to the original code, and I’m pretty sure I can apply a lot of the logic here throughout my other functions.

Much appreciated!

using Images, ImageFiltering, StatsBase

function flatten0(vid)
    gauss_kern = Kernel.gaussian(100)
    for i = 1:size(vid,3)
        img = vid[:,:,i] ./ imfilter(vid[:,:,i],gauss_kern)    #Division by the gaussian blur of itself i.e. highpass
        f = scaleminmax(percentile(img[:], 0.1), percentile(img[:], 99))    #rescaling image to percentile range
        vid[:,:,i] = round.(UInt8,(f.(img)*255))     #conversion back to UInt8
    end
end


function flatten1(vid)
    gauss_kern = KernelFactors.IIRGaussian((100,100))
    for i = 1:size(vid,3)
        img = vid[:,:,i] ./ imfilter(vid[:,:,i],gauss_kern)    #Division by the gaussian blur of itself i.e. highpass
        f = scaleminmax(percentile(img[:], 0.1), percentile(img[:], 99))    #rescaling image to percentile range
        vid[:,:,i] = round.(UInt8,(f.(img)*255))     #conversion back to UInt8
    end
end

function flatten2(vid)
    gauss_kern = KernelFactors.IIRGaussian((100,100))
    img = Matrix{Float64}(undef, size(vid)[1:2])
    vimg = vec(img)
    for i = 1:size(vid,3)
        vidview = @view vid[:,:,i]
        imfilter!(img,vidview,gauss_kern)
        img .= vidview ./ img    #Division by the gaussian blur of itself i.e. highpass
        f = scaleminmax(percentile(vimg, 0.1), percentile(vimg, 99))    #rescaling image to percentile range
        vid[:,:,i] .= round.(UInt8, f.(img) .* 255)     #conversion back to UInt8
    end
end

vid = rand(UInt8,10,10,10)   #run once before @profile
flatten0(vid);
flatten1(vid);
flatten2(vid);

vid = rand(UInt8,1280,1024,50)   #random noise frames for testing
@time flatten0(vid); #31.297642 seconds (16.71 k allocations: 13.967 GiB, 37.40% gc time)
@time flatten1(vid); #9.725612 seconds (2.50 k allocations: 5.067 GiB, 29.76% gc time)
@time flatten2(vid); #6.705202 seconds (1.66 k allocations: 1.964 GiB, 9.37% gc time)