Best approach to smooth array? None of my approaches are very useful




I´ve been trying this week to get something accomplished but I cannot get it done.

Problem: I have group of variables coming from a NetCDF. When imported using NCDatasets, I get arrays of 64 bits with a size (128,128,80,2400).
A code I was using (in matlab) has a line of the style movemean(moveman(movmean(array,window_i),window_j),window_l)
Which performs a smoothing in the 1st, 2nd, and 4th dimensions.
To do this in Julia, I have tried different approaches:

First approach: using Images.ImageFiltering with the kernel:

 function kernel4d(window_h,window_t)
    kern = ones(window_h,window_h,1,window_t)/(window_t*window_h*window_h);

And passing it to imfilter like this:

function filter_array(array,smooth_x,smooth_time)
    filtered = similar(array)
        imfilter!(filtered,array, kernel4d(smooth_x,smooth_time))
    return filtered

Second approach: using Images.ImageFiltering with mapwindow:

function filter_array2(array,smooth_x,smooth_time)
        filtered = mapwindow(mean,mapwindow(mean,mapwindow(mean,array, (smooth_x,1,1,1)),(1,smooth_x,1,1)),(1,1,1,smooth_time))

Third approach: using Satistics.mean

function filter_array3(array,windowh,windowt)
    filtered = similar(array)
    fach = div((windowh + 1),2)
    fact = div((windowt + 1),2)
    xend,yend,zend,tend = size(array);
     for I in CartesianIndices(array)
       filtered[I] = mean(array[max(1,I[1]-fach):min(xend,I[1]+fach),I[2],I[3],I[4]]);
     for I in CartesianIndices(array)
        filtered[I] = mean(filtered[[1],max(1,I[2]-fach):min(yend,I[2]+fach),I[3],I[4]]);
     for I in CartesianIndices(array)
        filtered[I] = mean(filtered[I[1],I[2],I[3],max(1,I[4]-fact):min(tend,I[4]+fact)]);
    return filtered

Although these approaches are not strictly equivalent (some take into account some padding in the borders), they are significantly slower than the matlab one line code (10x). My code has two major bottlenecks file reading (I don’t really know how to improve this) and this smoothing function. The window sizes are small in the first and second dimensions (around 5 elements) but for the fourth it is around 30 elements.

The first approach also uses impressive amounts of memory (it breaks with an out of memory error in a system with 50Gb).

The second approach has a problem: the second and the third loops are increasingly expensive due to accesing elements very far in memory.

I must say I call these process 8 times in a row for different variables. And I am using Julia 0.7.0

Is there something you could tell me about this?



Have you tried factored kernels: Also, potentially it could be worth moving the last dim to be the first as that would lead to better memory locality.


Is approach 3 doing the right thing? You’ve got 3 for loops, each overwriting the previous one. Also in the second loop you have [1], instead of I[1], which might be doing something weird (not sure, don’t have Julia in front of me to test).


Thanks for noticing that. Indeed the result was wrong but fixing it didn’t help much. The thing about the three loops is what I want, as I want to filter the same array in three of its dimensions.


Hey mauro3, thanks for your answer:

I’ve tried Factored Kernels in this way, I am not so sure if this is the correct one, though.

function kernel4d_2(window_h,window_t)
    kern1 = centered(ones(window_h,1,1,1)/(window_h));
    kern2 = centered(ones(1,window_h,1,1)/(window_h));
    kern3 = centered(ones(1,1,1,window_t)/(window_t));
    kernz = centered(ones(1,1,1,1))

    return kernelfactors((kern1,kern2,kernz,kern3))

Is this the correct way to use it, and should I expect it to be better than using the 4-d Kernel directly?


If it does any decimation to the output afterwards, it would probably be worth trying to build that in to the filter. The x2400 by 30 dimension seems like a good candidate for a stride of 2 or more, but it depends on how you want to use the output

Add: there maybe something in BlasPack or the like that can help?


You can use it that way but this also works:

julia> function kernel4d(window_h, window_t)
           kernel_h = ones(window_h)/window_h
           kernel_t = ones(window_t)/window_t
           return kernelfactors((kernel_h, kernel_h, [1.0], kernel_t))

(see ?kernelfactors and the part about being able to pass vectors). To explain what you’re seeing, try ?KernelFactors.ReshapedOneD and look at the individual elements of the return value.

and should I expect it to be better than using the 4-d Kernel directly?

That’s explained in the docs that @mauro3 linked:

If the kernel is of size m×n , then the upper version line requires mn operations for each point of filtered , whereas the lower version requires m+n operations. Especially when m and n are larger, this can result in a substantial savings.

In your case it’s the difference between 5 * 5 * 30 = 750 and 5 + 5 + 30 = 40. This therefore explains the factor of 10 difference with Matlab: notice that your Matlab code is using a factored kernel: window_i, window_j, and window_l are separate factors of the kernel. So you were asking Julia to use a fundamentally worse (more computationally demanding) approach than you were asking Matlab.

For even greater speed see


Thanks for the suggestion! I’m gonna try it making that if everything else fails. Initially all the data is kept.


Thanks a lot Tim!
The suggested new form of the call to kernel factors causes a surprisingly long error message. I will stay with the old one until I can figure out what is happening.
I let you the message in case is interesting/useful. I am using julia 0.7, would it be useful to start an issue? This only happens apparently when using “export JULIA_NUM_THREADS=N” where N is greater than 1

The error line message too long, so I pasted it in pastebin:


As I think you realized, that’s a Julia bug, not an ImageFiltering bug. For me it works on Julia 1.0.2 (and I can reproduce the error on Julia 0.7), so perhaps it might be worth trying on a newer version of Julia. If you can reproduce it on the latest release, it’s definitely worth reporting as a bug.


I just tried Julia 1.0.2 and works flawlessly.
Thanks so much for all the help!


Hi Tim:

Are this two things supposed to give the same output?

a = rand(10,10,10)
b = similar(a)
for i in 1:10
       b[:,:,i] = imfilter(a[:,:,i], kernelfactors(( ones(3)/3, ones(3)/3)))
c = imfilter(a,kernelfactors(( ones(3)/3 , ones(3)/3) , [1.0])) )

The question is,
shouldn't b==c?
It doesn’t, but I don’t understand why.


It’s a bug, I filed an issue for you:


Just updating this in case someone finds this post and uses it as a reference.

The correct form of smoothing in the three directions without displacing the array is:

function kernel4d(window_h, window_t)
           kernel_h = ones(window_h)/window_h
           kernel_t = ones(window_t)/window_t
           return kernelfactors((centered(kernel_h), centered(kernel_h), centered([1.0]), centered(kernel_t)))


The correct form of smoothing in the three directions without displacing the array is

It’s not just three directions, you want to use centered even in 1d or 2d if you are using a kernel that is symmetric around 0. The original form shifted the array along all 3 axes, not just the 3rd axis.

(The behavior above turned out not to be a bug, it was a feature.)


Right, I was just trying to make it a final answer for the initial question of the post :smiley: