What's the status of image convolutions on CPU & GPU?

Just a quick note: I will put together an FFT-based implementation as soon as I find the time; I haven’t forgotten this topic. Also, is it still unclear which package will ultimately collect all these implementations?