First off, I want to thank the maintainers of JuliaGPU for the unbelievable work they’re doing. Even for someone like me who’s just beginning to write basic GPU kernels, it’s been easier and more illuminating to do so in Julia (which is surreal).
That said, for newcomers (like me) who want GPU functionality without having to write custom kernels, it would be great to have generic implementations of commonly used functions and operations in a single location, similar to how CuPy makes many NumPy and SciPy routines available. In particular, convolutions and histograms would be great to have. These have also been requested in the past (see, for example, 1, 2, 3).