max.(v1, v2) on the GPU

What’s the best way to implement the element-wise maximum of two vectors of the same length on the GPU, using CUDA.jl or any other package?

julia> v1 = [1, 2, 3]
3-element Vector{Int64}:
 1
 2
 3

julia> v2 = [4, 3, 2]
3-element Vector{Int64}:
 4
 3
 2

julia> max.(v1, v2)
3-element Vector{Int64}:
 4
 3
 3

You can assume Float32.

It’s just as you’ve written it, except with device arrays:

julia> using CUDA

julia> v1 = cu(Float32[1, 2, 3])
3-element CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}:
 1.0
 2.0
 3.0

julia> v2 = cu(Float32[4, 3, 2])
3-element CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}:
 4.0
 3.0
 2.0

julia> max.(v1, v2)
3-element CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}:
 4.0
 3.0
 3.0
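
If you ever want more control (e.g. to fuse this with other operations), you can also write the kernel by hand. Here is a minimal sketch, reusing the v1/v2 device arrays from above; the name elementwise_max! is just for illustration, and for a simple op like this the broadcast above compiles to essentially the same kernel:

using CUDA

# One thread per element: thread i writes max(a[i], b[i]) into out[i].
function elementwise_max!(out, a, b)
    i = (blockIdx().x - 1) * blockDim().x + threadIdx().x
    if i <= length(out)
        @inbounds out[i] = max(a[i], b[i])
    end
    return nothing
end

out = similar(v1)
threads = 256
blocks = cld(length(v1), threads)  # enough blocks to cover every element
@cuda threads=threads blocks=blocks elementwise_max!(out, v1, v2)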

Ahh… it was a Pluto “bug”:


I assumed the issue was the max operation, but I guess it was Pluto trying to do some scalar indexing for IO?
Works fine in the REPL.
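
For now, copying back to the CPU before anything tries to display the result sidesteps the scalar indexing. A minimal workaround sketch:

result = max.(v1, v2)  # computed on the GPU
Array(result)          # explicit copy to the CPU, so display never indexes device memory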

Might be an instance of https://github.com/JuliaGPU/CUDA.jl/issues/875

That’s probably Pluto.jl replacing the output stack, so GPUArrays.jl’s show methods (which first copy to the CPU so as not to trigger scalar iteration) are not used.

@fonsp FYI