Hi,

Before stating my problem, I would like to say that I’m a fully enthusiastic and convinced Julia user, and to thank the developers for their awesome work.

The CUDA module is especially helpful for writing fast GPU code without (most of the time) having to write a single CUDA kernel.

However, while working with functions of complex arguments, I noticed that most functions do not seem to be implemented, except for basic operations and `exp`.

Here is a short snippet to test the missing functions:

```
using CUDA
CUDA.allowscalar(false)
T = Complex{Float32}
# T = Float32
A = CUDA.rand(T, 10);
R1 = similar(A);
E = exp.(A) # works
R = sqrt.(A) # works only with T = Float32
map!(sqrt, R1, A) # works only with T = Float32
```

I’m familiar with MapReduce frameworks (like Thrust in C++), so I tried to define the following:

```
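# Grid-stride kernel: A[i] = f(B[i]) for every index i.
function map_kernel!(f, A::CuDeviceVector{T}, B::CuDeviceVector{T}) where T
    n = length(A)
    start = (blockIdx().x - 1) * blockDim().x + threadIdx().x
    stride = blockDim().x * gridDim().x
    for i in start:stride:n
        A[i] = f(B[i])
    end
    return nothing
end

function my_map!(f, A::AbstractVector{T}, B::AbstractVector{T}) where T
    n = length(A)
    threads = 32
    # cld(n, threads) blocks cover all n elements; launching n blocks is not needed.
    @cuda threads=threads blocks=cld(n, threads) map_kernel!(f, A, B)
    return A
end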
```

With this, the following now works:

```
R2 = similar(A);
my_map!(sqrt, R2, A)
```
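
As a quick sanity check, the result can be compared against a CPU reference. This is only a sketch: `matches_cpu` is a hypothetical helper name (not part of CUDA.jl), and it relies on `Array(x)` copying the data back to the host.

```julia
# Hypothetical helper: copy both arrays to the host and compare
# `result` with the CPU evaluation of `f` on `input`.
matches_cpu(f, result, input) = Array(result) ≈ f.(Array(input))
```

With the arrays defined above, `matches_cpu(sqrt, R2, A)` should return `true` if the kernel computed the same values as the CPU.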

This code is not generic enough, but I don’t understand why the generic `map!` function provided by Julia cannot do the same in my simple case.

It would be even better if broadcasting worked too, as it does for real numbers.

Could somebody explain whether I’m missing something, and why the broadcasted `exp` function works on complex arrays when the others fail?

Thank you,

Nicolas