This is working as intended, and the same copy errors with a plain Array too:
julia> A = zeros(16);
julia> B = ones(8);
julia> copyto!(A, B);
julia> copyto!(B, A);
ERROR: BoundsError: attempt to access 8-element Vector{Float64} at index [1:16]
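The fact that you’re getting an ERROR_INVALID_VALUE instead of a BoundsError with CUDA.jl reveals that you’re running with --check-bounds=no, which is the problem here: with bounds checking disabled, the out-of-bounds copy isn’t caught on the Julia side and only fails once it reaches the CUDA API.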
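If the goal is to copy from the larger array into the smaller one, here is a minimal sketch of one way to do it with plain Arrays, using the five-argument copyto! method with an explicit element count. The variable n is just an illustrative name, and the last line relies on Base.JLOptions(), which is an internal, undocumented API, so treat that check as an assumption that may change between Julia versions.

```julia
A = zeros(16)
B = ones(8)

# Copying the smaller array into the larger one is fine:
# only the first length(B) elements of A are overwritten.
copyto!(A, B)

# Going the other way, pass an explicit count so that no more
# elements are copied than the destination can hold.
n = min(length(A), length(B))
copyto!(B, 1, A, 1, n)   # dest, dest offset, src, src offset, count

# Assumption: Base.JLOptions() is internal. Its check_bounds field is
# 0 for the default, 1 for --check-bounds=yes, 2 for --check-bounds=no.
if Base.JLOptions().check_bounds == 2
    @warn "Running with --check-bounds=no; out-of-bounds copies will not throw a BoundsError"
end
```

The same five-argument form should also keep the copy in bounds regardless of the --check-bounds setting, since the count never exceeds the destination length.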