I think there’s no dedicated implementation of kron! for vectors; it just reshapes them and calls the method for matrices. That reshaping has some overhead which, for very small arrays, becomes the dominant cost.
julia> size(C) # vector case
(4,)
julia> @btime _kron!($C, $A, $B);
min 11.136 ns, mean 11.307 ns (0 allocations)
julia> @btime kron!($C, $A, $B);
min 71.422 ns, mean 93.837 ns (6 allocations, 288 bytes. GC mean 21.04%)
julia> @less kron!(C, A, B); # this reshapes to make matrices
julia> @btime reshape(reshape($C,2,2),4); # in fact it needs 3 reshapes, not 2
min 43.939 ns, mean 59.453 ns (4 allocations, 176 bytes. GC mean 22.68%)
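For reference, here is a minimal sketch of what a direct vector method could look like, with no reshaping at all. The name `vec_kron!` is hypothetical and this is not necessarily how the `_kron!` timed above is written; it only assumes that `C` is 1-based and has length `length(A) * length(B)`.

```julia
# Hypothetical direct Kronecker product for vectors: two plain loops, no reshape.
function vec_kron!(C::AbstractVector, A::AbstractVector, B::AbstractVector)
    Base.require_one_based_indexing(C)
    length(C) == length(A) * length(B) ||
        throw(DimensionMismatch("length(C) must equal length(A) * length(B)"))
    k = 0
    for a in A          # outer loop over A: its index varies slowest in kron(A, B)
        for b in B      # inner loop over B: its index varies fastest
            k += 1
            C[k] = a * b
        end
    end
    return C
end

A = [1.0, 2.0]; B = [3.0, 4.0]; C = zeros(4)
vec_kron!(C, A, B)   # C == [3.0, 4.0, 6.0, 8.0]
```

Since nothing is wrapped or reshaped, a loop like this should not need to allocate, which is consistent with the 0 allocations reported for the hand-written version.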