See the plan_fft docs:
To apply P to an array A, use P * A; in general, the syntax for applying plans is much like that of matrices. (A plan can only be applied to arrays of the same size as the A for which the plan was created.) You can also apply a plan with a preallocated output array  by calling mul!(Â, plan, A). (For mul!, however, the input array A must be a complex floating-point array like the output Â.) You can compute the inverse-transform plan by inv(P) and apply the inverse plan with P \  (the inverse plan is cached and reused for subsequent calls to inv or ), and apply the inverse plan to a pre-allocated output array A with ldiv!(A, P, Â).
Note that for a real-valued input array, you may be able to achieve better performance with rfft/plan_rfft, since a FFT on real input produces Hermitian-symmetric output with redundant negative frequency terms.
edit: for 1024 x 1024 real input,
julia> @btime mul!($û_rfft, $p_rfft, $u)
3.237 ms (0 allocations: 0 bytes)
julia> @btime mul!($û_fft, $p_fft, $u_complex)
22.086 ms (0 allocations: 0 bytes)