That may be due to the poor performance of captured variables: right and left are mutated in v0 but constant in v1, and @spawn creates a closure to execute as a new task.
What if you write v0_p_quicksort! with
...
iright, ileft = right, left
t = Threads.@spawn v0_p_quicksort!(A, i, iright)
v0_p_quicksort!(A, ileft, j)
?
If my understanding is correct, that should make both versions perform equally, since iright and ileft are each assigned only once before the closure captures them.
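To check whether a captured variable is actually being boxed, here is a minimal sketch that is independent of the quicksort code. The function names capture_mutated and capture_copy are made up for illustration, and the mutating loop only stands in for the partitioning loop in v0. It inspects the field types of the closure that each function returns:

```julia
# Hypothetical stand-in for v0: `left` is reassigned in a loop and then
# captured by a closure, so lowering is expected to store it in a Core.Box.
function capture_mutated(n::Int)
    left = 1
    while left < n
        left += 1
    end
    return () -> left
end

# Same loop, but the closure captures `ileft`, which is assigned exactly once,
# so it is expected to be captured as a plain Int.
function capture_copy(n::Int)
    left = 1
    while left < n
        left += 1
    end
    ileft = left
    return () -> ileft
end

# Inspect how each closure stores its captured variable.
println(fieldtypes(typeof(capture_mutated(3))))
println(fieldtypes(typeof(capture_copy(3))))
```

If the first line shows Core.Box and the second shows a concrete integer type, the copy trick is doing what it is meant to do. Running @code_warntype on the real v0_p_quicksort! should show the same Core.Box for right and left if this is indeed the cause.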