If x is a radial distance, I think the correct diffusion operator should be
kₓ*(Dx(x*Dx(T(x,y,t))))/x
and you get
But if you used exactly the same equation, kₓ*Dxx(T(x,y,t)), in your reference solution, that still does not explain the delay you observed. If, on the other hand, your reference solution discretized the correct radial operator, that may be the explanation.
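
For reference, expanding the radial operator with the product rule shows how it differs from the Cartesian one. This is only a sketch, and it assumes that x is the radial coordinate of an axially symmetric problem and that kₓ is constant; neither is stated in your post.

```latex
k_x \,\frac{1}{x}\,\frac{\partial}{\partial x}\!\left( x\,\frac{\partial T}{\partial x} \right)
  = k_x \left( \frac{\partial^2 T}{\partial x^2} + \frac{1}{x}\,\frac{\partial T}{\partial x} \right)
```

Under those assumptions the two forms differ by the first-derivative term kₓ*Dx(T(x,y,t))/x, which is largest near x = 0 and fades as x grows, so any disagreement between the two discretizations should be most visible close to the axis.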