Julia 1.0.3: slow compilation from type conversion or ceiling division? One-line example

The last line below (line A) executes quickly in global scope.

using Distributed, SharedArrays
addprocs(12)  # assumption: 12 local workers so that pids 2:13 exist (or start Julia with -p 12)
pids=2:13;NN=138000;NN1=18000;NN2=120000;zTop1x=1.0;hLay2=0.2;
zut=SharedArray{Float64}(NN, init = S -> S[localindices(S)] .= 0.0,pids=pids)
TD2=SharedArray{Int64}(  NN2,init = S -> S[localindices(S)] .= 0,  pids=pids)
trsLayPrev = SharedArray{Int16}(NN, init = S -> S[localindices(S)] .= 0,pids=pids)
TD2[:]=NN1.+eachindex(zeros(NN2))
trsLayPrev[TD2] .= Int16.(cld.(zut[TD2].-zTop1x,hLay2)).+Int16(308)  # (line A)

Wrapping all of the lines above in a function bio() and running it with @spawnat 2 also finishes quickly.
The lines above are a slice of my original project. In that project, a return statement sits right after the last line of the slice, for debugging purposes. By moving the return statement up and down within the project, I found that the last line (line A) keeps the project from compiling within 7 minutes. :face_with_raised_eyebrow: Julia 0.7.0 compiled the whole project within 6 minutes. :neutral_face: Julia 1.0.3 did not finish compiling the project within 20 minutes. :unamused:
But if I replace line A with

trsLayPrev[TD2] .= round.(Int64,(zut[TD2].-zTop1x)./hLay2).+308  # (line X)

the compilation time for the original project slice up to line X drops to a few seconds. :triumph:
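One caveat when comparing the two lines: they are not numerically equivalent. `cld` rounds the quotient up, `round` goes to the nearest integer, and line X also produces `Int64` instead of `Int16`. Below is a minimal sketch on a small plain `Vector`. The sample depths and the layer thickness `h = 0.5` are my own illustrative values (chosen so the arithmetic stays away from floating-point edge cases), and the `ceil`-based variant is only a possible replacement that I have not tested against the original project.

```julia
# Illustrative values only: zTop1x matches the post, h and z are hypothetical.
zTop1x = 1.0
h = 0.5
z = [1.0, 1.6, 2.0]

# Line A style: ceiling division on Float64, then conversion to Int16.
a = Int16.(cld.(z .- zTop1x, h)) .+ Int16(308)

# Line X style: round to nearest, result is Int64.
x = round.(Int64, (z .- zTop1x) ./ h) .+ 308

# Possible variant that keeps ceiling semantics and Int16 without calling cld.
c = ceil.(Int16, (z .- zTop1x) ./ h) .+ Int16(308)

println("line A style: ", a, " (", eltype(a), ")")   # 308, 310, 310 as Int16
println("line X style: ", x, " (", eltype(x), ")")   # 308, 309, 310 as Int64
println("ceil variant: ", c, " (", eltype(c), ")")   # same values as line A style
```

The middle element shows the difference: a quotient of about 1.2 becomes 2 under ceiling division but 1 under rounding, so swapping line A for line X may change results wherever the quotient is not a whole number.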
UPD: The behavior does not depend on the type of the destination array. Another array, which was not shared, triggered the same issue with the same RHS as in line A.
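
Since the destination array type does not seem to matter, the question in the title could be narrowed down without SharedArrays at all. The following is only a sketch of how one might do that: it times the first call (which includes compilation) of four variants that differ in whether they use `cld` and whether they convert to `Int16`. The helper names are hypothetical, plain `Array`s stand in for the shared ones, and on a small slice like this all four may well compile quickly, since the long compile time was observed inside the full project.

```julia
# Plain Arrays stand in for the SharedArrays from the post.
const NN1 = 18000
const NN2 = 120000
zut = zeros(Float64, NN1 + NN2)
TD2 = collect(NN1 .+ (1:NN2))

# Hypothetical helpers; each one isolates a different combination of ingredients.
cld_int16!(out, z, idx, top, h)   = (out[idx] .= Int16.(cld.(z[idx] .- top, h)) .+ Int16(308); out)      # line A
cld_int64!(out, z, idx, top, h)   = (out[idx] .= round.(Int64, cld.(z[idx] .- top, h)) .+ 308; out)      # cld, no Int16
round_int16!(out, z, idx, top, h) = (out[idx] .= round.(Int16, (z[idx] .- top) ./ h) .+ Int16(308); out) # Int16, no cld
round_int64!(out, z, idx, top, h) = (out[idx] .= round.(Int64, (z[idx] .- top) ./ h) .+ 308; out)        # line X

results = Dict{Symbol,Vector{Int16}}()
for f in (cld_int16!, cld_int64!, round_int16!, round_int64!)
    out = zeros(Int16, NN1 + NN2)
    t = @elapsed f(out, zut, TD2, 1.0, 0.2)   # first call, so compilation is included
    results[nameof(f)] = out
    println(rpad(string(nameof(f)), 14), " first call: ", round(t, digits = 3), " s")
end
```

If only the two `cld` variants are slow, ceiling division is the likely culprit. If only the two `Int16` variants are slow, the type conversion is.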

Julia Version 1.0.3
Commit 04330c0 (2018-12-16 21:23 UTC)
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: Intel(R) Xeon(R) CPU E5-2630 0 @ 2.30GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.0 (ORCJIT, sandybridge)
Environment:
  JULIA_NUM_THREADS = 12
Julia Version 0.7.0
Commit a4cb80f (2018-08-08 06:46 UTC)
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: Intel(R) Xeon(R) CPU E5-2630 0 @ 2.30GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.0 (ORCJIT, sandybridge)