I gave myself an exercise to help learn several aspects of Julia, including coding style and performance issues. The task was to translate a C code for a very simple 1D, FD implementation for acoustic waves. At the end of this post, I copy in the entire key section of the code for reference.
To run it, I used @time and @timed as:
My first effort was to implement the differencing using, e.g.,
and I compared to a brute force loop approach that copies the original C code:
for iv in 1:(length(dv)-1) dv[iv] = p[iv+1]-p[iv] end
The questions are:
The second form, with the
forloops, takes about 40 sec, but the form with
(v-circshift(v,1))is about 2.5 sec. I expected the first version to be faster, but I am surprised by the factor of ~15 - can someone explain why?
@timed indicates that the first form has allocations of about 5GB and the second 20GB (which might be related to speed). I interpret this as memory allocations, but I don’t understand the large amounts. The total array sizes should be a few hundred KB at most.
I tried it with Julia 0.7 (code version 1 speeds up to ~2 sec, yay!). I noticed that 0.7 really really wants me to put the global qualifier in front of relevant vectors in the for loops. I did that, but I’m not quite sure why it is important (minor quibble: it distracts from code readability, at least for this learner).
Any comments would be appreciated.
c = fill(2500.0,nNodes) ρ = fill(1000.0,nNodes) dp = zeros(nNodes) p = zeros(nNodes) v = zeros(nNodes) pStepWeight = (ρ.*c.*c)*dt/dz for iTime in 1:nTimeSteps global dp = (v-circshift(v,1)).*pStepWeight global p += dp p = 0.0 p[iSource] += source(iTime*dt, dt, dz, t0, α) global dv = (circshift(p,(-1))-p)./ρMean*dt/dz global v += dv v[nNodes] = (ρMean[nNodes]*dz - dt*ρ[nNodes]*c[nNodes])*v[nNodes]-2*dt*p[nNodes] v[nNodes] /= (ρMean[nNodes]*dz+dt*ρ[nNodes]*c[nNodes]) end