Summary thus far:
Thanks all for the input. I experimented with the various suggestions and my results are posted below. Further input is welcome. This progress has put me more firmly in the Julia camp. I was beginning to doubt the “runs like C” bit and had started developing a C program based on my Julia prototype for another project - glad I can give up on that approach. I suspect the Python code in my original comparison also needs optimization, but I have no time for that.
FDTD Code CPU Time (seconds)

| | Original | Case A | Case B | Case C | Case D | Case E | Case F | Case G |
|---|---:|---:|---:|---:|---:|---:|---:|---:|
| Compile | 43.6 | 33.9 | 24.0 | 3.7 | 2.4 | 1.31 | 1.20 | 0.828 |
| Run 1 | 40.9 | 33.2 | 21.8 | 2.9 | 2.1 | 1.11 | 1.09 | 0.609 |
| Run 2 | 41.7 | 32.9 | 23.3 | 2.8 | 2.2 | 1.14 | 1.11 | 0.595 |
| Run 3 | 42.6 | 32.3 | 22.3 | 2.5 | 2.0 | 1.13 | 1.08 | 0.657 |
| Run 4 | 40.1 | 32.2 | 23.7 | 2.8 | 2.1 | 1.16 | 1.11 | 0.609 |
| Run 5 | 40.5 | 33.1 | 22.7 | 2.9 | 2.1 | 1.14 | 1.12 | 0.625 |
| Avg (1-5) | 41.2 | 32.7 | 22.8 | 2.8 | 2.1 | 1.14 | 1.10 | 0.620 |
| Ratio to C | 96.6 | 76.9 | 53.4 | 6.5 | 4.9 | 2.7 | 2.6 | 1.5 |

- Original: Written by an EE relatively new to Julia.
- Case A: Nested for-loops reordered to process array columns before rows.
- Case B: Case A plus global constants defined with ‘const’.
- Case C: Case B plus the Ce…, Ch… global fill terms moved into their respective functions.
- Case D: Case C plus the entire block (other than ‘using…’) wrapped in ‘let…end’. ‘const’ removed from the global constants. Ce…, Ch… terms still in their respective functions.
- Case E: Case A plus ‘let…end’ wrapping all but the ‘using…’ line. Ce… and Ch… terms are not in their respective functions.
- Case F: Case E plus ‘@inbounds begin…end’ wrapping all but the ‘using…’ line.
- Case G: Case E plus ‘@inbounds’ applied individually to the for-loops that operate on 3D arrays.
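
For anyone skimming the list, here is a minimal sketch of the two loop-level changes (the reordering in Case A and the per-loop ‘@inbounds’ in Case G). It is not the FDTD code from this thread: the function names, array sizes, and the update formula are made up for illustration. Julia arrays are column-major, so the first index should vary in the innermost loop.

```julia
# Illustrative stand-in for an FDTD-style field update; NOT the code from this thread.
# Function names and the update formula are hypothetical.

# Slower ordering: the first index i is in the outermost loop, so the
# innermost loop (over k) jumps through memory with a large stride.
function update_rowfirst!(E::Array{Float64,3}, H::Array{Float64,3}, c::Float64)
    size(E) == size(H) || throw(DimensionMismatch("E and H must have the same size"))
    nx, ny, nz = size(E)
    for i in 1:nx
        for j in 1:ny
            for k in 1:nz
                E[i, j, k] += c * H[i, j, k]
            end
        end
    end
    return E
end

# Faster ordering: the first index i is in the innermost loop, so memory is
# walked contiguously. @inbounds skips bounds checks on that loop only; it is
# safe here because the sizes were checked above.
function update_colfirst!(E::Array{Float64,3}, H::Array{Float64,3}, c::Float64)
    size(E) == size(H) || throw(DimensionMismatch("E and H must have the same size"))
    nx, ny, nz = size(E)
    for k in 1:nz
        for j in 1:ny
            @inbounds for i in 1:nx
                E[i, j, k] += c * H[i, j, k]
            end
        end
    end
    return E
end

# Both orderings compute the same result; only the memory access pattern differs.
E0 = zeros(4, 5, 6)
H = ones(4, 5, 6)
A = update_rowfirst!(copy(E0), H, 0.5)
B = update_colfirst!(copy(E0), H, 0.5)
println(A == B)
```

The arrays here are far too small to show a timing difference; the sketch only shows what the two changes look like.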
The full code block in the first post corresponds to Case B.
The full code block in the post by Mason corresponds to Case G.
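
The global-variable cases (B, D, and E) can be sketched the same way. This is a toy example under my own assumptions, not the thread's code: `scale_global`, `SCALE_CONST`, and the summing functions are hypothetical names. The point is that a non-‘const’ global can change type at any time, so the compiler cannot specialize code that reads it, whereas a ‘const’ global or a variable local to a ‘let…end’ block has a type the compiler can rely on.

```julia
# Toy illustration of the three ways of holding a "global" parameter.
# All names are hypothetical; this is NOT the FDTD code from this thread.

# 1. Plain global: its type could change later, so functions reading it
#    cannot be specialized on its type.
scale_global = 0.5

function sum_scaled_global(x)
    s = 0.0
    for v in x
        s += scale_global * v   # dynamic lookup on every iteration
    end
    return s
end

# 2. const global (as in Case B): the binding and its type are fixed.
const SCALE_CONST = 0.5

function sum_scaled_const(x)
    s = 0.0
    for v in x
        s += SCALE_CONST * v
    end
    return s
end

# 3. let block (as in Cases D and E): `scale`, `x`, and `s` are local
#    variables of the block, so no globals are read inside the loop.
total_let = let scale = 0.5, x = collect(1.0:4.0)
    s = 0.0
    for v in x
        s += scale * v
    end
    s   # value of the let block
end

xs = collect(1.0:4.0)
println(sum_scaled_global(xs))
println(sum_scaled_const(xs))
println(total_let)
```

All three give the same number; the difference is only in how well the compiler can optimize the loop, which I would expect to matter once the arrays and iteration counts are as large as in the FDTD runs.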