'using MyModule' is costly

See my experiments below:

$ julia
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.1.0-DEV.127 (2018-08-27)
 _/ |\__'_|_|_|\__'_|  |  Commit 3ab56f19a8 (18 days old master)
|__/                   |

julia> @time using SeekSCC
[ Info: Recompiling stale cache file /home/user/.julia/compiled/v1.1/SeekSCC.ji for SeekSCC [top-level]
 29.974906 seconds (35.58 M allocations: 1.762 GiB, 3.04% gc time)

julia> @time SeekSCCSFuncA()
144.620283 seconds (156.46 M allocations: 17.789 GiB, 4.09% gc time)

Loading took about 30 seconds because the stale cache had to be recompiled, and that does not even include precompiling the modules used by this module, since those scripts had not changed since last time. This cost is unavoidable whenever any part of the module has been changed.

Now a second run, without changing anything in the scripts.

$ julia

julia> @time using SeekSCC
 16.986347 seconds (34.49 M allocations: 1.710 GiB, 5.06% gc time)

julia> @time SeekSCCSFuncA()
154.848272 seconds (156.46 M allocations: 17.786 GiB, 3.96% gc time)

As expected, there was no precompilation this time, so “using SeekSCC” took less time: 17 seconds versus the earlier 30 seconds. But running the function itself gains nothing, with or without precompilation.

Now let’s do this with “include”:

$ julia

julia> @time include("SeekSCC.jl")
 19.288917 seconds (36.58 M allocations: 1.815 GiB, 4.86% gc time)
Main.SeekSCC

julia> @time SeekSCC.SeekSCCSFuncA()
141.394547 seconds (156.43 M allocations: 17.787 GiB, 3.90% gc time)

Note that the script being included here contains some “using OtherModule” statements. Even so, the “include” statement took 19 seconds: far less than “using” with precompilation (30 seconds) and slightly more than “using” with an up-to-date cache (17 seconds). And it stays about the same (disregarding the “using OtherModule” statements inside) in every fresh Julia session, whether or not the scripts have been changed.

What’s even better with “include” is that, within the same Julia session, whenever you change the script you can “include” it again and it comes back in much less time:

julia> @time include("SeekSCC.jl")
WARNING: replacing module SeekSCC.
  0.391219 seconds (63.30 k allocations: 3.678 MiB)
Main.SeekSCC
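
For anyone who wants to reproduce the re-include behaviour without a large module, here is a minimal sketch. It assumes nothing about SeekSCC: `DemoMod` and `answer` are made-up names, and the file is written to a temporary directory purely for illustration.

```julia
# Write a throwaway module file (hypothetical name DemoMod) to a temp directory.
dir = mktempdir()
path = joinpath(dir, "DemoMod.jl")
write(path, "module DemoMod\nanswer() = 1\nend\n")

# First include defines Main.DemoMod.
include(path)
first_result = DemoMod.answer()

# Simulate editing the script, then include it again in the same session.
write(path, "module DemoMod\nanswer() = 2\nend\n")
include(path)   # Julia warns that it is replacing module DemoMod
second_result = DemoMod.answer()

println((first_result, second_result))
```

The second `include` replaces the whole module, so the new definition is picked up without restarting Julia.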

But with “using MyModule”, this is not the case. When the script is changed, the Julia session has to be restarted to pick up the changes, and you then sit through the precompilation process. The Revise package is meant to minimize restarts, but it often fails to work (it is stated to have its limits). Even when Revise gets you through a session of script changes, the next time you start Julia and run “using MyModule”, precompilation takes place again (because of the changes you’ve made since the last precompilation).

And in my case, as the examples above show, precompilation doesn’t seem to save any time. It only takes more time than a simple “include”.

So am I missing something here?

Check your global scope: do you have lots of computation outside functions?

No global stuff, only a small number of consts in modules’ scopes. But why is it relevant here?

So, in this case it seems precompilation only pays off if you load the module at least 5 more times after the rebuild (about 6 loads in total) before making any change: the rebuild costs roughly 11 seconds more than “include”, and each cached load saves only about 2.3 seconds. When I tested this on one of my own modules, precompilation already paid off on the second load, so there seems to be a lot of variation.

Most users are likely to load a module more often than that between releases, but for developers it could be a bit of a hassle (although Revise mitigates this).
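
The break-even estimate can be checked with a few lines of Julia. This is only a sketch using the timings from the first post, rounded; the variable and function names are made up for illustration, and it assumes every load happens in a fresh session with no edits in between.

```julia
# Timings copied from the three sessions in the first post (seconds, rounded).
t_using_stale  = 29.97   # `using SeekSCC` when the cache had to be rebuilt
t_using_cached = 16.99   # `using SeekSCC` with an up-to-date cache
t_include      = 19.29   # `include("SeekSCC.jl")` in a fresh session

# Total load time over n fresh sessions between two edits of the module.
total_using(n)   = t_using_stale + (n - 1) * t_using_cached
total_include(n) = n * t_include

# Smallest n at which the precompiled route is no slower overall.
break_even = findfirst(n -> total_using(n) <= total_include(n), 1:100)
println(break_even)   # prints 6
```

With these numbers the precompiled route catches up at the sixth load, that is, the rebuild plus five cached loads.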

You could put __precompile__(false) at the top of your module while developing, and then remove it just before release.
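
For illustration, a minimal sketch of where that line goes. `MyModule` and `greet` are placeholder names, not taken from the original post.

```julia
# Contents of a hypothetical MyModule.jl during development.

# Opt this file out of precompilation. Remove this line before release
# so that users get a cached version of the module again.
__precompile__(false)

module MyModule

export greet

# Placeholder function standing in for the module's real code.
greet() = "hello from MyModule"

end # module
```

When the file is simply included or run as a script, the `__precompile__(false)` call has no effect; it only matters when Julia tries to build a cache file for the module.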

Personally, I run Pkg.test rather than using my module when developing. I’m not sure if that does precompilation or not.


There’s a performance regression in 1.0 where large union types (like StridedArray) cause very large using times. You might be hitting this.


I am afraid that is not applicable here. There is no union type in the script, and I had not heard of StridedArray before.

Use Revise.jl for this.
