Some clarity on modules and incremental compilation/loading times

Hi all,

I’ve read a bunch concerning modules and such, but haven’t managed to identify the best practice for my situation. My package currently has something like this:

module Example
include("file1.jl") # Some methods
include("file2.jl") # more methods

# Expensive compile-time cost. 
# Exposes `Example.Interface`
include("file3.jl") 
end

Most of the time my scripts just require `using Example`. Sometimes, though, I also need `using Example.Interface`.

Assuming nothing has changed in the module since the last compile, what I’d like is the quickest time-to-run for `Example`. With the setup listed above, I suspect that the submodule in `file3.jl` is still loaded by the `using Example` call. Is this true? I assume this means a time penalty?

If so, what are the options/standard patterns to separate these concerns? I’ve considered two possibilities, but don’t know which path to choose.

  1. Create a completely new package for `Interface`, which will call `using Example` to obtain the methods it requires.
  2. Don’t include `file3.jl` in `Example`. Make `Interface` a module in my package rather than a submodule.

If option 2 is the path I should go down, I don’t quite understand how to set up my environment so that it finds the `Interface` module when I’ve already called `using Example`.
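For reference, the layout described above can be reduced to a single-file sketch. The nested module stands in for `include("file3.jl")`, and `describe` is a hypothetical placeholder for the expensive code. It only illustrates that a submodule defined in the parent's body exists as soon as the parent is loaded; it says nothing about how large the time cost is in practice.

```julia
# Single-file stand-in for the package layout above.
module Example

# Stand-in for the methods from file1.jl and file2.jl.
do_stuff() = "methods from file1/file2"

# Stand-in for include("file3.jl"), which defines the submodule.
module Interface
# Hypothetical placeholder for the expensive code.
describe() = "Interface is loaded"
end

end

using .Example

# The submodule is already defined without any `using Example.Interface`.
println(isdefined(Example, :Interface))
println(Example.Interface.describe())
```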

Just to rule out simple things first: are you aware of Revise? If so, what makes it unfit in your situation?

My understanding is that Revise tracks code changes, reducing the need to restart. Whether I use it or not is perhaps beside the point of what I’m asking here. My workflow looks like this:

# script.jl
using Example
do_stuff()

  1. Start the Julia REPL
  2. ] activate .
  3. include("script.jl"), or includet("script.jl") with Revise loaded in startup.jl

Then I might have misunderstood your question.

If you use Revise, you can have a long-lived REPL in which you repeatedly include your script. And if you change something in Example.jl between two invocations, Revise will update only what is necessary, so that things are very fast (whether you use Example or Example.Interface).
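A minimal sketch of the `startup.jl` side of this workflow, assuming Revise is installed in the default environment and that the file lives at the usual `~/.julia/config/startup.jl` location:

```julia
# ~/.julia/config/startup.jl
# Assumes Revise is installed in the default (global) environment.
# The try/catch keeps the REPL usable if Revise fails to load.
try
    using Revise
catch err
    @warn "Could not load Revise" exception = (err, catch_backtrace())
end
```

With that in place, the session stays open: run `] activate .`, then `include("script.jl")`, edit the package sources, and call `include("script.jl")` again without restarting Julia.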

Isn’t this what you wanted?


I think that might actually be the case. TBH, I’ve never really managed to get Revise working properly, but I took another stab at it based on your suggestion. You’re right: development-wise, things are now super fast. That’s great.

So you’ve essentially solved my problem in this case (thanks!). Just for my understanding, though, I’d like to know which of the above patterns makes the most sense. Or is the solution in all cases “just use Revise”?

Regarding option 1: having the new package call `using Example` suggests that `Interface` depends on `Example`. If that is the case, then modifications in `Example` can (and probably will) impact `Interface`. Therefore, if the two are in separate packages, changes in `Example` will trigger a re-precompilation of `Interface` as well.

If you want to go this way, you might want to split your packages even more, with e.g. an `ExampleCore` package that contains only the functions needed by `Interface`. By depending only on this, `Interface` would need to be precompiled only when `ExampleCore` changes; the remaining part of `Example` (which would also depend on `ExampleCore`) could be modified without triggering that.
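A rough sketch of that dependency structure, written as three modules in one file so it can be run directly. In a real setup each would be its own package with its own `Project.toml`; the function names `scale`, `render` and `do_stuff(x)` are hypothetical placeholders.

```julia
# Single-file stand-in for three separate packages (hypothetical names).

module ExampleCore
# Only the functions that Interface needs live here.
export scale
scale(x) = 2x
end

module Interface
# Depends on ExampleCore only, not on Example.
using ..ExampleCore
render(x) = string("value = ", scale(x))
end

module Example
# The rest of the package also builds on ExampleCore.
using ..ExampleCore
do_stuff(x) = scale(x) + 1
end

println(Interface.render(3))
println(Example.do_stuff(3))
```

The point of the split is that nothing in `Interface` refers to `Example`, so edits to `Example` leave the dependency chain of `Interface` untouched.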

If inference really takes very long in Interface, and precompilation saves you lots of time, then that might be a viable option for cases where you restart the REPL often. But I’d bet that it would not save you as much time as Revise.

Option 2 is not really practical IMO, for the very reason that you mention: getting the environment to find `Interface`. AFAIK, doing this would require customizing `LOAD_PATH`, and there are very few good reasons to do this these days; I’m not sure your use case would be one.
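For illustration only, this is roughly what customizing `LOAD_PATH` involves. The sketch assumes a throwaway directory containing a hypothetical `Interface.jl` with a placeholder `greet` function; it is not a recommendation, just a way to see the mechanism.

```julia
# Create a temporary directory holding a standalone Interface.jl
# (hypothetical contents, for demonstration only).
dir = mktempdir()
write(joinpath(dir, "Interface.jl"), """
module Interface
greet() = "hello from Interface"
end
""")

# Adding the directory to LOAD_PATH lets `using Interface` find the file.
push!(LOAD_PATH, dir)

using Interface
println(Interface.greet())
```

Every script that needs `Interface` would have to repeat the `push!` (or rely on a modified startup file), which is part of why this route tends to be discouraged.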

And AFAIU you’d get precompilation only for the top-level module that has the same name as the package.


I’m quoting you a bit out of context here, but that pretty much summarizes what I think: always use Revise for development workflows.


Thanks a bunch. This has helped me understand quite a lot.