What if a package consisted of multiple top-level files?


#1

It seems to me that it might make sense to have a package consist of multiple (dozens?) related files that the user does not necessarily need all at once. What is the standard mechanism for dealing with this? (Is there one?)

So, let’s say the package is called P1. There is a top-level P1.jl file, but there are also, say, three other files. As a user, when I want to use some function, I may need only one of these three. Presumably there is a module in that file. How do I load that module for import? Do I have to use Pkg.dir()?


#2

Namespaces, i.e. using inner modules.


#3

Inner modules would require though that the entire file that defines the enclosing module will be loaded, correct? That is something I want to avoid.

We are all familiar with packages that take a long time to load. I want to make it possible to construct lean-and-mean subsets of the package.


#4

Specifying what is getting loaded up front allows for precompilation. This will likely be a much larger performance increase than trying to “hot load” new code. For example, DataFrames loads ~10x faster with precompilation than without.


#5

So are you saying that no matter how big the package gets pre-compilation will take care of the load time?
Isn’t there a price that one has to pay in terms of the resources consumed that are involved in loading a huge package?


#6

This question seems to tie in with the question about package sizes:

Maybe what you want is for the package to be split into smaller packages, to simplify it.


#7

Yes, indeed. But at the same time I want to keep the relationship of the “sub”-packages intact. Right now I have the package split up into dozens of modules. These, however, all get loaded at once whenever I say using MyPackage, whether I need them or not.

So really what I’m looking for is some way of having multiple packages under the umbrella of one top package. Does anyone know if there is a beast like this around already?


#8

Julia is JIT compiled and the cost of compiling functions is typically much larger than just “loading” the code. It is unlikely that you will beat the 10-50x performance boost of precompilation, while keeping the same functionality.

Your package https://github.com/PetrKryslUCSD/FinEtools.jl is currently running with precompilation off. Enabling it makes load time go from 5 seconds to 0.08 seconds. What part of those 0.08 seconds are you trying to improve, and do you think it is relevant compared to other things, like compilation time when functions are actually called?


#9

Have you tried something like this?

module stuff
    module here
        export x
        x = 1
    end
end

Then you can do using stuff without it loading the here module. If you need to load that too, you can execute using stuff.here and then that will be loaded. Is that what you want?


#10

Maybe I’m not understanding, but this is what I get:

julia> module stuff
           module here
               export x
               x = 1
           end
       end
stuff

julia> using stuff

julia> stuff.
eval here
julia> stuff.here
stuff.here

julia> stuff.here.x
1

I.e., there is no need to do using stuff.here. It is already there when I do using stuff.


#11

This is what I get:

julia> module stuff
           module here
               export x
               x = 1
           end
       end
stuff

julia> using stuff

julia> x
ERROR: UndefVarError: x not defined
Stacktrace:
 [1] macro expansion at ./REPL.jl:97 [inlined]
 [2] (::Base.REPL.##1#2{Base.REPL.REPLBackend})() at ./event.jl:73

julia> using stuff.here

julia> x
1

So you have to call using stuff.here in order for the names exported by the here module to be brought into Main.

Of course, you can always do stuff.here.x no matter what, but that is independent of using.
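To make the distinction concrete, here is a minimal sketch written for Julia 1.x, where a module defined in the current session is referenced with a leading dot (using .Stuff rather than using stuff as in the transcripts above). The capitalized names Stuff and Here and the println call are only for illustration. It suggests that the inner module is evaluated as soon as the outer module is defined, and that using only affects which names are visible without qualification:

```julia
module Stuff
    module Here
        export x
        x = 1
        # Runs while Stuff is being defined, before any `using`.
        println("Here has been evaluated")
    end
end

# The inner module and its contents already exist at this point.
@assert isdefined(Stuff, :Here)
@assert Stuff.Here.x == 1

# `using` only brings the exported name x into the current namespace.
using .Stuff.Here
@assert x == 1
```

So nesting modules gives separate namespaces, but it does not defer loading.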


#12

But if you can do stuff.here.x then presumably that code is already loaded?


#13

Oh okay, well if you want to have code in your package that doesn’t get loaded, then you will have to keep it in a separate file, like src/extra.jl, that the main module does not include. Then you would have to do cd(Pkg.dir("Name")) and then include("src/extra.jl") to load that code.

However, you’re probably better off using precompilation.


#14

That is a good point. However my package has around five thousand lines, give or take. What if I managed to rewrite Nastran with its five million lines of code? Are we then talking about eighty seconds to load the package?


#15

No, then you would use something like https://github.com/JuliaComputing/static-julia to AoT compile your code into a binary.


#16

That is a good idea. Thanks!


#17

Another thing you could do is put your extra module ExtraModule into mod/ExtraModule.jl and then, in your main module file PkgName.jl, add a line like this:

push!(LOAD_PATH, joinpath(Pkg.dir("PkgName"), "mod"))

Then you will be able to do using ExtraModule and it will find the code for it in that folder.
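For illustration only, here is a self-contained sketch of the same idea that should run on Julia 1.x. A temporary directory stands in for the package's mod/ folder, and greet is a made-up function, so none of this is taken from a real package:

```julia
# Assumption: a throwaway directory plays the role of PkgName/mod.
moddir = mktempdir()

# Write a hypothetical ExtraModule.jl into that directory.
write(joinpath(moddir, "ExtraModule.jl"), """
module ExtraModule
    export greet
    greet() = "hello from ExtraModule"
end
""")

# Make the directory searchable by `using` and `import`.
push!(LOAD_PATH, moddir)

# ExtraModule.jl is now found by name, like a top-level package.
using ExtraModule
@assert greet() == "hello from ExtraModule"
```

Note that the module is looked up by its bare name, exactly as an installed package would be.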


#18

That is true. Neat! Thanks.


#19

Pkg.dir is not ideal to use like this because it will fail if the package is installed somewhere other than Pkg.dir. Use a path relative to the source file instead, like joinpath(@__DIR__, "..", "mod").

Also, relying on user controlled global behavior like LOAD_PATH is quite fragile and possibly confusing for users.
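A rough sketch of the relative-path approach, assuming Julia 1.x. The file layout is created in a temporary directory so the snippet is self-contained, and P1.load_extra, Extra and helper are hypothetical names, not part of any existing package:

```julia
# Assumption: a throwaway directory stands in for the package's src/ folder.
src = mktempdir()

# Optional code that should not be evaluated when P1 is first loaded.
write(joinpath(src, "extra.jl"), """
module Extra
    export helper
    helper() = "loaded on demand"
end
""")

# The main module file. @__DIR__ expands to the directory containing P1.jl,
# so the path stays valid wherever the package happens to be installed.
write(joinpath(src, "P1.jl"), """
module P1
    # Hypothetical helper: evaluates extra.jl inside P1 only when called.
    load_extra() = include(joinpath(@__DIR__, "extra.jl"))
end
""")

include(joinpath(src, "P1.jl"))
@assert !isdefined(P1, :Extra)   # nothing from extra.jl has run yet

P1.load_extra()
@assert isdefined(P1, :Extra)
@assert P1.Extra.helper() == "loaded on demand"
```

Keep in mind that evaluating new code into a module at run time does not combine well with precompilation, so this is a sketch of the path handling rather than a recommended design.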


#20

True. At that point it probably makes sense to just split it into multiple separate packages anyway, since you’d have to worry about ExtraModule possibly conflicting with another package name. Used that way, it is essentially another package anyway.