Following my discussion in the thread below, I have been trying out two approaches to organising a Julia project:
1. Creating files as modules and “using” them in other files
2. Creating them as individual files and “include”-ing them in other files
module MyModule
using Dates
using DataFrames
function MyModuleMethod()
    println("some method")
end
export MyModuleMethod
end
module MyModule2
using MyModule
function MyModuleFunc2()
    println("MyModuleFunc2 called")
end
export MyModuleFunc2
end
module MyRunner
using MyModule
using MyModule2
function __init__()
    println("in init")
end
function main2()
    MyModuleMethod()
end
export main2
end
runner.jl
baseDir = @__DIR__
@info "Starting in $baseDir"
cd(baseDir)
push!(LOAD_PATH, pwd())
using MyRunner
println("starting")
main2()
println("done")
This approach (1) seems considerably slower than approach 2. While looking into the reason, I noticed that it was spinning up multiple Julia processes, using up a lot of memory on my machine. The second approach works remarkably well. It’s faster and doesn’t spin up multiple processes.
My machine:
Apple MacBook Pro M1 Max
Julia: 1.7.1
I am going with approach (2) for now.
I would like to know where I went wrong with approach (1).
It seems that modules carry a memory overhead (along with a compilation overhead). That’s a bit of a shame because I am trying to port a massive project to Julia. While I do this, I am also learning Julia. Revise.jl is excellent because it recompiles changes to modules, so separate small modules seemed like a great idea. Now, with approach (2), I have a single module, which takes a long time to compile. I must point out that subsequent compiles are pretty fast.
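For reference, approach (2) as described above might look like the following minimal sketch. The module name MyProject and the file names are hypothetical, and the files are written to a temporary directory only so that the snippet runs on its own; in a real project they would sit next to the module file.

```julia
# Stand-ins for two plain source files (no `module` wrapper in either one).
const SRC_DIR = mktempdir()

write(joinpath(SRC_DIR, "my_module.jl"), """
function MyModuleMethod()
    println("some method")
end
""")

write(joinpath(SRC_DIR, "my_module2.jl"), """
function MyModuleFunc2()
    println("MyModuleFunc2 called")
end
""")

# One enclosing module includes both files, so all of the code is
# loaded as a single unit under a single namespace.
module MyProject

include(joinpath(Main.SRC_DIR, "my_module.jl"))
include(joinpath(Main.SRC_DIR, "my_module2.jl"))

export MyModuleMethod, MyModuleFunc2

end # module MyProject

using .MyProject

MyModuleMethod()
MyModuleFunc2()
```

Because everything lives in one module, a change to any included file affects the same compilation unit, which fits the long first compile mentioned above.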
Yeah, I have been playing around and indeed, my code doesn’t take much time. I still don’t know what caused the multiple Julia processes to spawn, though.
Petr mentioned it earlier: precompilation. Julia is a compiled language in normal usage. When you change a source file on the LOAD_PATH, Julia invalidates the compilation cache and precompiles again. Note that this precompilation step consists of parsing and type inference, not native code generation. As of Julia 1.6, precompilation can also run in parallel, which would explain the multiple Julia processes.
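If the extra processes are a concern, the number of parallel precompilation tasks can be capped. This is a minimal sketch, assuming the packages are loaded through a Project.toml environment so that Pkg handles precompilation. The JULIA_NUM_PRECOMPILE_TASKS environment variable is read by Pkg; the value 1 is only an illustration, and it is probably safest to set the variable before starting Julia rather than inside the session.

```julia
using Pkg

# Limit Pkg's parallel precompilation to a single task (illustrative value).
ENV["JULIA_NUM_PRECOMPILE_TASKS"] = "1"

# Precompile every dependency of the active environment.
Pkg.precompile()
```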
Looking through both this thread and the other thread, I’m not sure if the importance of Project.toml has been fully appreciated. In particular, the UUID is really important in terms of precompilation and caching. By modifying the LOAD_PATH you are bypassing these normal mechanisms of code loading.
In the previous thread you mentioned Java. In Java, a class file is the basic compilation unit. The name of the class is qualified by its Java package. In Julia, a package is the basic compilation unit. It is qualified by its UUID in Project.toml. If there are two package modules of the same name, Julia’s package manager can distinguish them via their UUID.
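The role of the UUID can be sketched with Base.PkgId, the type Julia's code loading uses to identify a package by name and UUID. This is only an illustration: the UUIDs below are randomly generated rather than taken from real packages, and Base.PkgId is an internal type, not part of the public API.

```julia
using UUIDs

# Two hypothetical packages that happen to share the name "MyModule".
pkg_a = Base.PkgId(uuid4(), "MyModule")
pkg_b = Base.PkgId(uuid4(), "MyModule")

println(pkg_a.name == pkg_b.name)  # compares the names only
println(pkg_a == pkg_b)            # compares name and UUID together
```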
What I have not seen from you above are any Project.toml files. What I have seen is LOAD_PATH hacking. Do away with the LOAD_PATH hacking. Make each of your modules into true packages, which can be compiled and cached separately.
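As a rough sketch of what that could look like, the Pkg API can generate the package skeletons and wire up the dependencies. The scratch directory is an assumption made only to keep the example self-contained, and the generated src files contain placeholder code that would be replaced with the real module bodies.

```julia
using Pkg

root = mktempdir()  # scratch directory, used here only so the sketch is self-contained

cd(root) do
    # Each call creates <name>/Project.toml (with a freshly generated UUID)
    # and <name>/src/<name>.jl.
    Pkg.generate("MyModule")
    Pkg.generate("MyModule2")
    Pkg.generate("MyRunner")
end

# Record MyModule as a dependency of MyModule2.
Pkg.activate(joinpath(root, "MyModule2"))
Pkg.develop(path=joinpath(root, "MyModule"))

# MyRunner depends on both packages.
Pkg.activate(joinpath(root, "MyRunner"))
Pkg.develop(path=joinpath(root, "MyModule"))
Pkg.develop(path=joinpath(root, "MyModule2"))

# With MyRunner's environment active, no LOAD_PATH changes are needed.
using MyRunner
```

With this layout each package has its own Project.toml and UUID, so Julia can cache its precompiled form separately, and a change to one package should only invalidate that package and the ones depending on it.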
@mkitti thanks for your response mate. I am sorry, I am a newbie to Julia. I have since read about the Project.toml and see what you mean. I will give it another shot and come back to this thread.