I think Optim.jl is doing a great job of using descriptive filenames. In fact I think it can be very helpful for understanding a large package that there is a list of files like that, somewhat abstractly dividing the package into concepts like multivariate/solvers/first_order/l_bfgs.jl.
But even in worse organized packages, :vimgrep /function some_func/ src/** normally gets me where I need to go.
But how about a function with a lot of methods, such as solve or the like? How do you then locate the exact definition that gets called for a specific dispatch?
Probably this was mentioned before somewhere (like possibly everything else in this post), but explicitly listing the imports wouldn’t help much:
julia> module A
abstract type T end
foo(x::T, y::T, z::T) = "A"
end
Main.A
julia> module B
import ..A
struct SB <: A.T
end
A.foo(x::A.T, y::SB, z::A.T) = "B"
end
Main.B
julia> module C
import ..A
import ..B
struct SC <: A.T
end
A.foo(x::A.T, y::B.SB, z::SC) = "C"
end
Main.C
julia> import .A: foo
julia> import .B: SB
julia> import .C: SC
julia> foo(SC(), SC(), SB())
"A"
julia> foo(SC(), SB(), SB())
"B"
julia> foo(SB(), SB(), SC())
"C"
It’s even worse if foo is called within a generic function, where one doesn’t even know the argument types. Also, C might add its specialization at any point in time after the parent module was written.
Code navigation and discoverability is harder in Julia than in OOP languages. But I doubt that listing all imports in every single file would make this enough easier to be worth the additional effort in Julia. I would prefer improved tooling like @less to help with this.
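For what it’s worth, the built-in reflection tools already go a long way here. Continuing the foo example above in the REPL (where InteractiveUtils is loaded by default), something like this answers the “which method handles this call” question:
julia> methods(foo)                   # list every method of foo with its file and line
julia> @which foo(SB(), SB(), SC())   # show the method this particular call dispatches to
julia> @edit foo(SB(), SB(), SC())    # open that method’s definition in your editor
@less works the same way but pages the source instead of opening an editor.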
I also don’t think that a file should be a meaningful unit of dependency. In Python, I have occasionally used comments to mark different “sections” of a module. I like that in Julia I can split such “sections” into multiple files and include them in a master file that can serve as a “table of contents” (in my opinion it doesn’t always make sense to split internal, highly specialized code into submodules that are only ever used/useful in the parent module).
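As a minimal sketch of that pattern (the file and module names here are made up), the master file simply lists the includes in dependency order:
module MyPackage
include("types.jl")    # core type definitions
include("utils.jl")    # small internal helpers, uses types.jl
include("solvers.jl")  # solver code, uses both of the above
end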
Let us not forget that a programmer today has a lot of technology to help with understanding code. For instance I use Sublime Text editor a lot. It has fantastic indexing tools. Include source files / folders in a project, and one can just point at a symbol to find where it is defined, where it is used, what sort of object it is, and so on.
You go through the files manually? Usually a search works fine, even a simple (rip)grep or GitHub search. I don’t see how another module system would avoid this. The only way would be to be explicit when using any variable, struct, or method, which would be verbose.
For me the main problem is that the use of modules in an application (which is not a package) is not well supported. In my current project I was first using the style:
# include and load the local module only once
if !(@isdefined KCU_Sim)
    include("KCU_Sim.jl")
    using .KCU_Sim
end
just to make vscode happy. Ugly.
Now I just added my source directory to the module search path, so I can say
using KCU_Sim
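For reference, one way to do that (the path below is illustrative) is to push the directory onto LOAD_PATH at the top of the entry script or in ~/.julia/config/startup.jl; setting the JULIA_LOAD_PATH environment variable works as well:
push!(LOAD_PATH, joinpath(@__DIR__, "src"))  # make modules in src/ findable
using KCU_Sim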
Advantage: precompiling works, startup time is acceptable.
Disadvantage: “goto definition” etc. in vscode is no longer working.
So there IS room for improvement, both in Julia and in vscode …
Ahh, it finally clicked for me! Thanks for your patience (and the patience of the other experts in this thread). It took me an embarrassingly long time to stop conflating files and namespaces subconsciously.
Tracing the dependency DAG is a matter of tooling, whether the module is all in one file or not - just needs smarter tooling in the latter case. I also changed my mind on the value of using submodules as a method to signpost the dependency flow. Either there is redundancy (keeping the name of something the same in both the inner and outer module and needing to say so on both sides of the boundary using explicit export/import statements) or confusion (if the name gets changed going across the boundary).
The guarantee that include gives when reading a package (assuming that the package isn’t triggering redefinition warnings) is that a name means the same thing in any file – because they’re all in the same namespace. The warning mechanism is the key, though.
In my understanding this covers only some of the things modules can be. Here’s a list of what modules in Java 9 aim for:
According to JSR 376, the key goals of modularizing the Java SE platform are
Reliable configuration—Modularity provides mechanisms for explicitly declaring dependencies between modules in a manner that’s recognized both at compile time and execution time. The system can walk through these dependencies to determine the subset of all modules required to support your app.
Strong encapsulation—The packages in a module are accessible to other modules only if the module explicitly exports them. Even then, another module cannot use those packages unless it explicitly states that it requires the other module’s capabilities. This improves platform security because fewer classes are accessible to potential attackers. You may find that considering modularity helps you come up with cleaner, more logical designs.
Scalable Java platform—Previously, the Java platform was a monolith consisting of a massive number of packages, making it challenging to develop, maintain and evolve. It couldn’t be easily subsetted. The platform is now modularized into 95 modules (this number might change as Java evolves). You can create custom runtimes consisting of only modules you need for your apps or the devices you’re targeting. For example, if a device does not support GUIs, you could create a runtime that does not include the GUI modules, significantly reducing the runtime’s size.
Greater platform integrity—Before Java 9, it was possible to use many classes in the platform that were not meant for use by an app’s classes. With strong encapsulation, these internal APIs are truly encapsulated and hidden from apps using the platform. This can make migrating legacy code to modularized Java 9 problematic if your code depends on internal APIs.
Improved performance—The JVM uses various optimization techniques to improve application performance. JSR 376 indicates that these techniques are more effective when it’s known in advance that required types are located only in specific modules.
Some of these are not relevant, but it would be beneficial to be able to understand dependencies between modules better. It would also be useful to be able to create a module, add packages to it, and have some assurance that the packages you add don’t mess with code that you didn’t want changed.
Even better, I use parametric include statements. I (ab)use a file containing a declaration of a struct with default values (courtesy of Parameters.jl) as a parameter file for my simulation, and the location of that file can be given on the command line. I am aware that in most contexts this would be a terrible idea, but for this particular use case I find it quite handy.
Edit: Oh and I forgot, I do the same thing with “scenario” files, which change some aspects of the simulation and are loaded on demand. Basically like a plugin system, but really simplistic.
Nice, I didn’t know about that one, I’ll look into it. In any case I could also just read parameters from a json file or something like that.
The one advantage of doing it the way I’m doing it - besides being dead simple - is that I have a single point of definition for parameters, their default values and their documentation (I’m using the doc strings of the struct members to generate help text for the command line) that immediately results in a runnable model. Plus, no additional syntax for the parameter file.
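Roughly, the pattern looks like this (file name, struct name and fields are invented for illustration): the parameter file defines a keyword struct with defaults, and the main script includes whichever file is named on the command line.
# params.jl – the default parameter file
using Parameters
@with_kw struct SimParams
    v_wind::Float64 = 9.5   # nominal wind speed [m/s]
    segments::Int   = 6     # number of tether segments
end
# main script
include(isempty(ARGS) ? "params.jl" : ARGS[1])
p = SimParams()             # defaults from whichever parameter file was loaded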
ModelParameters is for working with a single point of definition for those things, but it automates the struct reconstruction instead of including code. Parameters, labels, priors, bounds, and whatever else you need can sit on rows of a CSV and be attached to model parameters.
There’s no syntax except defining the (potentially nested) structs with Param wrappers where the parameters should go. You can use Parameters.jl or Base.@kwdef to set those defaults, similar to how you are now.
It’s not always that someone can’t think outside the python box. It can be that they understand both Julia and Python to some degree and see some language-feature that one has and the other doesn’t. Or even if both languages have the feature, they see an advantage in borrowing a bit from cultural practices.
I learned Python after Julia and have far more experience in Julia. But, after having significant experience in Python, I started more often encapsulating code within a package to avoid that “A.jl and B.jl implicitly depend upon utils.jl”. I’m using Julia’s existing module system at the moment. For some projects, most of the code in each file is used only within the file. So, I tend to wrap the code in a module. I often don’t even import symbols from these modules, but rather just the module name, and then fully qualify the symbol at the call site. Knowing where stuff comes from helps me remember how I organized the code. And making the dependencies explicit will make the code more maintainable as it scales up.
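A tiny sketch of that style (module and function names are made up), with only the module name imported and every use qualified at the call site:
module Utils
normalize(x) = x ./ sum(x)
end
import .Utils                   # brings only the name Utils into scope, no symbols
w = Utils.normalize([1, 2, 3])  # the origin of normalize is visible at the call site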
There are many Julia practices that I try to introduce in the Python code. And a few in the other direction. I should add that I am not a Python fan-person. I absolutely hate writing scientific software in Python, because it is fundamentally so ill-suited for that task.
EDIT: I also understand the desire to push back against extraordinarily-difficult-to-defend opinions, like: Julia doesn’t have a proper module system, or Julia was designed to be used only from Jupyter notebooks.
Yeah I wasn’t suggesting it was a limitation of python people not thinking outside the box, more that learning a language gives you a tendency to look for familiar structures - and as you say sometimes that may be a good thing. I know I avoid modules even when I reach the complexity where they would help.
Personally I wrote more Ruby, C++ (and ugggh PHP) than Python, and that experience makes Julia’s include seem normal and obvious. I don’t think about doing any of the things people are worried about here.
The question of this thread (if I understood the OP) is probably:
Why doesn’t Julia solve some problems (for example isolation and de-duplication) at the module level?
In other words: what is the benefit of modules if they don’t solve that automatically and still require the burden of the include mechanism?
I am not trying to criticize Julia’s paradigm! I am just curious. It would be nice and useful to see the ideas behind the language design.
What I wanted to say with my reaction to Raf is that C++ designers (and not only them) probably don’t like the include machinery very much.