Implicitly loaded modules in the future?

But why do I have to distinguish between development and production? These things always go in a cycle, so I will have to keep switching my includes around, which is a nuisance and a source of errors.

1 Like

Then keep it in the top package source. I do. E.g. MeshCore.jl/MeshCore.jl at master · PetrKryslUCSD/MeshCore.jl · GitHub
Edit: better example (abstract + concrete) in FinEtoolsDeforNonlinear.jl/FinEtoolsDeforNonlinear.jl at master · PetrKryslUCSD/FinEtoolsDeforNonlinear.jl · GitHub

More abstractly speaking, the point is that code often has the structure of a directed acyclic graph, and the Julia module / include() system does not allow me to reflect that structure. I believe keeping that structure around would be useful throughout the life cycle of a piece of software, and putting all the includes in the “entry file” of a package does not do that.

Of course, this doesn’t mean that the current system is unusable. I’m just expressing my belief that there might be some room for improvement (though I have no idea what that improvement might look like).

8 Likes

I also think include is not a great way to organize code and one of Julia’s weak spots. I compared it to Go in another thread:

In Go, you can’t include files: you can import packages and one package = one directory and each file has an explicit list of imported packages at the top, and importing names in the main namespace (like using in Julia) is strongly discouraged. In Julia, files can include files can include files… Open a file, you don’t know in which context it is included so you can’t be sure how it’s going to be interpreted. Maybe it’s even included several times in different contexts? Maybe the behavior depends on the order of includes?

Another way to put it: declarative (rather than imperative) code is generally considered more robust or at least easier to analyze and verify. In Go, you write imperative code in functions, but the overall organization (definition of packages and functions) is declarative. In Julia, this organization itself is imperative.
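A minimal sketch of the context problem described above. The file helper.jl and the modules ContextA and ContextB are made up for illustration, and the file is written to a temporary directory only so that the snippet runs on its own. The same file means different things depending on where it is included:

```julia
# Hypothetical helper file; it uses a name (SCALE) that it does not define itself.
ctx_dir = mktempdir()
helper_path = joinpath(ctx_dir, "helper.jl")
write(helper_path, "scaled(x) = SCALE * x\n")

module ContextA
const SCALE = 1
end

module ContextB
const SCALE = 1000
end

# Base.include(mod, path) evaluates the file inside the given module.
Base.include(ContextA, helper_path)
Base.include(ContextB, helper_path)

println(ContextA.scaled(2))  # 2
println(ContextB.scaled(2))  # 2000
```

Reading helper.jl on its own gives no hint of which SCALE it will see.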

Note that Jeff voiced some agreement 🙂

8 Likes

^ I completely agree, and this is the key point IMO.

Open a file, you don’t know in which context it is included so you can’t be sure how it’s going to be interpreted. Maybe it’s even included several times in different contexts? Maybe the behavior depends on the order of includes?

And problems like this are the result.

It’s actually pretty straightforward to demonstrate genuine spooky-action-at-a-distance as a result of the typical include() pattern.
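One way this could be sketched, assuming two hypothetical files a.jl and b.jl that end up in the same module (SpookyDemo is a made-up name, and the files are written to a temporary directory so the snippet is self-contained). Including b.jl changes what a function defined in a.jl returns, even though a.jl is never touched:

```julia
spooky_dir = mktempdir()
a_path = joinpath(spooky_dir, "a.jl")
b_path = joinpath(spooky_dir, "b.jl")

# a.jl defines a generic method and a function that calls it.
write(a_path, """
process(x) = :generic
run_a() = process(1)
""")

# b.jl adds a more specific method for the same function.
write(b_path, """
process(x::Int) = :specialised
""")

module SpookyDemo end

Base.include(SpookyDemo, a_path)
before = SpookyDemo.run_a()

Base.include(SpookyDemo, b_path)
after = SpookyDemo.run_a()

println(before, " -> ", after)  # generic -> specialised
```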

I’ll reiterate the mention of FromFile.jl from earlier in the thread – this package introduces a custom import system that fixes all these issues. Never write an include() again.

12 Likes

What if modules behaved like Pluto notebooks (what you see is what you get)? That could even be the place for a macro.

This looks interesting as a solution to the modules issue. But it would be better if something similar was a standard language feature because if developers start creating special packages to fix the language then we are inventing a special dialect which a regular Julia developer will find hard to follow.

My opinion on the meta-programming features of Julia is that they can certainly be useful for making code more compact, but at the same time they can lead to a dialect that most people will not be able to understand until they read all the associated documentation.

It seems to me that requiring users to care about files is counterproductive. If a user needs some functionality, it should be in a package, and no include is needed. The developer is free to organize things into files, based on the DAG representation of the code (in the developer’s head), but why would a macro be needed for that? Just decide what the physical representation in files should look like, and then use import. In my mind files and modules are independent concepts. I could have organized my package as a single giant file, and the modules would work just as well as when I keep it to one module per file.

2 Likes

Isn’t the solution here to use dev appropriately? The cost is that you have to worry about multiple Project.toml files, but maybe that should be viewed as a good thing because it clarifies the DAG?

shell> tree
.
├── dev
│   ├── Sub1
│   │   ├── Project.toml
│   │   └── src
│   │       └── Sub1.jl
│   └── Sub2
│       ├── Manifest.toml
│       ├── Project.toml
│       └── src
│           └── Sub2.jl
├── Manifest.toml
├── Project.toml
└── src
    └── Test.jl

6 directories, 8 files

It doesn’t solve the problem of “what’s the context this file is evaluated in right now”. But I’m not sure that particular problem bothers me that much because it’s just struct definitions that mess that up and I appreciate the flexibility of splitting code across multiple files.

6 Likes

This approach doesn’t scale as well. An include()/package-based approach is:

  • Heavyweight, requiring the creation of a package at every branch point in the code dependency DAG. That’s a lot of overhead just for some helper file.
  • Requires a developer to look up O(n) code in the size of the codebase n to track the dependencies of any individual piece of code. This is as opposed to just O(log n) in a well-designed file-based approach.
  • Implicit. Within each package, dependency DAGs are held only in developer’s heads, instead of being explicitly written down. Harder to on-board people / easier to make mistakes.
  • Open to spooky-action-at-a-distance. It is possible to construct natural examples in which changes in one set of files will affect method resolution in unrelated parts of the code.
8 Likes

A beginner’s perspective here.
The only thing I’m missing in the language is the ability to easily import local modules (a small module I have somewhere).
Now you have to include("MyModule.jl") and using .MyModule, which has the drawbacks of include, or generate a package. Even though packages are very lightweight, I still think it’s a hassle to generate a package and then add or dev it in the environment. FromFile seems to offer a nice solution for that.
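For reference, the include-plus-using pattern looks roughly like this. The contents of MyModule.jl and the function square are made up for illustration, and the file is written to a temporary directory only so that the snippet runs on its own:

```julia
mod_dir = mktempdir()
mod_path = joinpath(mod_dir, "MyModule.jl")

# Hypothetical contents of a small local module file.
write(mod_path, """
module MyModule
export square
square(x) = x^2
end
""")

include(mod_path)   # evaluates the file in the current module
using .MyModule     # the leading dot means "relative to the current module"

println(square(3))  # 9
```

Including the same file from a second module would define a second, separate copy of MyModule there, which is one of the drawbacks of include.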

It would give extra flexibility to be able to make a Module/namespace while working on a scientific project for instance.
Now I kind of understand how to use dev/add and include more or less properly, but it took me a lot of reading, scrolling through discourse, reading and rereading the manual…
It is definitely not straightforward for beginners, nor, it seems, for more experienced programmers, considering the number of posts created on this subject.
I think beginners would very much appreciate if there was a super easy way to call modules.

4 Likes

That’s not entirely why this happens. It partly is, but that’s only half the story. The key fact here is that, unlike in static languages, there is no separate “type context”: the right-hand side of a type annotation is simply an expression like any other, which is evaluated when the definition is evaluated. In this case the right-hand side “looks like” a type, M, but it could just as easily be an arbitrary expression like identity(M) (which of course evaluates to M) or something even more complex like cond() ? M : O. This lack of distinction between the “type language” and the real language is very powerful and allows people to do very nifty things without needing additional language features (just use the normal language features you already know), but it does mean that when you’re evaluating f(x::M) = x.m you need to already know what M is, since otherwise you can’t evaluate it as an expression.
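A small sketch of that point. The struct O, the function cond, and the method names g, h and k are placeholders added for illustration; only M and f(x::M) = x.m come from the discussion:

```julia
struct M
    m::Int
end

struct O end

cond() = true

# The right-hand side of :: is an ordinary expression,
# evaluated at the moment the method is defined.
f(x::M) = x.m
g(x::identity(M)) = x.m
h(x::(cond() ? M : O)) = x.m

println(f(M(1)), " ", g(M(2)), " ", h(M(3)))  # 1 2 3

# Using a name that does not exist yet fails at definition time,
# because the annotation cannot be evaluated.
err = try
    @eval k(x::NotDefinedYet) = x
    nothing
catch e
    e
end
println(err isa UndefVarError)  # true
```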

How does this work in static languages? Well, there’s only certain things that can appear on the right hand side of :: — i.e. names of types (not expressions that evaluate to types, but specifically names of types). So you can assume that M is the name of a type even if it hasn’t been defined yet and you can wait until the type gets defined to evaluate the definition.

This isn’t going to change and isn’t related to modules at all, so is a bit of a digression from the primary subject of this thread.

23 Likes

The way forward here is a finalized design and implementation for #4600. We were making some progress towards that for a while with a good productive discussion. The last post which got a lot of likes was by @patrick-kidger. However, I find the design there problematic in a few different ways:

  1. This may be superficial, but the leading from blah import syntax that’s proposed is far too Python-influenced and doesn’t fit with how imports work in Julia, which is import|using followed by an identifier for the module to import, followed by the names to import.

  2. It has way too much flexibility and too many features: the ability to specify a file name and one or more modules and multiple names to import is way over the top. Imports are already an aspect of the language with too much surface area and too many variations, which we want to reduce, not increase even further. A proposal with this many variations is not going to fly.

I got kind of fatigued by that discussion but if these issues can be addressed and we can make some forward progress then we could get somewhere, which would be good.

22 Likes

First of all, I would like to reiterate that function definitions are declarative, not imperative. You can define functions in any order you like:

julia> f() = g();

julia> g() = a;

julia> const a = 1;

julia> f()
1

What’s not declarative is struct definitions. You can’t refer to a type that hasn’t been defined yet. Suppose for a moment that Julia handled modules like Python does. Suppose I have three types with this DAG: A → C ← B, and suppose my modules look like this:

module Mod1

export A

struct A
    x::Int
end

end

module Mod2

export B

struct B
    x::Float64
end

end

module Mod3

using .Mod1
using .Mod2
export C

struct C
    a::A
    b::B
end

end

Ok, great. Now Julia will automatically figure out the correct order to load code, so I don’t have to use include().

But what if I change Mod3 to this?

module Mod3

using .Mod1
using .Mod2
export C

struct D
    c::C
end

struct C
    a::A
    b::B
end

end

Bam, now I get ERROR: UndefVarError: C not defined. The order of defining structs still matters inside a module! Maybe I should put every single struct definition into its own module, but that seems absurd. Besides, it’s just as easy to mess up the import statements as it is to mess up the order of definition of the structs, so you would just be trading one kind of runtime error for another kind of runtime error.

The bottom line is that struct definition order matters. Having declarative module dependencies with automatic code loading neither changes nor alleviates that.
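The same effect can be reproduced in today’s Julia without any modules. This is only a sketch, and the names Outer and Inner are made up for illustration:

```julia
# Defining Outer before Inner fails, because the field type
# is evaluated as soon as the struct is defined.
struct_err = try
    @eval struct Outer
        inner::Inner
    end
    nothing
catch e
    e
end
println(struct_err isa UndefVarError)  # true

# In dependency order, the same definitions are accepted.
struct Inner
    a::Int
end

struct Outer
    inner::Inner
end

println(Outer(Inner(1)).inner.a)  # 1
```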

7 Likes

I’m definitely glad to hear that there’s interest in making this happen if these issues can be addressed.

I think it should be enough to just switch the syntax from
from "file.jl" import obj
to
import "file.jl": obj

Which I think would fix both issues.

4 Likes

Take a look at DataFrames.jl. There are tons of files included, some of which define types. If they were executed out of order that would cause problems.

But on the other hand, if each file had to list its dependencies on other files, that would be a cure worse than the disease. Every refactor would entail updating every other file. Each file would have to start with dozens of headers. Keeping track of that DAG would be a nightmare.

For me, the idea that Main is just another module, with everything living together inside modules, is fine, because it’s a simple rule that is easy to understand, even if some of the behavior resulting from that rule is unintuitive.

6 Likes

I fail to see how throwing files into the mix will reduce the cognitive load!?

1 Like

I understand your arguments! I appreciate your points, but I’m afraid I don’t reach the same conclusion.

Fundamentally, if the dependency DAG is a “nightmare” then that’s probably an indication that things need organising better, regardless of your choice of import/include framework. Having an implicit nightmare DAG sounds like a worst-of-both-worlds when it comes to writing bug-free/readable/multi-developer code.

7 Likes

It’s literally the exact opposite. The reason why include is heavyweight is because you have to write down the dependency chain. The other approach is to have spooky-action-at-a-distance resolve it for you, hopefully in the right order, where, by virtue of file names and magic, things just exist that you would otherwise have to set up by hand. That’s the natural engineering trade-off between being explicit about what code is included and being implicit.

The include approach is very explicit about exactly what code is executed and in which order, while the other approaches are implicit and “work it out”. If two of the submodules define the same global variable, what value does it end up with? There’s no explicit ordering: you’re at the mercy of FromFile.jl magic to define the ordering for you.
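A sketch of the explicit-ordering half of that argument, assuming two hypothetical files first.jl and second.jl that both assign the same global (the files are written to a temporary directory so the snippet is self-contained, and OrderAB and OrderBA are made-up module names). With include, the written order decides which value survives:

```julia
order_dir = mktempdir()
first_path = joinpath(order_dir, "first.jl")
second_path = joinpath(order_dir, "second.jl")
write(first_path, "setting = :first\n")
write(second_path, "setting = :second\n")

module OrderAB end
module OrderBA end

# Same two files, included in opposite orders.
Base.include(OrderAB, first_path)
Base.include(OrderAB, second_path)

Base.include(OrderBA, second_path)
Base.include(OrderBA, first_path)

println(OrderAB.setting)  # second
println(OrderBA.setting)  # first
```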

That’s not to knock the method at all: if some people feel that FromFile.jl helps them design things in a way they like, then go for it. But implicit non-deterministic action is the downside to not requiring that someone says “line 1 goes before line 2”, which is exactly the requirement that some people here argue they don’t want from include. You have to choose between the two, but let’s not stretch the advantages of either too far.

8 Likes

I think a main source of disagreement is file size. Am I right in understanding that you are used to working with very long files in Python?