There have been questions about this before, but so far none of them have provided an answer at the level of detail I have been looking for.
My current understanding is that include is not a simple text substitution/replacement operation, which is how the C/C++ preprocessor works.
Those languages simply substitute the line include("myfile.c") with the contents of myfile.c whenever such an include statement is encountered.
Actually, in C/C++ the statement is #include "myfile.c" but you get the point.
From reading elsewhere, it looks to me a bit like include parses the target file and generates an AST.
The question I really wanted to answer relates to a problem which occurs with include in C/C++.
In these languages, it is possible to #include a file from multiple other files. This creates multiple definition errors, because from the point of view of the compiler, multiple copies of code containing the same names have been defined.
You are not allowed to define struct A in one place and then struct A in another place, and then try to run the linker to link the two bits of object code together.
Does this same problem occur in Julia? What does Julia do to solve this problem, or if it doesn't exist, then why isn't this a problem to consider?
The include statement evaluates the given file in the context of the current module. This is very subtly different from just "pasting the content of the file at the location of the include statement". If the include statement is at the top level of a module, then it is essentially equivalent to pasting the contents of the file. And indeed, that is how include is generally supposed to be used. What does not work, however, is using include inside the body of a function. That's an example where the understanding of include not just pasting in code becomes relevant.
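To make the "not just pasting" point concrete, here is a minimal sketch. The module name Demo, the function load, and the file defs.jl are made up for illustration. If include pasted text, the assignment in the file would hit the local variable; instead it is evaluated at the top level of the module.

```julia
# Hypothetical file containing a single top-level assignment.
path = joinpath(mktempdir(), "defs.jl")
write(path, "y = 42\n")

module Demo
function load(path)
    y = 0            # a local variable inside the function
    include(path)    # evaluated at Demo's top level, not "pasted" here
    return y         # the local y is untouched
end
end

local_y = Demo.load(path)
println((local_y, Demo.y))   # the file created a global Demo.y instead
```

So the call runs, but it does not do what a textual paste would do, which is why include belongs at the top level of a module.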
There are also versions of include that process the AST of the file before evaluating it, and that evaluate it in a different module than the current one. Those are pretty rarely used, though.
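As a rough sketch of those rarer forms: Base.include accepts a target module, and optionally a function (called mapexpr in the docs) that receives each parsed top-level expression before it is evaluated. The names Target, seen, and numbers.jl below are invented for the example.

```julia
# Hypothetical file with two top-level statements.
path = joinpath(mktempdir(), "numbers.jl")
write(path, "a = 1\nb = 2\n")

module Target end

seen = Any[]
# Base.include(mapexpr, module, path): the do-block is the mapexpr argument.
Base.include(Target, path) do ex
    push!(seen, ex)   # record each parsed top-level expression
    return ex         # hand it back unchanged for evaluation
end

println(length(seen))          # one entry per top-level statement
println((Target.a, Target.b))  # definitions landed in Target, not Main
```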
It is definitely possible to include the same file in multiple locations in Julia, just not recommended. If the includes are in the same module, then the second include will overwrite variables/functions that were injected into the current module from the first include. This will typically cause Julia to emit some warnings about overwritten methods. And, of course, it's probably not something you want to do (what would be the point?).
Including the same file in different modules isn't that much of a problem, but then you create multiple functions, e.g., A.f and B.f, that are different objects but run the same code. Again, probably not something you want, but not a "problem" as such, the way you describe in C/C++. There are situations where multiple includes can be useful. For example, in QuantumControl.jl, I include a reexport.jl file in multiple submodules of the package. This is a private helper function (a patched version of ReExport.jl) that I need inside each module.
Generally, though, including the same file in multiple places has a very high chance of confusing both you and your tools (e.g., VSCode). So you'll want to avoid it unless you have a very good reason to do it.
Thanks for clarifying, I see that I didn't quite understand the situation accurately.
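A small sketch of the two-module case, assuming a throwaway file shared.jl that defines a single function f (both names are made up). Base.include(M, path) is used here only so the example can pass the path in from Main; inside each module you would normally just write include("shared.jl").

```julia
# Hypothetical shared file defining one function.
path = joinpath(mktempdir(), "shared.jl")
write(path, "f() = 1\n")

module A end
module B end

Base.include(A, path)   # evaluates the file in A
Base.include(B, path)   # evaluates the same file again, this time in B

println(A.f === B.f)      # two distinct function objects
println((A.f(), B.f()))   # both run the same code
```

Nothing fails here, unlike the C/C++ case: each module simply ends up with its own copy.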
Ok this gives me something to think about.
For context:
I used to write a lot of C++, and typically as code bases became larger, it could sometimes be difficult to get the compiler to compile your code because of this multiple definition problem. Or, to put it perhaps more accurately, sometimes you might want to do something a certain way, you might want a particular structure in terms of what code lives in which files, and you might be prevented from doing so.
I've got to be honest, it has been quite a few years since I wrote a line of C++, so I won't try and provide an example. It's not trivial to conjure such a situation.
The reason for asking is I'm interested whether the same issues can arise in Julia. I suspect the answer is no due to a number of factors. It isn't easy to explain why and I'm not certain my thoughts on this are correct right now.
One factor which is important is that in C++, the compile stages and link stages are independent. In addition, you can declare a function in one place and then define it elsewhere. I think these additional factors cause the problem, and Julia not having them means there is no problem.
I could be mistaken. I'd have to think about it and think back and remember more details of how the C++ compiler works.
Concretely, there are two common idioms that allow you to access the contents of a given file, say utils.jl, in multiple places. In both cases, you want to explicitly have utils.jl create a module:
module Utils
    struct A
        # whatever...
    end
    # more stuff
end
Now you can either:
1. Explicitly include("utils.jl") once and then in the files you need to use it (which are themselves often included in that same top-level file), you can say using .Utils: A. Note the .-prefix! This is just saying that there's a module defined in my current namespace, and I want to access something from it. If you're working inside another module, you'd use two dots (using ..Utils) to say that you want the Utils module from the "parent" namespace.
2. Structure utils.jl itself into a proper package and track it in your project/manifest. Now you can just say using Utils (no dots!) and the package manager will handle loading it only once.
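The first idiom can be sketched in a single file as follows. The module App, the field value, and the function make are placeholders; in a real project the Utils module would live in utils.jl and be brought in with one include("utils.jl").

```julia
# Stand-in for the contents of utils.jl (included exactly once).
module Utils
struct A
    value::Int   # hypothetical field, just so there is something to construct
end
end

# A sibling module reaches "up" one level with two dots.
module App
using ..Utils: A
make() = A(1)
end

# At the top level, one dot refers to a module in the current namespace.
using .Utils: A

println(App.make() isa A)   # both names refer to the same type
```

Because Utils is evaluated once and everything else only imports from it, there is exactly one type A, however many places use it.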
Since you mention this, it is perhaps worth adding the following comment.
When it comes to application development rather than package (library) development, I haven't yet become accustomed to how Julia expects me to work.
An application is typically a program with an entry point which uses a collection of libraries. It seems like each of those libraries should be a package, but perhaps not. Perhaps the whole application should be a package.
Julia is sufficiently different from Python that I'm fairly confident the ideas from Python do not transfer so well to Julia.
For example, to write an application in Python, one of the best ways to structure code is to build a single package, and run python in module mode with python3 -m module_name.
I'm not sure if you see exactly the point I am making here; it isn't that common to see people do this. Most people just use a single Python script as the entry point.
This doesn't fit naturally with how Python expects you to structure your code.
It took me ages to figure this out. If you know of any good resources regarding designing structure for Julia projects, please do let me know. It would be great to absorb more information.
I think the key difference between Julia and classic compiled languages like C/C++ (and, to some extent, also Python) is how dynamic the compilation process is.
My mental model for Julia is this: You open up a Julia process, which begins with an empty module Main. Julia then starts to evaluate statements in the context of Main, either from a .jl file (if you ran julia file.jl), or lines you type into the REPL. Any line adds types/functions/constants/submodules to Main, or runs a function. Functions are just names, pointing to a method table. Code loading can be via include, using, or eval. These can affect definitions in the current module or any sub-module, and can add to or overwrite the methods table of any existing functions. Julia keeps track of which functions call which other functions, and re-compiles as necessary if any method table changes.
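A tiny sketch of that last point, with made-up functions g and h: replacing a method of g after h has already been compiled is picked up the next time h is called from the top level.

```julia
g() = 1
h() = g() + 1      # h calls g

first_result = h() # compiles h against the current method of g

g() = 10           # replaces the existing method in g's method table

second_result = h()  # h is recompiled as needed and sees the new g

println((first_result, second_result))
println(length(methods(g)))   # still a single method, just a new body
```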
So, unlike in C/C++, neither files nor modules are things that are compiled as independent entities. Packages do get compiled, but this is pre-compilation, so you don't have to recompile everything every time you start a new Julia process and start loading packages. It's an important part of how Julia works, but doesn't really affect the mental model.
There also isn't really anything like "application development" in Julia. It's always just a Julia process, dynamically compiling stuff as it comes in, potentially exploiting cached compilation / pre-compilation. This might change a bit with the juliac static compilation work that's being done at the moment. I haven't really looked too much into that, but I think it basically only adds a specific entry point, so that everything that julia does from that entry point can be written to a static executable (while, as of recently, massively stripping out unreachable code to get the resulting binary to a reasonably small size). For now, though, or for "traditional" Julia usage, you should be thinking in terms of a REPL that dynamically compiles code in the background, rather than C-style compiling and linking.
Such an -m switch is already there in 1.12-DEV, with the same (or similar?) semantics. Note that in Julia, packages (and/or modules?) are precompiled, and scripts are not (by default). See also juliac, also available in 1.12. There's also PackageCompiler.jl and more tools to compile Julia binary executables (or libraries), working since before 1.10.
I didn't say use 1.12 for production (I've now edited to 1.12-DEV). I just got reminded of -m in 1.12, and I explicitly mentioned what works for sure in (currently supported) 1.10 and 1.11, since the rest is new.
It wasn't too clear I was answering newbies, and this thread will exist for long after 1.12 is released. [I think, but am not sure, that juliac can be downloaded and used with the older Julia 1.11, but yes, consider it brand-new/experimental for now; if you try it out, you might as well stay with 1.12-DEV, the nightly. It IS good to learn about the upcoming stuff, just don't use it in production until it is released and blessed as stable.]
I spoke with a friend of mine who still does C++ on a day to day basis. We came to the same conclusion that this is the real source of the multiple definition issues you can get with C++.
We came to the conclusion it's because there is a separate compile and link phase, and each source code file is compiled first independently from all others in what is called a single translation unit.
Some more details for those interested.
In C++/C, you have source code files and header files. They are (supposed to be) treated differently. In header files you place things like function signatures which tells the compiler the actual function definition or body of the function will be provided at some later point and that everything will be linked together by the linker later.
The compiler takes your source code, which is a .c/.cpp file and converts it into object code, which the linker will later use to produce an executable. This single source file is called a translation unit.
The problem comes with templates. The compiler sometimes needs to see the exact code which will be compiled to instantiate (create) an instance of a templated function. This is where both of our memories become a bit vague. It's difficult to understand what specifically has gone wrong without seeing a concrete example.
But, the point is that this sometimes forces you to put the body of functions in a header file. If that header file is #include'd in more than one translation unit then you end up with a multiple definition, and have to re-think your design.
Often the easiest solution is to consolidate several files into one to maintain a single translation unit. But that can be a pain from a maintenance/repository structure point of view. (Few very large files.)
We (now strongly) suspect Julia doesn't have the same problem. It seems like this problem occurs due to the design of data flow through the C++ compiler pipeline, rather than specifically being a language design problem.
I suppose my question is really only of interest to anyone who worked with C++ in the past. If you never used C++ before Julia you can basically ignore this. The TL;DR seems to be that C++ has a problem with the design of how it compiles code which Julia doesnât have.
Actually - I just had a further thought about this.
I believe I am correct in thinking that Julia can sometimes be made to print a warning message about replacing one existing module(?)/function(?) name with a new definition?
I think that parallels the C++ problem. The C++ compiler will reject your code if you end up in this situation. I think Julia accepts it and just replaces an older definition with a new one.
Is anyone able to provide an example of this behavior? I don't know enough about Julia to replicate it for myself yet.
julia> module A end
Main.A
julia> module A end
WARNING: replacing module A.
Main.A
It's not delineated how you're thinking. The problem is reassigning "constant" variables, even ones that are implicitly so:
julia> A = 3
ERROR: invalid redefinition of constant Main.A
Stacktrace:
[1] top-level scope
@ REPL[27]:1
julia> const x=1
1
julia> x=2
WARNING: redefinition of constant Main.x. This may fail, cause incorrect answers, or produce other errors.
2
julia> x = 3.5
ERROR: invalid redefinition of constant Main.x
Stacktrace:
[1] top-level scope
@ REPL[31]:1
julia> struct X end
julia> struct X x::Int end
ERROR: invalid redefinition of constant Main.X
Stacktrace:
[1] top-level scope
@ REPL[3]:1
So for the most part, you'll get an error because it's dangerous when constants aren't constant, especially when it messes with the type system. You're allowed to make changes with a warning if you reassign to an instance of the same type except for type-types, but you're also accepting the risk of previously defined and compiled code still using the obsolete instances, including entire modules.
It's worth highlighting when it's not a reassignment.
julia> foo() = 0
foo (generic function with 1 method)
julia> foo(x) = 1
foo (generic function with 2 methods)
julia> struct Z end
julia> struct Z
Z() = 0
end
In both cases, the subsequent definition does not reassign the constant variables, they're just adding or replacing methods. This is pretty important for interactivity in a multimethods language. When it comes to precompilation of packages, however, you get errors for method overwriting because behavior can easily be determined by an uncontrollable order of package loading. That has been likened before to linker warnings when loading 2 C libraries defining the same symbol, but the exact thread escapes me at the moment.
There are quite a few things you can't or shouldn't do during precompilation that are fine afterward, I imagine you'd see a few more parallels to AOT-compiled languages. Note that the linked documentation section will say "modules" when it means "packages" in context; modules evaluated into a Julia session won't be precompiled so they don't run into any of those issues.
Yes, that's along the lines of what I was thinking. Does this mean that if you have two modules with the same name, you can't include both at the same "namespace level"?
In Python you can do import X as Y to get around this problem. The problem here however is that in Julia import/using are different statements to include whereas in Python import does both things as a single step.
I would guess Julia provides a solution to this, somehow? I don't recall reading about it in the docs. I know you can use ModuleName.functionName to distinguish between two different functions with the same name defined in two different modules.
This is also a bit off. In both Julia and Python, the code's environment determines what package (library) is associated with a symbol in imports; import X as Y will only find one package X in the environment, the as Y is not disambiguating anything. as does help disambiguate different objects in different modules that happen to use the same symbol, e.g. from X import foo as bar and from Y import foo as baz. Note that X.foo and Y.foo are still usable; there's no issue with the same symbol in 2 namespaces having different roles, it's only a conflict when a 3rd namespace tries to import those 2 roles with the same 1 symbol.
julia> module A
           module B end
           module C end
       end
Main.A
julia> module D
           using ..A: B  # D borrows A.B
           module C end  # different module
       end
Main.D
julia> using .A: B
julia> using .D: B # allowed because same home module and symbol
julia> using .A: C
julia> using .D: C
WARNING: ignoring conflicting import of D.C into Main
julia> C
Main.A.C
julia> using .D: C as C2
Again a bit off, so it might be worth breaking down what's happening. Julia's include evaluates source code in a file; it goes through the whole process of text → AST → runtime objects. The same way that evaluating println("hello, world") 2 times prints 2 lines, evaluating a file containing a module expression makes 2 modules; doing so in the same namespace is a discouraged conflict. In both Python and Julia, imports let different modules (namespaces) trade objects via symbols and renames. A Julia module could either be a package installed into the environment or a previously evaluated module expression within the same session or package source; you can tell the difference at the import because the latter has relative dots. Python has similar packages, but its other modules are instead encapsulated by files. The session's first import of a package or Python file usually loads compiled code; evaluating source is only needed if the compiled code is outdated. Subsequent imports in the session are just making references to the loaded code.
Things in interactive-first languages tend to work very differently from AOT-compilation-first languages, it's reasonably difficult to draw parallels.
Julia doesn't run into this problem, because you don't need to include source in order to instantiate parametric types or specialize generic functions. You just import the module.
In consequence, it's rare in Julia to include a file more than once. You typically structure your code into one or more packages, which is broken up into one or more files and sub-modules, and each file is included only a single time: include is used more like a Makefile to tell Julia what code makes up the package, in what order. You re-use code by importing modules (either submodules or other packages), not by include-ing files.
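The "include as a Makefile" layout might look roughly like this. The package name MyPkg and the files types.jl and ops.jl are invented, and the files are written to a temporary directory only so the sketch is self-contained; in a real package they would sit in src/.

```julia
dir = mktempdir()

# Each source file is listed exactly once, in dependency order.
write(joinpath(dir, "types.jl"), "struct Point; x::Int; end\n")
write(joinpath(dir, "ops.jl"), "norm1(p::Point) = abs(p.x)\n")
write(joinpath(dir, "MyPkg.jl"), """
module MyPkg
include("types.jl")   # paths resolve relative to MyPkg.jl
include("ops.jl")     # may use Point because types.jl came first
end
""")

include(joinpath(dir, "MyPkg.jl"))

println(MyPkg.norm1(MyPkg.Point(-3)))
```

Code that wants Point or norm1 then imports them from MyPkg rather than including types.jl or ops.jl a second time.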
include trivia: if include is a normal Julia function, how can it possibly figure out which module it was called from?
The trick is that each module M implicitly contains an include(file) = Base.include(M, file) definition. This means that while A.sin == B.sin is true, A.include == B.include is false (same for eval). There are hundreds of distinct include-named functions in a typical julia process.
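This is easy to check in a fresh session. ModA and ModB below are just two empty placeholder modules.

```julia
module ModA end
module ModB end

same_sin     = ModA.sin === ModB.sin          # both resolve to Base.sin
same_include = ModA.include === ModB.include  # each module has its own include
same_eval    = ModA.eval === ModB.eval        # and its own eval

println((same_sin, same_include, same_eval))
```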
Except for Core and modules defined with the baremodule block such as Base. The docstring for baremodule actually points out module blocks automatically define a personal include to shadow Base.include, though it's inaccurate because it only illustrates one include(p) method when there is a 2nd include(mapexpr, x) method.
This actually throws me off sometimes when I reach for include_string and realize I need to provide a module.
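For reference, a minimal sketch of that call: include_string takes the target module explicitly, since there is no per-module version of it. The module name Sandbox is made up.

```julia
module Sandbox end

# The module argument says where the code string is evaluated.
result = include_string(Sandbox, "x = 1 + 2")

println(result)      # value of the last expression in the string
println(Sandbox.x)   # the assignment created a global in Sandbox
```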
Elsewhere, I was told that this isn't what Julia's include does. I was told that it does a bit more sophisticated version of the C++ preprocessor text substitution of #include. Let me see if I can find that.
It may have been here actually, I've somewhat lost track.