If you define a type / struct in one file (here typetest_types.jl):
module Types
struct TestType end
end
and then use it in a separate file in different modules, both including typetest_types.jl, then one module can not run methods defined on the same type in the other module:
module A
include("typetest_types.jl")
using .Types: TestType
f(x::TestType) = println("☼")
end
module Test
include("typetest_types.jl")
using .Types: TestType
using ..A: f
function test()
f(TestType())
end
end
As evidenced by attempting to run Test.test():
ERROR: MethodError: no method matching f(::Main.Test.Types.TestType)
Closest candidates are:
f(::Main.A.Types.TestType)
Why are A.Types.TestType and Test.Types.TestType not recognized as the same type?
I know that this can be fixed by importing TestType from A instead of from Types, but are qualified names always distinct even if they point the the same thing?
Additionally (I didn’t test this), doesn’t this lead to a problem similar to the Diamond Problem in OO?
That’s because in my real use case, modules A and Test are in different files as well. So I’d have to include the file of A only.
However, this breaks when A is not the only module defining functions for TestType. Suppose there is another file typetest_B.jl:
module B
include("typetest_types.jl")
using .Types: TestType
f(x::TestType) = println("B")
end
And the module Test changes to:
module Test
include("typetest_B.jl")
using ..A
using .B
function test()
A.f(A.TestType())
B.f(A.TestType())
end
end
Then A.f executes but B.f fails with the same MethodError. Calling B.f(B.TestType()) works, but having to manually keep track of which method takes which name seems weird, especially when not using the qualified name for the function call (as opposed to this example here). How do big packages mange their type definitions and internal imports/includes? Is every module just published as its own separate package?
Don’t include a file more than once. Files and modules are not linked to each other; if you include a file twice, you define a new, distinct version of whatever is in the included file. It’s similar to how you don’t #include the same header file twice in C or C++.
Most packages I know of that split their architecture into multiple files with submodules have one include per file in their top level module, and use the dedicated namespace traversal syntax (like using ...A.B: foo) to access their types.
If I understand correctly this would be hard to adopt for a collection of loosely related files, where different tasks depend on the same types, as everything would have to be imported every single time. For example, one module might be responsible for reading data from a file and parsing it to a custom type, then analyzing that by using big packages for nonlinear fits. Another module might use the same type but capture new data and then save it to file using binary file packages. Both need access to the underlying type and use common functionality defined in a third module, but I wouldn’t wanna run all using statements for every package in the project just to do a small task on some shared type. Any advice for a use case like this?
You don’t need to run every using of your entire codebase in every module, just what you need for that module. Avoid deeply nested/convoluted module hierarchies; you don’t need one module in each source file.
If the code is all related to working with that one type, why split it up into multiple modules? If it’s all operating on the same types, the code doesn’t really sound “loosely related” to me.
That’s what happens if you have one central file includeing every other one though, right? Isn’t that what you indicated big packages do?
The parent/sister module referencing is nice, but is there a way do conditionally include a file, like a C include guard?
Because I have several types representing different data with a module each for analyzing that one type, but then have different modules that combine several or all types for compound analysis.
In one compound analyis I wanted to use functions from the specialised modules as well, that’s how I ended up with MethodError and started this post.
What I mean is that you don’t need a using Foo or similar if the module that’s written in doesn’t actually need anything from Foo. You of course still need that for stuff you actually need in that module.
Only for the things they actually need there. I’m also saying that I’ve not seen a codebase split up in the way you describe. What advantage does doing this offer you as a developer?
No, I don’t think so. I guess you could emulate that by wrapping your entire file with an if that checks for definedness if the module, but that doesn’t actually solve your problem once you have slightly deeper module hierarchies. Again, modules are an explicit namespacing mechanism the languages provides for you to solve this exact problem.
To me that sounds like you’re using too many modules with a very narrow focus. It’s admittedly a bit difficult to give specific advise without having code at hand, but I currently don’t follow what advantage you get out of that level of namespacing.
Okay, I guess that’s a lesson learned for my next research project. I mainly introduced all the modules so I could reuse function names around different files.
However, getting rid of the modules leads to redefinition when doubly includeing files, right? So instead of the original MethodError, you would see a bunch of the “conflicts with existing identifier” warnings. At least thing would run then…
Again, you don’t need to doubly-include anything. includeing a file C-style is not the same as namespacing/giving access to the same name. If you have file A, B and C and you include B and C in A, both B and C will see each others definitions. If B and C contain modules, you can traverse the namespace with e.g using to get at those definitions, no additional include required.
That’s what I mean when I say “only place include at the top level module/file”, which in this example is file A.