Find module name/scope via reflection

Not even sure the title is correct… No time should be spent trying to infer usefulness from the below, totally contrived as MWE. However, it does make the point about the problem I’m having in my real code.

I have a module, Data, which has a parametric composite struct, DataContainer. I don’t include any using statements in Data that relate to the parametric types of DataContainer, even though in reality the type parameter is always a type from one of a number of other modules (Type1.jl and Type2.jl): the specific choice is decided at run time by user conditions. By design Data simply does not know which one, so using/importing/including is not meaningful or possible within the Data module.

The Data module implements a function, Allocate (called from Driver), which takes an instance of DataContainer. For one of DataContainer’s fields, data, I need to call another function within Allocate, called AllocateSub. AllocateSub takes an element of the data field as one of its arguments and is implemented in the module that defines the element type (Type1.jl or Type2.jl). The problem is that Julia wants me to “using” that module, which I can’t, as I don’t know which one it is.

Is there a way in code (maybe via reflection) to obtain the module that holds the relevant function given just the field type of a parametric struct?

How do I get round this? To repeat, the solution I’m looking for does not add using statements to the Data module. My code fails with V1 commented in; if you comment that out and comment in the two blocks relating to V2, it works, but V2 is suboptimal for me. As I note in the comments of Driver.jl, I am happy for Driver to know about the parametric types, but I don’t want Data to have to know. This is a question to learn how to do this: in reality this happens a lot in my code, as I’m separating specific types from the interfaces that use them, so I need to resolve it.

The main user code works with interface type modules (like Data), and only the setup code (a more complex version of Driver) makes the decision, once, about which parametric types (Type1 or Type2) to instantiate the interface type (Data) with. The rest of the code then runs with the interface types and the functions associated with the interface. In short, if this can be done, I’d like to avoid a discussion of a different design where possible. I only want to rewrite my code if this is not achievable, that is, if there is no way round adding a bunch of using statements within the interface type (Data module).

Thanks,
Andy

Can’t find a way to upload a file, so pasted them here:

Type1.jl

module Type1
export Something

struct Something
    name::String
    field::Vector{Int64}
    Something(n::String) = new(n, [])
end
#
function AllocateSub(d::Something, size::Int64)
    resize!(d.field, size)       
end
#
end

Type2.jl

module Type2
export SomethingElse

struct SomethingElse
    name::String
    field::Vector{Float64}
    SomethingElse(n::String) = new(n, [])
end
#
function AllocateSub(d::SomethingElse, size::Int64)
    resize!(d.field, size)       
end
#
end

Data.jl

module Data
export DataContainer, Allocate
# comment in when using V2 only
# using Type1
# using Type2

struct DataContainer{DataType}
    name::String
    data::Vector{DataType}

    DataContainer{DataType}(name::String) where {DataType} = new(name, [])
end

function Allocate(data::DataContainer, outerSize::Int64, innerSize::Int64)
    resize!(data.data, outerSize)       
    T = eltype(data.data)
    for i in 1:outerSize
        data.data[i] = T("stuff")
    end
    #
    # V1) want this to work, comment out when using V2
    #
    for i in 1:outerSize
        AllocateSub(data.data[i], innerSize)
        # something like: function_that_gives_module_that_owns_T.AllocateSub(data.data[i], innerSize)
    end
    #
    # V2) comment out the above, use this instead, also comment in two usings above
    
    # for i in 1:outerSize
    #     # T is a type, so compare it with ===; `T isa Something` is always false
    #     if T === Something
    #         Type1.AllocateSub(data.data[i], innerSize)
    #     elseif T === SomethingElse
    #         Type2.AllocateSub(data.data[i], innerSize)
    #     end
    # end
end

end

Driver.jl

module Driver
#
thisdir = dirname(@__FILE__())
any(path->path == thisdir, LOAD_PATH) || push!(LOAD_PATH, thisdir)
#
using InteractiveUtils
#
using Data
#
# happy for this module to be aware of these two types,
# not happy for Data module to be aware of them
using Type1
using Type2
#
function DoStuff(user::Bool)
    if user
        data = DataContainer{Something}("Something")
    else
        data = DataContainer{SomethingElse}("SomethingElse")
    end
    Allocate(data, 5, 3)
    @show data
end

DoStuff(true)
DoStuff(false)
end

The problem you’re having here is that AllocateSub is two separate generic functions: one inside module Type1 and one inside Type2. Although these have the same purpose, they don’t share a method table, and the unqualified name AllocateSub can’t refer to both in the same scope.

Instead, you should define AllocateSub in the Data module (or a separate utility module if necessary), then import Data.AllocateSub into the Type* modules and add methods to it.

To define a generic function with zero methods inside the Data module:

function AllocateSub
end

Then, inside Type1:

import Data: AllocateSub

# Here you are now extending the AllocateSub generic function to know about data types from the `Type1` module
function AllocateSub(d::Something, size::Int64)
    resize!(d.field, size)       
end

Now Data doesn’t know about the types it will be using, but it does know about how to manipulate them.
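Putting the pieces together, here is a minimal single-file sketch of that pattern. It assumes all modules live in one file, so it uses the relative import ..Data instead of import Data (with separate files on LOAD_PATH, the import Data: AllocateSub form above applies). The type parameter is renamed to T and Allocate uses push! rather than resize!; both are my choices for the sketch, not part of the original code.

```julia
module Data
export DataContainer, Allocate

# Empty generic function: a name with zero methods, owned by Data.
function AllocateSub end

struct DataContainer{T}
    name::String
    data::Vector{T}
    DataContainer{T}(name::String) where {T} = new(name, T[])
end

# Data never names Something; it only calls the generic function.
function Allocate(c::DataContainer{T}, outerSize::Integer, innerSize::Integer) where {T}
    for _ in 1:outerSize
        item = T("stuff")
        AllocateSub(item, innerSize)   # dispatches on the element type
        push!(c.data, item)
    end
    return c
end
end # module Data

module Type1
export Something
import ..Data: AllocateSub   # relative import: assumes Data is a sibling module

struct Something
    name::String
    field::Vector{Int64}
    Something(n::String) = new(n, Int64[])
end

# Adds a method to Data's generic function.
AllocateSub(d::Something, size::Integer) = resize!(d.field, size)
end # module Type1

using .Data, .Type1
c = Allocate(DataContainer{Something}("Something"), 5, 3)
```

A Type2 module would follow the same shape as Type1, and only the driver code needs to load both.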

Is it possible to add a parameter to function Allocate in module Data.jl?

function Allocate(data::DataContainer, SubAllocate::Function, outerSize::Int64, innerSize::Int64)
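A rough sketch of how that variant could look, where the caller (the Driver in the original post) hands the module-specific function to Allocate. The container is simplified and defined at top level here purely for illustration; the SubAllocate parameter name comes from the signature above, everything else is an assumption.

```julia
# Simplified stand-in for Data.DataContainer (illustration only).
struct DataContainer{T}
    name::String
    data::Vector{T}
end

# Allocate receives the element-level allocator as an argument,
# so it never needs to know which module it came from.
function Allocate(c::DataContainer{T}, SubAllocate::Function,
                  outerSize::Integer, innerSize::Integer) where {T}
    for _ in 1:outerSize
        item = T("stuff")
        SubAllocate(item, innerSize)
        push!(c.data, item)
    end
    return c
end

module Type1
struct Something
    name::String
    field::Vector{Int64}
    Something(n::String) = new(n, Int64[])
end
AllocateSub(d::Something, size::Integer) = resize!(d.field, size)
end # module Type1

# The caller picks both the type and the matching function.
c = Allocate(DataContainer("Something", Type1.Something[]), Type1.AllocateSub, 5, 3)
```

The cost of this design is that every call site has to pass the right function along with the right type, which the generic-function approach handles by dispatch.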

By the way, a couple of matters of style -

  • There’s a pretty strong convention in julia code that Types are UpperCamelCase, and functions are squashedcase (or snake_case for readability). Of course you can deviate from that if you like, but I’d call the function above allocate_sub, or allocsub to be consistent with the bulk of julia code out there (or preferably something more meaningful to whatever your actual code is).
  • DataType is already the name of a builtin type. This doesn’t prevent you from using it as an identifier but it might be confusing.

I had fundamentally missed this in terms of understanding when teaching myself Julia! Quite specifically, what is this called, so I can read about it? As in, not just overloading a method, but the approach of putting a stub into the Data module, where it looks like I’m actually adding methods back into the Data module via the other modules? Is that correct? What is that called? Rhetorical: what else have I missed…
Coming from 15 years of Cxx. My Julia style still needs work…

Thanks again,
Andy

Not sure I can point you to more specific documentation than just what you find when you look for “method overloading”, but maybe the following example is enlightening:

julia> struct Foo; n::Int; end
julia> import Base: +
julia> +(a::Foo, b::Foo) = Foo(a.n + b.n)
+ (generic function with 161 methods)
julia> 1 + 2
3
julia> Foo(1) + Foo(2)
Foo(3)

In this way, anyone can add methods to the + function. @c42f’s suggestion comes down to doing the same thing for your own functions: define the function at one place, and import it everywhere where you want to define methods for it.


Yes, this is correct. The key concept is that each generic function (and in fact, any binding) implicitly has a module. Think of it as a table in that module, to which you are adding slots. Each table has a home, and tables can have the same name in different namespaces (modules).

The module docs are worth reading, but perhaps could emphasize this point further. To someone who used a language with multiple dispatch, this is one of those “how could it be otherwise” things, but I understand that coming from Cxx, it can be surprising.

I find it is very helpful to reread the manual from time to time. I always learn new things, even after 3 years.

More or less, though I think it’s more correct to say that you’re adding methods to a generic function called AllocateSub which just happens to have originated in the Data module, but can also be bound to a name in other modules. This is what’s happening underneath at runtime: name resolution finds the correct generic function object and the code for the new method is added to the method table associated with that object. Modules are only involved in the name lookup part but don’t hold the method tables.
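This can be checked with a small sketch (module and type names mirror the thread, but the bodies are stripped down for illustration). It also shows parentmodule, which is the closest thing to the reflection asked about in the original question; something like parentmodule(T).AllocateSub(x, n) would work with the original, separate functions, though extending one generic function is probably the cleaner route.

```julia
module Data
function AllocateSub end          # empty generic function, originates in Data
end

module Type1
import ..Data: AllocateSub        # bind the same function object to a name in Type1
struct Something end
AllocateSub(::Something, n::Integer) = n   # placeholder body for illustration
end

# Both names resolve to one and the same generic function object.
@assert Type1.AllocateSub === Data.AllocateSub

# The function originated in Data...
@assert parentmodule(Data.AllocateSub) === Data

# ...but its single method was defined by code in Type1.
m = first(methods(Data.AllocateSub))
@assert length(methods(Data.AllocateSub)) == 1
@assert m.module === Type1

# Reflection on a type: the module in which it was defined.
@assert parentmodule(Type1.Something) === Type1
```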

It’s probably also worth reading the methods documentation in the manual. It’s got a section on “empty generic functions” which is the name for this kind of stub, but that doesn’t discuss the possible designs you can achieve with them.
https://docs.julialang.org/en/v1.0/manual/methods/#Empty-generic-functions-1

Rhetorical: what else have I missed… Coming from 15 years of Cxx

That’s a hard one :) It took me probably a couple of years or so to feel reasonably comfortable designing code with multiple dispatch in mind (coming mostly from a C++/python/matlab background). I wish I could easily summarize the lessons learned along the way.

From the perspective of a heavy C++ user, the design of julia code is quite similar to designing for static dispatch in generic (template heavy) C++ code where you could say that ADL gives you a kind of multiple dispatch, but only at compile time. One big pain point of writing this kind of code in C++ is having to decide up front which parts of the program will be statically typed and which parts will use dynamic dispatch. Julia frees you from this conundrum and has far lighter weight syntax for generics, to the extent you often wouldn’t realize you’re writing generic code at all.
