We Do Need Exportable New Structs and Functions in Extension

I’m surely not the last person struggling with the transition from the Requires.jl package to the officially supported package extension feature in Julia 1.9+.

With Requires.jl we don’t care where a function or a struct is defined, because we can easily access it. But after switching to extensions, we cannot access names that are defined only inside an extension module. To overcome this, one may need to pre-define an empty function or struct in the main package.

Let’s think about a use case. Suppose we have a Person module, and this Person has a lot of skills like Driving, Writing, Boxing, etc. If Driving is really heavy and Person only needs a small part of it, say driving a car, then we can make Driving one of the [weakdeps]. To use a speedup function that is actually implemented only inside PersonDrivingExt, we need to declare an empty speedup in the main package Person, like

module Person

export speedup
function speedup end

export DrivingCarStatus
mutable struct DrivingCarStatus
    pos :: Coords
    speed :: Real
    function DrivingCarStatus(args...)
        # some construction here like
        new(initpos(xargs...), initspeed(yargs...))
    end
end

end # module Person

and then import and implement speedup in the extension

module PersonDrivingExt

using Person, Driving
import Person: speedup, DrivingCarStatus

function speedup(status::DrivingCarStatus)
    status.speed += 10.0
end

end # module PersonDrivingExt

So far this is acceptable. But some issues may occur:

  1. :joy: what if some types are only defined in Driving, like ::Coords inside DrivingCarStatus
  2. :joy: what if some functions are only defined in Driving, like initpos and initspeed in constructor

When these issues arise, we can solve them with using Driving inside Person, but that is far from the extension’s original purpose. "Extensions are mainly for extending functions already defined in the main package" is apparently not enough and not convenient.
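
For reference, the stub approach can be tried without any packages. This is only a sketch: plain modules stand in for the package and its extension, and FakeExt is a hypothetical name, not something the extension mechanism provides. It shows that the empty function exists as a name but has no methods until some other module adds one.

```julia
module Person
export speedup
function speedup end          # generic function with zero methods
end

using .Person

# Before anything adds a method, the name exists but every call fails.
no_methods_yet = isempty(methods(speedup))
failed = try
    speedup(1.0)
    false
catch err
    err isa MethodError
end

# FakeExt (hypothetical) plays the role of the extension: it adds a method
# to the function that Person owns.
module FakeExt
import ..Person: speedup
speedup(speed::Float64) = speed + 10.0
end

new_speed = speedup(1.0)
```

The same trick does not carry over to a struct whose fields need types from Driving, which is exactly the first issue in the list above.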

In my opinion, extensions are good for compatibility control of weakdeps, but if Requires.jl could itself control the versions of weakdeps, we could easily merge some functionality from a heavy dependency without changing much code and without introducing the Julia 1.9+ extension feature. Treating a conditionally triggered extension like a natural module would be nice, e.g.

module Person

# some Person's own functions and types here

end # module Person


# ======================================================


module PersonDrivingExt

  using Person, Driving
  export speedup, DrivingCarStatus

  mutable struct DrivingCarStatus
      pos :: Coords
      speed :: Real
      function DrivingCarStatus(args...)
          # some construction here like
          new(initpos(xargs...), initspeed(yargs...))
      end
  end

  function speedup(status::DrivingCarStatus)
      status.speed += 10.0
  end

end # module PersonDrivingExt

What do you think? :smiley:

Full disclosure, I haven’t needed to write a package extension, yet. But first impressions, importing names originating in an extension sounds like a nightmare. I wouldn’t want to do using Draw, Rainbow: DrawRainbowType to get a name DrawRainbowType that belongs to neither Draw nor Rainbow alone. Extensions can’t be imported, just implicitly loaded, so if you somehow work around that to export new names, there’s no good way to organize them. There’s no reasonable choice of one package over the others when categorizing names into module instances (parentmodule(DrawRainbowType), names(Draw)) or documentation. If names were associated with and documented for the extension separately, I would rather that be an explicitly importable package DrawRainbowPlus that depends on Draw and Rainbow; this is much more accessible and a lot less misleading. After all, modules are Julia’s namespaces, so it’s intuitive that names that don’t originate in 2 spaces belong to a 3rd space that is handled the same. On the other hand, mixing names originating in Draw and Rainbow is more appropriate in an implicit extension than in a 3rd package dedicated to type piracy and re-exports. Making the unreasonable choice of one of the packages to export new names is not good either; it goes against modular programming principles for a module to expose conditionally existing names, especially if they depend on other modules’ independent existences.
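
To make the categorization point concrete, here is a small sketch with plain modules standing in for packages (Draw, Rainbow and DrawRainbowPlus are just the hypothetical names used above, and the struct fields are invented for illustration). Every name has exactly one parent module, and names(Draw) only lists what Draw itself exports.

```julia
module Draw
export Canvas, draw
struct Canvas end
draw(::Canvas) = "canvas"
end

module Rainbow
export Spectrum
struct Spectrum end
end

# A third, explicitly importable module owns the combined name.
module DrawRainbowPlus
using ..Draw, ..Rainbow
export DrawRainbowType
struct DrawRainbowType
    canvas::Canvas
    spectrum::Spectrum
end
# Extends the function owned by Draw; no new function name is created.
Draw.draw(::DrawRainbowType) = "rainbow on canvas"
end

owner_of_type = parentmodule(DrawRainbowPlus.DrawRainbowType)
owner_of_draw = parentmodule(Draw.draw)
listed_in_draw = :DrawRainbowType in names(Draw)
```

Here the new type is owned by the third module and is not listed among Draw's names, while the extra draw method still belongs to Draw's function.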

There is a documentation issue describing how to access extension-internal names, and you optionally bind to new variables instead of importing names. It looks awkward enough that it doesn’t seem intended for general usage. If it helps, here’s an example of how new variables differ from imported names, without involving extensions.
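
A minimal sketch of that distinction, using plain modules (Draw, Importer and Binder are hypothetical names): an imported name resolves to the binding in the module that defines it, while a new variable is a separate binding owned by the module that created it. Base's which(module, symbol) reports the owner of a binding.

```julia
module Draw
export brush
brush() = "brush"
end

module Importer
using ..Draw: brush            # imported name
end

module Binder
import ..Draw
const brush = Draw.brush       # new variable holding the same function
end

# Both names refer to the same function object...
same_function = Importer.brush === Binder.brush
# ...but the bindings are owned by different modules.
importer_owner = which(Importer, :brush)
binder_owner = which(Binder, :brush)
```

So a reader who asks where Binder's brush comes from is pointed at Binder, not at Draw, even though the function itself is Draw's.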


Thanks for your reply. :smiley:

As to this statement:

Extensions can’t be imported, just implicitly loaded

I wonder: is it designed this way on purpose, or is it technically impossible to implement?

Ages ago, before Requires.jl, package developers usually avoided loading a heavy dependency by creating another package, e.g. CuYao.jl, so users without an NVIDIA CUDA device could just use the CPU version, Yao.jl. In the namespace of Main, Yao and CuYao are at the same level of the hierarchy; we couldn’t say Yao is the parent module and CuYao the child. The bad thing for an end user is having to read two packages’ documentation just to use CuYao. It would get worse if we wanted MetalYao, AMDGPUYao, TPUYao, PlotYao, WebYao, etc.

Still taking CuYao as the example, I see two possible relationships between CUDA, Yao and CuYao:

  • :eye_in_speech_bubble: All three modules are at the same level, just like three brothers. So creating a new function or a new type inside CuYao is easily understood as: CuYao learned some good skills or beliefs from both CUDA and Yao, then formed his own.

  • :eye_in_speech_bubble: CuYao is a child of CUDA and Yao. So creating a new function or a new type inside CuYao is easily understood as: some gene (type) of CUDA + some gene (type) of Yao → new gene (type) of CuYao.

Combining two packages’ features to create new features is a common thing. We can still make it work by creating some empty API functions in Yao and then implementing them in CuYao, but with Julia 1.9+ extensions it is difficult for new structs that have field types from CUDA or constructors that call functions from CUDA.

As an end user, I hope CUDA is just a trigger for CuYao, the version of CUDA is controlled by Yao, and CuYao is loaded implicitly by putting it in Yao’s ext directory. This way, the user reads the combined hand-written Markdown docs in Yao without noticing the existence of CuYao.

PS: I’m not a user of the Yao package; I just use the three package names for convenience.


My guess is that it is by design because Requires.jl already almost implements it; despite not being its intention, you can reexport names originating from a @require expression, and the expression can import a precompiled glue package. It thus seems very possible that extensions could have exported names, but the opposite direction was taken by making extensions not importable. I believe the reason is to avoid exposing conditionally existing names, which start to defeat the purpose of modules.

Again, I think that CuYao should be an explicitly imported package that owns the new names and depends on CUDA and Yao, and it is. Yes, it is shorter to write using CUDA, Yao than it is to write using CUDA, Yao, CuYao (well, CuYao already reexports Yao so Yao is redundant), but its apparent convenience actually invites confusion. Say my package implicitly loads CuYao with using CUDA, Yao or import CUDA, Yao: curand_state; this is already terrible because specifying names in a multiple import was disallowed due to ambiguity e.g. import Dates, BenchmarkTools: @btime. A user later checks parentmodule(curand_state), and one of two things can happen: 1) They get an implicitly imported CuYao, but they can’t find import CuYao or module CuYao in my source code or see a CuYao dependency in my Project.toml until they stumble down a long list into CUDA and Yao’s Project.toml files, 2) CuYao is not an importable name at all, in which case they might get an unhelpful nothing. By saving writing CuYao, it became much harder to follow where names come from.

Hey Benny, I think end users would not care much about where a name comes from in the source code; what they need more is a well-documented API for that name (new type or function), and this can be written into the official docs, just like the PyTorch docs.

I think making Yao appear to be a fully standalone package that controls most things is clear and clean for end users, even if Yao has lots of helper packages hidden behind it, like MetalYao, AMDGPUYao, TPUYao, PlotYao, WebYao, etc. If those helper packages are standalone packages, users may find it messy.

Well, I could use one of two solutions for this:

  1. :space_invader: Don’t use the extensions provided by Julia 1.9+; instead keep using a standalone package as an aid, e.g. CuYao. Yao’s official documentation would say “If you want CUDA acceleration, do using CuYao, and if you want to plot something, do using PlotYao, blabla…”, and then CuYao and PlotYao don’t even need official documentation of their own. In this way, end users need to do using Yao, CuYao

  2. :space_invader: Use a fake extension CuYaoEmptyExt to control CUDA’s version with Julia 1.9+, and use Requires.jl to implicitly load CuYao; then CuYao can define new types and even export them. In this way, end users only need to do using Yao, CUDA. Tricky, isn’t it? :rofl: :joy: :sweat_smile: :smiling_face:

I have written quite a few extensions by now and would maybe summarise this as

If you enter the regime where you need new structs – the scheme of a new package “between” the two you want to extend, might be a better choice. Basically as soon as something defines its own structs it should be its own package – I think Benny stated the reason: Conditionally existing names might be confusing.

I mainly used it to combine functions from one package (say A.jl) to work on structs from the other (say B.jl) as an extension to A. So if I would have structs from A as well that extend functions in B that would be something for an extension in B maybe?
And for sure these new extensions (a) needed some rewrite of old Requires-based approaches and (b) are still using Requires as a fallback in pre 1.9. But after I got used to the thinking of extensions being for extending functionality (but not defining new functions or structs) – these work really great.


Yeah, the extension feature of Julia 1.9+ is probably 95% designed for this scenario. New structs are just an unexpected corner case.

But it would be nice if creating new structs were supported: most of the time we package developers would only extend existing functions, but the corner case would still get a chance to live. :smiley:

I would carefully argue that if the code you mean to use as an extension has a new struct inside – it does not fit the scheme. Similar to the case where you try to do OOP in Julia – it can be done but it probably should not.

And sure there are corner cases, where a solution should carefully be discussed.
For example, for my cache (where I have an extension using LRUCaches) I first had a struct working with the cache in the extension. Then I noticed I could do the same struct in the main package, when assuming that one field is “cache-like” and documenting what I expect from that field in functionality. With the extension that field of the struct is then just the LRUCache.
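
A rough sketch of that idea, with made-up names: Solver is the struct in the main package, and BoundedCache is a hypothetical stand-in for the cache type an extension would plug in (it is not the API of any real cache package). The only documented expectation on the field is that it supports get!(f, cache, key).

```julia
module MainPkg
export Solver, solve!
# The struct lives in the main package. Its field only has to be
# "cache-like": it must support get!(f, cache, key).
struct Solver{C}
    cache::C
end
Solver() = Solver(Dict{Int,Int}())      # default that needs no extra package
solve!(s::Solver, n::Int) = get!(() -> n^2, s.cache, n)
end

module FakeCachePkg                      # hypothetical optional dependency
export BoundedCache
struct BoundedCache
    data::Dict{Int,Int}
    maxsize::Int
end
BoundedCache(maxsize::Int) = BoundedCache(Dict{Int,Int}(), maxsize)
function Base.get!(f::Function, c::BoundedCache, key::Int)
    haskey(c.data, key) && return c.data[key]
    # Crude eviction for illustration only: drop everything when full.
    length(c.data) >= c.maxsize && empty!(c.data)
    c.data[key] = f()
end
end

using .MainPkg, .FakeCachePkg
plain = Solver()                         # works with a plain Dict
bounded = Solver(BoundedCache(2))        # same struct, different cache field
results = [solve!(bounded, n) for n in 1:3]
```

With this layout no new struct has to be defined when the optional cache is present; the code that loads it only needs to construct Solver with a different field value.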


In your example, you may prefer to define DrivingCarStatus in module Person as an abstract type:

module Person

# implemented in PersonDrivingExt
export DrivingCarStatus, speedup
function speedup end
abstract type DrivingCarStatus end

end # module Person

Then in PersonDrivingExt you define DrivingCarStatus as a subtype of Person.DrivingCarStatus and define a constructor method:

module PersonDrivingExt
import Person
using Driving

mutable struct DrivingCarStatus <: Person.DrivingCarStatus
    pos :: Coords
    speed :: Real
    function DrivingCarStatus(args...)
        # some construction here like
        new(initpos(xargs...), initspeed(yargs...))
    end
end
Person.DrivingCarStatus(args...) = DrivingCarStatus(args...)

function Person.speedup(status::DrivingCarStatus)
    status.speed += 10.0
end

end # module PersonDrivingExt

From the user perspective, using Person is sufficient to dispatch on DrivingCarStatus and construct DrivingCarStatus instances.

You can put docstrings in Person so that they are available even when PersonDrivingExt is not loaded.
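
To try the pattern without setting up real packages, here is a self-contained sketch in which plain modules play every role. FakeDriving and FakePersonDrivingExt are hypothetical stand-ins, and the fields and zero-argument constructor are simplified assumptions rather than the original design.

```julia
module Person
export DrivingCarStatus, speedup
abstract type DrivingCarStatus end    # the name is owned by Person
function speedup end                  # methods are added elsewhere
end

module FakeDriving                    # hypothetical stand-in for Driving
export Coords, initpos
struct Coords
    x::Float64
    y::Float64
end
initpos() = Coords(0.0, 0.0)
end

module FakePersonDrivingExt           # plain module in the extension's role
import ..Person
using ..FakeDriving

mutable struct DrivingCarStatus <: Person.DrivingCarStatus
    pos::Coords
    speed::Float64
end
# Calling the abstract type forwards to the concrete constructor.
Person.DrivingCarStatus() = DrivingCarStatus(initpos(), 0.0)

function Person.speedup(status::DrivingCarStatus)
    status.speed += 10.0
    return status
end
end

using .Person
car = DrivingCarStatus()              # only Person's exported names are used
speedup(car)
```

The user-facing code at the bottom never mentions the concrete type, which is what lets the real definition live in the extension.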


As an end user, I care. When I look up code snippets in forums, import statements are often all I have at the beginning to find the official docs. Names and behaviors shouldn’t depend on multiple loaded packages because it hampers reproducibility. Say I am given a code snippet import Plots; Plots.plotyao(...), but it errors when I try it. Turns out the poster neglected to mention the “obvious” using Yao beforehand. Say I am given a code snippet using Render3D; render(...) but it runs incredibly slowly. Turns out the poster neglected to copy and paste the using CUDA cell from their notebook.

It is, actually. Who extended who to provide curand_state in using Yao, CUDA, Plots? I don’t think it’s reasonable to require that I read 3 docs sites or project files to figure that out, it should be apparent in the source code, and that’s only possible if each name is provided by 1 package unconditionally. How would I modify a provided snippet to import a specific name? using Yao, CUDA, Plots: curand_state throws an error. Even if you make it work, which package uses Requires.jl to load curand_state, or are they wholly independent packages that don’t use Requires.jl? I can’t tell from the statement at all. Note that Requires.jl unambiguously associates any conditional names with 1 of the packages; should Yao.curand_state be documented in Yao.jl but clarify “this doesn’t exist unless you also imported CUDA.jl”? How many such conditional names can bloat a package’s documentation, and how should they be organized if we generalize to extensions of >2 packages? I’ll import a separate CuYao.jl, it’s so much easier to handle.

Packages are not only a way to load code but also to categorize names; dismissing that to save writing a few package names may be a minuscule convenience to users already familiar with the ecosystem, but it’ll definitely be a massive burden to users learning and sharing the libraries, which doesn’t seem like a good tradeoff.

You don’t, like I’ve said, CuYao imports both CUDA and Yao and reexports Yao. using CuYao is much simpler and clearer than using CUDA, Yao, and specifying names actually works using CuYao: curand_state.
