The 4 kinds of Metapackages / API / Higher level packages

I was thinking about this,
we have, as I count it, 3 different kinds of Metapackage / higher-level package (plus a fourth, Class D, that I’m not sure really exists).
Though the term Metapackage is used only for the last 2.
I think especially for Class C maybe?

Class A: *Base packages / Namespace packages / Interface packages

E.g. StatsBase.jl, SklearnBase.jl, LearnBase.jl, ColorTypes.jl, RecipesBase.jl, MathProgBase.jl
These packages primarily exist to provide a set of types, and a set of function names in a common namespace.
A whole pile of packages (often within an org) use them to achieve compatibility and consistency.
By using the common namespace, one avoids name-collisions.
Further, the functions come with a set of conventions that implementers are expected to meet.
StatsBase.jl and SklearnBase.jl actually contain an overlapping set of names, but they have different conventions (SklearnBase.jl is RowMajor and other things). One can actually implement both of them (one of the good things about namespaces).
The types they provide are often abstract, and the functions are often without methods.
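
To make the Class A pattern concrete, here is a minimal sketch. Everything in it is hypothetical: `ToyStatsBase`, `MeanModels`, `AbstractEstimator`, `fit!` and `predict` are made-up names, written as plain modules so it runs without installing anything, and they do not reflect the actual contents of StatsBase.jl or any other package named above.

```julia
# Hypothetical interface "package": an abstract type plus function names
# that have no methods yet, only an agreed convention.
module ToyStatsBase
    export AbstractEstimator, fit!, predict

    abstract type AbstractEstimator end

    """
        fit!(model, data)

    Convention (assumed for this sketch): mutate `model` and return it.
    """
    function fit! end

    function predict end
end

# Hypothetical implementing package: it adds methods to the shared names
# instead of defining its own `fit!`, so there is no name collision.
module MeanModels
    import ..ToyStatsBase: AbstractEstimator, fit!, predict

    mutable struct MeanModel <: AbstractEstimator
        mean::Float64
    end
    MeanModel() = MeanModel(0.0)

    function fit!(m::MeanModel, xs::AbstractVector{<:Real})
        m.mean = sum(xs) / length(xs)
        return m
    end

    predict(m::MeanModel) = m.mean
end

using .ToyStatsBase
model = fit!(MeanModels.MeanModel(), [1.0, 2.0, 3.0])
result = predict(model)
```

User code only needs the names from the interface module; any number of implementing packages can add methods to the same `fit!` and `predict`.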

A secondary purpose of these packages is to provide useful helpers for packages that build on them;
normally using the types and functions defined in the package.
Taken too far, this can cause problems; for example, StatsBase.jl is quite heavy to install because it depends on a number of packages, including the large DataStructures.jl.
In general this kind of package should have very few, or no dependencies. Maybe Compat.jl and/or another Base package.
Adding too much functionality also moves away from describing an interface and into just being a normal package, that other packages often extend.

Class B: Multi-backend / Front-end Package

E.g. JuMP, Plots.jl
These packages expose some useful functionality, and a standard API.
They do some real work and have real, non-empty functions.
But they let the bulk of the work be done via some backend package, which they make it possible to swap-out.
JuMP.jl facilitates this via all its backends implementing the Class A interface package, MathProgBase.jl.
Plots.jl does not; it uses Requires.jl and some other shenanigans.
WordTokenizers.jl is sort of down this line, but all of its back-ends bar one, are contained in the package itself.

It is a notable trait of these packages that they have many optional dependencies.
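
A rough sketch of the Class B shape, under loud assumptions: `ToyPlots`, `DigitBackend`, `BarBackend`, `backend!` and `render` are all invented for illustration, and this is not how JuMP or Plots.jl are actually implemented. The front end does some real work itself (validating and rescaling the data) and hands the rest to whichever backend is currently selected.

```julia
# Hypothetical front-end package with a swappable backend.
module ToyPlots
    abstract type AbstractBackend end

    # Currently selected backend; unset until `backend!` is called.
    const CURRENT_BACKEND = Ref{AbstractBackend}()
    backend!(b::AbstractBackend) = (CURRENT_BACKEND[] = b)

    # Backends add methods to this function.
    function render end

    # The front end's own work: validate and rescale to [0, 1], then delegate.
    function plot(ys::AbstractVector{<:Real})
        isempty(ys) && throw(ArgumentError("nothing to plot"))
        lo, hi = extrema(ys)
        span = hi == lo ? 1.0 : hi - lo
        scaled = [(y - lo) / span for y in ys]
        return render(CURRENT_BACKEND[], scaled)
    end
end

# Two hypothetical backends implementing the same `render` hook.
module DigitBackend
    import ..ToyPlots
    struct Digits <: ToyPlots.AbstractBackend end
    ToyPlots.render(::Digits, scaled) =
        join((string(round(Int, 9 * s)) for s in scaled), " ")
end

module BarBackend
    import ..ToyPlots
    struct Bars <: ToyPlots.AbstractBackend end
    ToyPlots.render(::Bars, scaled) = ["#"^round(Int, 9 * s) for s in scaled]
end

ToyPlots.backend!(DigitBackend.Digits())
as_digits = ToyPlots.plot([0, 1, 3])

ToyPlots.backend!(BarBackend.Bars())
as_bars = ToyPlots.plot([0, 1, 3])
```

The user-facing call `ToyPlots.plot(...)` stays the same when the backend is swapped; in a real package the backends would live in separate, optional dependencies.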

Class C: Standard Environments / Metapackages

E.g. DifferentialEquations.jl, MLDataUtils.jl, proposed Stats.jl
This kinda package is built around Reexport.jl,
has a large(ish) number of dependencies, which it reexports.
The idea is that the user just imports this package and has a full batteries-included environment for doing a task.
Many users will not even realize the package is actually made up of parts – or at least they won’t have to care.
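
A minimal sketch of what re-exporting amounts to, using made-up modules (`ToyLocation`, `ToySpread`, `ToyStats`) so it runs without any installed packages. With the real Reexport.jl the umbrella module would typically say `@reexport using SomePackage`; here the same effect is written out by hand with `using` plus `export`, which is an assumption about the simplest equivalent rather than a copy of what any of the packages above do.

```julia
# Two hypothetical component packages.
module ToyLocation
    export center
    center(xs) = sum(xs) / length(xs)
end

module ToySpread
    export spread
    spread(xs) = maximum(xs) - minimum(xs)
end

# Hypothetical "standard environment": it defines nothing itself,
# it only loads the parts and exports their names again.
module ToyStats
    using ..ToyLocation, ..ToySpread
    export center, spread
end

# The user loads one module and gets everything.
using .ToyStats
c = center([1, 2, 3])
s = spread([1, 2, 3])
```

The user never has to mention `ToyLocation` or `ToySpread`, which is the "won't have to care" property described above.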

Class D: Glue packages

I am not sure these really exist. I can’t bring any to mind.
I’ve seen glue packages argued for in the past; they were meant to serve a purpose similar to what Requires.jl seems to be used for now.
For optional dependency of A upon B, have a glue package that depends on both and adds the functionality.

Thoughts?
Better names for these?


Class E: The Extendable Monopackages
E.g. LightGraphs.jl

This is kind of between a Class A Base package and a normal package.
Unlike a Base package, it actually does something on its own.
It is extensively extendable, and extended by child packages.
This kind of pattern could be used to provide core functionality,
that child packages plug-in to.

I like your ideas on classifying types of packages. It might help people decide on what are best practices for creating an ecosystem of packages (which is a pretty cool thing in Julia).

I actually split things up further in the Str* packages in the JuliaString org, than what you discuss for Class A.

StrAPI provides the (empty) functions and, using ModuleInterfaceTools, sets up the API so that other packages can easily extend it at different levels (i.e. functions/types that are part of the public API for users, and a development API needed by packages that wish to build their own extensions). As you recommend, it has no other dependencies, except for the very small ModuleInterfaceTools.

CharSetEncodings provides the CharSet, Encoding, and CSE (Character Set Encoding) parameterized types, helper functions for working with those types, some basic traits that can be used with them, as well as setting up a small set of predefined CharSet, Encoding, and CSE concrete types (for ASCII, ISO-8859-1, UCS-2 Unicode (16-bit, no surrogates, BMP only), full 32-bit UTF-32 Unicode, as well as validated UTF-8 and UTF-16). This only depends on StrAPI.

ChrBase and StrBase provide the basic string functionality, but they depend on StrAPI to provide the APIs, and CharSetEncodings to provide the types.

Then there are other packages, such as StrRegex and StrLiterals, that add extra functionality to StrBase.
StrLiterals is a class of package that you haven’t really discussed, in that it allows for plug-in extensions to its behavior. Simply by using an extension package, which can add one or more items to a dictionary from StrLiterals in its __init__ code, you can seamlessly add new format sequences.
StrFormat adds format sequences to handle C printf style formatting, Python-like formatting, as well as a “Julian” form that gets defaults based on the argument type(s), and accepts keyword arguments, all within the string literal.
StrEntities adds in 4 different types of entities: LaTeX and Emoji entities, similar to the ones at the Julia REPL, as well as HTML and Unicode entities. It does this in the same way, by adding four handlers that extend the package simply by doing using StrEntities.
In some sense these are like your Class D glue packages: StrLiterals doesn’t have the large dependencies, but StrFormat can plug in the code from Format, and StrEntities pulls in the packages and loaded tables from StrTables, Unicode_Entities, LaTeX_Entities, Emoji_Entities, and HTML_Entities.
Edit: now that I see your follow on message, maybe these could be considered class E, because they don’t add any new function names, types, or macros, they simply extend the way their base (in this case StrLiterals) acts.
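
For anyone who wants to see the plug-in idea in code, here is a stripped-down sketch. It is not the real StrLiterals code: `ToyLiterals`, `ToyEntities`, `HANDLERS` and `expand` are hypothetical names, and the real packages hook into string literal parsing rather than a plain function call. The only part taken from the description above is the mechanism itself: the extension adds entries to a dictionary owned by the base package from its `__init__`.

```julia
# Hypothetical extendable base package: it owns a handler table
# and knows nothing about any particular extension.
module ToyLiterals
    const HANDLERS = Dict{String,Function}()

    function expand(kind::AbstractString, arg::AbstractString)
        handler = get(HANDLERS, kind, nothing)
        handler === nothing &&
            throw(ArgumentError("no handler registered for \"$kind\""))
        return handler(arg)
    end
end

# Hypothetical extension package: adds no new exported names,
# it only registers a handler when the module is initialized.
module ToyEntities
    import ..ToyLiterals

    const LATEX = Dict("alpha" => "α", "beta" => "β")
    latex(name) = get(LATEX, name, name)  # unknown names pass through

    function __init__()
        ToyLiterals.HANDLERS["latex"] = latex
    end
end

expanded = ToyLiterals.expand("latex", "alpha")
```

Loading the extension is the whole user-facing step; after that the base package's own entry point understands the new sequence.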

Finally, Strs is what you refer to as a class C package: it pulls in all the other packages I mentioned above, as well as a nice InternedStrings package, even extending its string literal macro i"..." to handle all of the extensions that are present with StrLiterals, returning a nicely interned Str type string (although that’s kind of a bit more like the idea of your Class D, adding extra functionality from combining different packages).

I’m horrible at naming, but I do think nobody will remember which one is A, B, C, or D.
Here are my ideas:

  1. Interface packages (like my StrAPI)
  2. Type/Trait packages (like my CharSetEncodings)
  3. Base packages (like your Class A, minus the parts I’d put into separate Interface & Type/Trait packages, i.e. my ChrBase & StrBase)
  4. Extendable packages (like my StrLiterals, your class E?)
  5. Extension packages (like StrFormat and StrEntities)
  6. Front-end packages (i.e. your Class B)
  7. Meta/Environment/Ecosystem packages (your Class C, like my Strs, and DifferentialEquations)
  8. Composite packages (your Class D)

Throw your tomatoes :tomato::tomato::tomato::tomato::tomato: at will! :nerd_face: