@erichanson recently linked this discussion in Beacon’s Slack, and we were discussing how interesting it is to get a window into other orgs’ approach to this sort of thing. It made us realize that it might be useful for us to share a bit more about Beacon’s approach, so I wrote this up post-feast yesterday - hope it’s interesting!
It’s probably important to preface the rest of this content with some background on Beacon’s motivation for a monorepo in the first place, because that motivation pretty directly informs our approach. I make no claims that our practices will necessarily be applicable to folks in different environments.
At Beacon, we actually generally default to a multirepo approach for all independently useful library-style packages, but all of the production services/applications that underpin our core platform are housed in a single monorepo. Not all of the code in this platform monorepo is Julia, but a good chunk of it is.
Paraphrasing from a relevant section of Beacon’s architecture journal:
We’ve chosen to develop our platform in a monorepo fashion to start, but reserve the option to break it out into a multirepo configuration at a later time. Our platform consists of multiple “components”: loosely coupled services/applications all developed in accordance with the same CI/CD configuration, infrastructure, and engineering practices.
We’re building multiple different components with a comparatively small team. By developing in a monorepo from the get-go, it becomes a bit easier for us to…
- …impose/enforce uniform cross-component practices/structures across the codebase
- …rapidly prototype new cross-component structures across the codebase
- …lower potential cross-repository synchronization overhead during a development period in which different cross-component boundaries are still being explored
As our platform grows, each given component’s API boundary matures and dedicated teams may evolve around specific components. When a given component reaches that point, we may choose to split out a matured component into its own repository in order to enable its team to function more independently.
In other words, we started with the monorepo route more to incubate greenfield development efforts executed by a small team and provide a low-overhead ramp to architectural maturation than out of a desire to opt into (and/or optimize for) the long-term tradeoffs traditionally associated with a full-blown monorepo paradigm.
For the most part, our top architectural priority when we started was to design/implement ideal cross-component API boundaries, and we figured a monorepo structure would give us the optimal environment to rapidly prototype (and battle-test) such boundaries without incurring additional overhead associated with “baking down” a given architectural scheme into a given repository structure. By now, these boundaries are pretty much drawn/stable, but we still haven’t broken out the monorepo into a multirepo since a strong enough need hasn’t arisen that would drive us to do so. Probably will eventually do so, though.
I wanted to explicitly call out this background here, because it affords Beacon some leeway that might not be afforded to a team that was aiming for a “much more monorepo-y” approach to their monorepo. For example, our monorepo relies on Beacon’s internal package registry, which lives in its own repo, not in the monorepo (though I suppose it could, if we desired that).
With that background out of the way, here are the actual relevant Beacon practices that I thought it’d be valuable to share, heavily paraphrased from Beacon’s internal documentation. Beaconeers might note that I’ve added/removed details as needed for a general audience, and have translated a few of our language-agnostic practices into Julia-specific manifestations of those practices for this post.
Each Julia package developed at Beacon (including those in our monorepo)…
…should be developable/deployable, testable, versioned, released/registered, and documented independently of other packages, for some reasonable interpretation of “independence” (hopefully well-enough characterized by the rest of the points). Importantly, no package should depend on another package’s encapsulated implementation details, only on documented APIs.
…whose version is >= `v0.1.0` may only depend on Julia packages that are registered in a package registry (either `General`, or Beacon’s internal registry).
…must declare compatibility bounds for all non-`Base` dependencies in their `Project.toml`. At a minimum, dependencies must be upper-bounded at their most recently released major version (or minor version, if the dependency is in a
…should not contain a `Manifest.toml` checked into version control, if intended to be used as a “library-style” dependency of another package (i.e. its sole purpose is to provide code that should be directly invoked from within other code, and it doesn’t back a standalone service/application). This forces developers (and more importantly, CI) to independently resolve the package’s dependencies, which is more consistent with downstream environments in which dependencies will be independently resolved without regard to your personal `Manifest.toml`. This practice also prevents a shared `Manifest.toml` from accidentally masking reproducibility issues with the package’s declared compatibility bounds in freshly-resolved downstream environments.
…should not contain “checked-in direct filesystem-level dependencies” on any content that is not owned by the package. All such dependencies should instead be intermediated by explicit APIs and/or proper package management. For example, a package should never directly `include` a script/file that lives outside the package. Another example, which touches on the previous `Manifest.toml` requirement: Imagine you’re developing the Julia packages `A` and `B`, where `A` depends on `B`. `A`’s dependency on `B` must be declared against a registered/released version of `B`, not declared as a filesystem reference via `Pkg.dev`. Package authors may still utilize `dev` locally for development purposes, of course, as long as they do not check in a
…should maintain its own unit tests, and - if useful, especially for application packages - integration tests. Each package’s unit tests should be runnable/passable via `Pkg.test` in CI without requiring a checked-in `Manifest.toml`. Each package’s integration tests (if such tests exist) should stress interaction points with targeted direct upstream components; these tests should not target indirect upstream components. In other words, test against your dependencies, not your dependencies’ dependencies.
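To make a couple of the rules above concrete, here is a minimal sketch of the kind of check a CI job could run against a package directory. It is not Beacon’s actual tooling: the function names `missing_compat` and `check_package`, the `stdlibs` keyword, and the example package are all hypothetical, and the only library used is Julia’s `TOML` standard library (Julia 1.6+). It flags dependencies that lack a `[compat]` entry and flags a `Manifest.toml` sitting in the package directory (which, in a fresh CI checkout, means it was checked in).

```julia
using TOML

# Names listed under `[deps]` that have no `[compat]` entry.
# `stdlibs` is a caller-supplied list of names to skip (an assumption of this
# sketch; whether standard libraries need bounds depends on your own policy).
function missing_compat(project::AbstractDict; stdlibs = String[])
    deps = get(project, "deps", Dict{String,Any}())
    compat = get(project, "compat", Dict{String,Any}())
    return sort([name for name in keys(deps)
                 if !(name in stdlibs) && !haskey(compat, name)])
end

# Collect human-readable problems for the package rooted at `dir`.
function check_package(dir::AbstractString; stdlibs = String[])
    project = TOML.parsefile(joinpath(dir, "Project.toml"))
    problems = String[]
    for name in missing_compat(project; stdlibs = stdlibs)
        push!(problems, "no [compat] entry for $name")
    end
    if isfile(joinpath(dir, "Manifest.toml"))
        push!(problems, "Manifest.toml is present")
    end
    return problems
end

# Hypothetical package `A` that depends on `B` but declares no bound for it.
# The UUIDs are placeholders for illustration only.
project = TOML.parse("""
name = "A"
version = "0.1.0"

[deps]
B = "00000000-0000-0000-0000-000000000001"
Test = "00000000-0000-0000-0000-000000000002"

[compat]
julia = "1.6"
""")

missing_compat(project; stdlibs = ["Test"])  # returns ["B"]
```

This sketch only checks that a bound exists; it does not verify that the bound is capped at the latest released major (or minor) version, which would require consulting the registry.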
Note that in practice, the composition of these rules can cause cross-package changes to require multiple PRs - something that a more monorepo-centric team may seek to avoid. For example, if I have packages `A` and `B` where `A` depends on `B`, then it takes at least two PRs to land a breaking change in `B` and propagate it to `A`:
- A PR is opened/merged which implements the `B` change.
- Once this PR has landed, the `B` change is tagged/registered with Beacon’s package registry.
- A second PR is opened/merged which propagates the change to `A`.
- Once this PR has landed, the `A` change is tagged/registered with Beacon’s package registry.
We consider this a desirable feature of our approach, but YMMV.
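One way to see why the second PR is unavoidable under these rules: a breaking change in a dependency is released as a new major version, and a compat bound capped at the previous major version excludes it. The snippet below is a deliberately simplified illustration using only `Base`’s `VersionNumber`; the helper `caret_allows` and the version numbers are hypothetical, and `Pkg`’s real resolver handles `0.x` versions and other specifier forms that this sketch ignores.

```julia
# Simplified caret-style check, meaningful only for bounds >= 1.0.0:
# a bound such as "1.2" admits versions in the range [1.2.0, 2.0.0).
caret_allows(bound::VersionNumber, v::VersionNumber) =
    bound.major >= 1 && v.major == bound.major && v >= bound

# Hypothetical bound that A declares for B before the second PR.
bound_before = v"1.2.0"
caret_allows(bound_before, v"1.3.0")  # true: non-breaking B release is picked up
caret_allows(bound_before, v"2.0.0")  # false: breaking B release is excluded

# The second PR updates A's code and raises its bound on B.
bound_after = v"2.0.0"
caret_allows(bound_after, v"2.0.0")   # true
```

Until the second PR lands, a freshly resolved environment for `A` keeps getting the last compatible `B` release, so `A` stays working while the breaking change is being propagated.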
I can’t speak for how widely applicable it is, since I only have “anecdata” from within Beacon, but I hope this post is at least interesting/helpful to folks who are curious about how Julia code can be internally packaged within an industry setting. For us, at least, I feel that following these rules - which isn’t always easy to do - generally allows a bunch of other things to “just” work.
One last thing I’d like to note: I’m consistently blown away by how `Pkg`’s thoughtful design supports so many organizational configurations so cleanly, as long as you “go with the flow” of its design. At this point, it’s hard not to feel like other packaging tools/ecosystems are a PITA by comparison. A big shout out to all of `Pkg`’s authors/contributors for such a great ecosystem-empowering tool.