Base.modules_warned_for silently removed in 1.8

A lightweight way to improve on what already exists would be:

Exported-public - anything exported is deemed public

Private - anything not exported is deemed private by default

Not-exported public - need a keyword or macro to mark entities as not exported but are part of the API.

That means the degree of change necessary to implement is restricted to methods you want to be public but not exported, which should be relatively few.

Tooling can warn if private entities are used from another module, rather than have any enforcement by the compiler. Have another keyword or macro that allows it to be suppressed, either for a particular line, block of code, or whole module.

No need to enforce a naming convention that would require ecosystem-wide refactoring of all internal code.

3 Likes

Python at least has more of a culture and tooling (linters, etc.) to support the idea of private methods and attributes. Just one example: Some linters will require you to document a method unless it starts with _.

The same convention is sometimes used in Julia. But, it’s much less widespread, and more poorly supported by tooling.

2 Likes

One thing I can fully agree on is that it sucks when someone contributes code and your library breaks a year later because they snuck in an undocumented internal Base method to make something work but didn’t make a note or add a check to the tests. It makes sense to want some guard against that beyond proofreading code at 100% efficiency.

It just doesn’t make sense to appropriate OOP access modifiers for this, Julia doesn’t have classes for access modifiers to encapsulate. Roughly speaking, OOP encapsulation designates what components of a class are accessible by other classes (public methods and few fields) and what are confined implementation details (private fields, methods, even nested classes). It’s not the same, but a non-OOP encapsulation is accomplished by modules separating global scopes, exporting select names, or importing select names. That’s why exported names came up in the thread as a hint of public API (though still not as authoritative as documentation).

The talk about access modifiers is suggesting that we should apply to multimethods in modules something designed for members in classes. There’s a lot wrong with this:

  1. Internals aren’t confined to their home module, they’re free to be exported within the package to the convenience of the developer and out of the users’ sight.
  2. The unit in which an internal operates is not a module, it’s the package. So you could invent access modifiers on a package level…if you’re prepared to sacrifice the ability to inspect and test internals easily. There’s also a complicated unsettled debate on how to develop and test (or not) private members.
  3. Unlike a singly dispatched method confined to a class, a multimethod’s definition can be scattered, even across packages. What happens when some methods are marked @private and the others are marked @public? This straightforward inspiration is just a footgun.
  4. What about global variables and macros? Macros aren’t even variables or instances.

There’s been some talk about informal conventions like leading underscores, but I’m not sure it’ll solve the issue at hand. It saves someone from checking documentation or exports to know if something is public vs internal, but a proofreader can easily overlook it anyway. Without printing warnings or throwing errors, this’ll only make a very rare mistake somewhat easier to spot. And remember that all our internals would have leading underscores, so when you search the text ._, most of it would be your internals. And if we want to make an internal public in the next version? Time to find-replace-all every file.

7 Likes

Thats’s the first time I hear about this notifications thing. Whats’s the notification mechanism, and how to opt into it?

No need for opt in. Last I checked, it was either an issue in your repo, or a revert of the offending change in Base, provided it was an accidental API breakage, not a change to an internal function that doesn’t have stability guarantees. Other cases are when you test for exact numerical reproducibility of e.g. the default RNG number stream or numerical results based on that.

1 Like

You can also add a PkgEval badge to your readme if you want: ANN: Nanosoldier package evaluation -- with badges!

3 Likes

Possibly, but note that in this particular case, @essenciary himself introduced this code. To some extent, this whole discussion is not unlike complaining about missing toes after purchasing, loading, and firing a footgun, that had FOOTGUN printed on it in large block letters.

Again, I am not necessarily disagreeing — if we had a good technical solution here, it would make sense to apply it. Just want to point out that this is a relatively benign bug, because

  1. it can be easily tested for ex ante (@assert isdefined(SomeModule, :internals_I_used)), and caught by CI,

  2. if it is nevertheless missed, things break immediately and visibly, instead of some silent error that corrupted something and you discover a year later,

  3. it can be tracked down super-quickly (top of the stack trace).

Unfortunately, 99%+ of bugs are nastier than this, because of one reason or another. This is why people do use internals from other modules in Julia as a workaround, in an admittedly rather informal way, and also why there is no great pressure to fix this — it is relatively innocuous as bugs go.

1 Like

@Tamas_Papp as already mentioned, it’s a generic discussion about methodology and the general way of doing things. The issues I raised stand, regardless of the exact specific of this incident. I would appreciate it if you’d keep the discussion at technical topics and abstain from comments about people.

2 Likes

Proposed breaking changes to the language’s design aside, what does Base.modules_warned_for even do? The most I could figure out is Base.modules_warned_for == Set{Base.PkgId}(). Any chance the removed code is simple enough to be repurposed for your package’s own internal code? Seems a bit like plagiarism but maybe this is fair game for internal code? Just seems like a faster option than contributing to the language standard.

It does seem like an odd internal to depend on. It was there as a state tracker to avoid warning about broken package resolves multiple times in some code that was put in place to ween users onto routinely using environments. It didn’t actually work in a lot of cases due to precompilation warnings happening in nested processes, thus the warning state didn’t propagate.

I’m genuinely interested to see why it’s needed, but perhaps that discussion could move to an issue on the package it’s needed in. Feel free to tag me, given it was my PR that removed it.

6 Likes

Long story short:

  • Genie apps have 2 main components - the Genie package and the user app (a module with multiple submodules). These 2 parts cooperate, the user app using features exposed by Genie, and Genie providing APIs to manage user app files.

  • since work on Genie started in Julia 0.5, before Pkg, it was affected by multiple breaking changes in Julia’s code loading system.

  • after the introduction of Pkg, LOAD_PATH was used to load user code/modules by adding their folders to LOAD_PATH and this worked well for many years and for many Julia versions.

  • I think around v1.6 the user code loading started to throw a warning (why would it start to warn with a minor release, no idea) and a user suggested Base.modules_warned_for in order to get rid of the warning.
    (An attempt to figure out what changed here Module correctly loaded from LOAD_PATH but still getting a dependencies warning)

  • That made sense so I introduced the proposed patch - tests passed, and the code worked as previously so no big deal with a warning I thought.

  • Yes, I personally introduced the patch as some people here uselessly spent time to dig - sorry, I didn’t mean to mislead, I remembered it was a PR but I guess it was just a proposal.

  • No, I did not go through the still-unbelievable requirement of manually checking if this function was documented/exported/in-the-manual. I have not considered it then because even now that I know of it, this requirement for manual check, for a modern programming language, in 2022, is so ridiculous and absurd that it leaves me speechless.

  • Moving on to v1.8, it appears that not only has modules_warned_for been removed, causing the breaking change, but the user code loading strategy has changed and the user code loading that used to warn out, now doesn’t work at all.

  • So it’s not specifically about modules_warned_for nor about Genie as I’ve actually solved the code problem in the soon to be released Genie v5. I took this opportunity to completely refactor the user code loading, move many modules into out-of-core plugins, get rid of many compromises done over so many Julia versions, etc, etc so it’s a nice cleanup and I’m happy with the outcome that this change has forced.

  • So it’s a discussion about principles and methodology with a few related topics:

1/ having a way to programatically check if a function/property is private or public at module and package level (by failing tests/not compiling)

2/ raising the point that, IMO, the proposed solution of manually checking the manual/documentation/export status of every function and property is a ridiculous proposal. (I know “ridiculous” is a strong word but I can’t think of anything else: a programming language is supposed to automate tasks, how can it have a policy to create a requirement for a human to manually check every function and property?!! My mind can not comprehend.)

3/ the fact that, to my experience, modules_warned_for is not the only breaking change introduced - as per my experience, code loading strategies have also changed, leading to breaking changes. So overall, much more consideration should be given to changes.

4 Likes

I think you’re overblowing the issue. It’s not totally irrational after your package broke, like how people can become wary of stoves after a burn. But people just aren’t routinely stumbling into variables and then checking the documentation every time. It’s the other way around: people read documentation to learn how to use a package. Even on forum discussions it’s really rare for people to suggest anything that’s not documented, and when that happens, someone usually points out “btw that’s internal so don’t make this part of your package or anything”. You were just unlucky to get a suggestion that didn’t get checked. I guess the lesson here is “learn usage from the documentation,” and on the super rare occasion someone suggests a name you’ve never read about in a package you were already familiar with, search the documentation again to either learn about it in the usual manner or find out it’s not public API you could rely on.

3 Likes

I disagree — I think it is a servicable solution (Incidentally, repeatedly resorting to hyperbole does not really advance your argument; quite the opposite). I will elaborate on this below.

First, how you find symbols (functions, variables, macros) to use from a package matters. Personally, I usually read the docs first, so anything I end up using is part of the API. Symbols don’t just show up out of the blue (for PRs, I check the docs if I am unsure).

Second, most packages follow the convention that the API is exported (there are of course exceptions, like ForwardDiff, but not many). You really have to make an effort to use an unexported symbol, it is quite hard to do accidentally.

Third, the thanks to multiple dispatch, the vocabulary of most Julia packages is quite small.

2 Likes

@Benny @Tamas_Papp of course it’s an exaggeration, I took the proposal to its logical extreme to make a point (like BigO, thinking of the worst case scenario). We ourselves we’ll use APIs that we learned and understand, it’s not like we have to double check every function that we type every time we type.

But there is a common scenario where you handle code that uses/imports from other modules and packages, written by other people, where this becomes a problem and where indeed there is no way to programatically verify public/private access.

The fact that non-exported code can be imported and that many people prefer full qualified names even for exported APIs, make this process of discovery of access even more complicated. So indeed, in such scenarios, the only way to do it is for a human to manually check each and every usage.

This compounds and can be a major blocker for developing large applications in large teams (including Open Source).

Possibly, but I would wait for them to complain about it then. My understanding is that your particular issue wasn’t an instance of this (it was code you yourself introduced, no large team involved, and you must have been fully aware of using an internal symbol), so I consider it premature to extrapolate to other scenarios.

I think it is best if language development is shaped by actual problems and use cases, instead of speculation.

2 Likes

The solution was offered by a contributor. Also @Tamas_Papp I have yet to hear from you a technical argument against access modifiers.

1 Like

I mostly agree with @Benny’s comment above about access modifiers.

Generally, I am not in favor of the “I made a mistake, so let’s redesign the language so that it doesn’t happen again” approach, as I am resigned to the fact that I make mistakes all the time. Nevertheless, I don’t know enough about the details of your proposal to evaluate it seriously. Perhaps make a PR.

2 Likes

But that function is not supposed to be used by any other packages outside Base. Genie.jl should take care of it when it has been removed or its behavior changes.

1 Like

I don’t think that’s correct - LOAD_PATH was always meant for loading packages & projects, which need to have versioning for them, not just some code that happens to be there. Requiring those to be added to the current environment so Pkg can actually do dependency resolution & track whether the code there changed seems like a very good requirement to have. That’s what Pkg is there for after all - dependency management.

The warning got added precisely because people were misusing LOAD_PATH and similar for purposes that are outside of their intended use, i.e. not adhering to the API provided. As a warning it didn’t break anything, it’s not an error after all. Its only purpose is to notify you as a developer of a problem with the way you’re using something. That you chose to silence that warning via an internal that resulted ultimately in a true broken behavior for your package when it got removed, instead of addressing the core issue the warning was telling you about, is unfortunate, but really outside of the control of julia developers.

That’s great to hear! I wish more people would take the time to refactor their code bases after a few versions/LTS rounds. There are often a number of cludges we as developers tend to accumulate over the lifetime of a project, which we can only remedy once we try to reorient ourselves after a newer API.

I really want to reiterate that I do agree that something here would be better than the nothing we have now. We just don’t have anything of the sort right now, and I also want to make clear that I don’t think trying to prevent people from accessing some function entirely on the compiler level is the way to go for this. The nature of julia being a dynamic language and the desire for allowing people to reach into code they didn’t write and intentionally use it differently from the way the original creator intended is in direct opposition to saying “no, you can’t do that” on the compiler level. So I’d be very much in favor of this warning, with the option to silence the warning on a per-internal-object level, as long as that comes with the understanding that using code in such a way may break at any time and does not fall under any stability guarantees through semver. If it breaks, it’s on you.

4 Likes

@kwdef is a good example - it is widely used and often recommended, so one wouldn’t normally check whether it is NOT in the documentation. But one would probably read the help/docstrings before use.

Wouldn’t it be possible to add some keyword to doctrings of those function which ARE part of the API, or at least to the few which are in the API but not exported? E.g.isAPI.

Then:

  • if it is exported, it is API
  • it it is marked with isAPI, it is API
  • otherwise not

Such a check could be built in into a linter, and would only require a limited refactoring of extant code.