Base.modules_warned_for silently removed in 1.8

It does seem like an odd internal to depend on. It was there as a state tracker to avoid warning about broken package resolves multiple times in some code that was put in place to ween users onto routinely using environments. It didn’t actually work in a lot of cases due to precompilation warnings happening in nested processes, thus the warning state didn’t propagate.

I’m genuinely interested to see why it’s needed, but perhaps that discussion could move to an issue on the package it’s needed in. Feel free to tag me, given it was my PR that removed it.

6 Likes

Long story short:

  • Genie apps have 2 main components - the Genie package and the user app (a module with multiple submodules). These 2 parts cooperate, the user app using features exposed by Genie, and Genie providing APIs to manage user app files.

  • since work on Genie started in Julia 0.5, before Pkg, it was affected by multiple breaking changes in Julia’s code loading system.

  • after the introduction of Pkg, LOAD_PATH was used to load user code/modules by adding their folders to LOAD_PATH and this worked well for many years and for many Julia versions.

  • I think around v1.6 the user code loading started to throw a warning (why would it start to warn with a minor release, no idea) and a user suggested Base.modules_warned_for in order to get rid of the warning.
    (An attempt to figure out what changed here Module correctly loaded from LOAD_PATH but still getting a dependencies warning)

  • That made sense so I introduced the proposed patch - tests passed, and the code worked as previously so no big deal with a warning I thought.

  • Yes, I personally introduced the patch as some people here uselessly spent time to dig - sorry, I didn’t mean to mislead, I remembered it was a PR but I guess it was just a proposal.

  • No, I did not go through the still-unbelievable requirement of manually checking if this function was documented/exported/in-the-manual. I have not considered it then because even now that I know of it, this requirement for manual check, for a modern programming language, in 2022, is so ridiculous and absurd that it leaves me speechless.

  • Moving on to v1.8, it appears that not only has modules_warned_for been removed, causing the breaking change, but the user code loading strategy has changed and the user code loading that used to warn out, now doesn’t work at all.

  • So it’s not specifically about modules_warned_for nor about Genie as I’ve actually solved the code problem in the soon to be released Genie v5. I took this opportunity to completely refactor the user code loading, move many modules into out-of-core plugins, get rid of many compromises done over so many Julia versions, etc, etc so it’s a nice cleanup and I’m happy with the outcome that this change has forced.

  • So it’s a discussion about principles and methodology with a few related topics:

1/ having a way to programatically check if a function/property is private or public at module and package level (by failing tests/not compiling)

2/ raising the point that, IMO, the proposed solution of manually checking the manual/documentation/export status of every function and property is a ridiculous proposal. (I know “ridiculous” is a strong word but I can’t think of anything else: a programming language is supposed to automate tasks, how can it have a policy to create a requirement for a human to manually check every function and property?!! My mind can not comprehend.)

3/ the fact that, to my experience, modules_warned_for is not the only breaking change introduced - as per my experience, code loading strategies have also changed, leading to breaking changes. So overall, much more consideration should be given to changes.

4 Likes

I think you’re overblowing the issue. It’s not totally irrational after your package broke, like how people can become wary of stoves after a burn. But people just aren’t routinely stumbling into variables and then checking the documentation every time. It’s the other way around: people read documentation to learn how to use a package. Even on forum discussions it’s really rare for people to suggest anything that’s not documented, and when that happens, someone usually points out “btw that’s internal so don’t make this part of your package or anything”. You were just unlucky to get a suggestion that didn’t get checked. I guess the lesson here is “learn usage from the documentation,” and on the super rare occasion someone suggests a name you’ve never read about in a package you were already familiar with, search the documentation again to either learn about it in the usual manner or find out it’s not public API you could rely on.

3 Likes

I disagree — I think it is a servicable solution (Incidentally, repeatedly resorting to hyperbole does not really advance your argument; quite the opposite). I will elaborate on this below.

First, how you find symbols (functions, variables, macros) to use from a package matters. Personally, I usually read the docs first, so anything I end up using is part of the API. Symbols don’t just show up out of the blue (for PRs, I check the docs if I am unsure).

Second, most packages follow the convention that the API is exported (there are of course exceptions, like ForwardDiff, but not many). You really have to make an effort to use an unexported symbol, it is quite hard to do accidentally.

Third, the thanks to multiple dispatch, the vocabulary of most Julia packages is quite small.

2 Likes

@Benny @Tamas_Papp of course it’s an exaggeration, I took the proposal to its logical extreme to make a point (like BigO, thinking of the worst case scenario). We ourselves we’ll use APIs that we learned and understand, it’s not like we have to double check every function that we type every time we type.

But there is a common scenario where you handle code that uses/imports from other modules and packages, written by other people, where this becomes a problem and where indeed there is no way to programatically verify public/private access.

The fact that non-exported code can be imported and that many people prefer full qualified names even for exported APIs, make this process of discovery of access even more complicated. So indeed, in such scenarios, the only way to do it is for a human to manually check each and every usage.

This compounds and can be a major blocker for developing large applications in large teams (including Open Source).

Possibly, but I would wait for them to complain about it then. My understanding is that your particular issue wasn’t an instance of this (it was code you yourself introduced, no large team involved, and you must have been fully aware of using an internal symbol), so I consider it premature to extrapolate to other scenarios.

I think it is best if language development is shaped by actual problems and use cases, instead of speculation.

2 Likes

The solution was offered by a contributor. Also @Tamas_Papp I have yet to hear from you a technical argument against access modifiers.

1 Like

I mostly agree with @Benny’s comment above about access modifiers.

Generally, I am not in favor of the “I made a mistake, so let’s redesign the language so that it doesn’t happen again” approach, as I am resigned to the fact that I make mistakes all the time. Nevertheless, I don’t know enough about the details of your proposal to evaluate it seriously. Perhaps make a PR.

2 Likes

But that function is not supposed to be used by any other packages outside Base. Genie.jl should take care of it when it has been removed or its behavior changes.

1 Like

I don’t think that’s correct - LOAD_PATH was always meant for loading packages & projects, which need to have versioning for them, not just some code that happens to be there. Requiring those to be added to the current environment so Pkg can actually do dependency resolution & track whether the code there changed seems like a very good requirement to have. That’s what Pkg is there for after all - dependency management.

The warning got added precisely because people were misusing LOAD_PATH and similar for purposes that are outside of their intended use, i.e. not adhering to the API provided. As a warning it didn’t break anything, it’s not an error after all. Its only purpose is to notify you as a developer of a problem with the way you’re using something. That you chose to silence that warning via an internal that resulted ultimately in a true broken behavior for your package when it got removed, instead of addressing the core issue the warning was telling you about, is unfortunate, but really outside of the control of julia developers.

That’s great to hear! I wish more people would take the time to refactor their code bases after a few versions/LTS rounds. There are often a number of cludges we as developers tend to accumulate over the lifetime of a project, which we can only remedy once we try to reorient ourselves after a newer API.

I really want to reiterate that I do agree that something here would be better than the nothing we have now. We just don’t have anything of the sort right now, and I also want to make clear that I don’t think trying to prevent people from accessing some function entirely on the compiler level is the way to go for this. The nature of julia being a dynamic language and the desire for allowing people to reach into code they didn’t write and intentionally use it differently from the way the original creator intended is in direct opposition to saying “no, you can’t do that” on the compiler level. So I’d be very much in favor of this warning, with the option to silence the warning on a per-internal-object level, as long as that comes with the understanding that using code in such a way may break at any time and does not fall under any stability guarantees through semver. If it breaks, it’s on you.

5 Likes

@kwdef is a good example - it is widely used and often recommended, so one wouldn’t normally check whether it is NOT in the documentation. But one would probably read the help/docstrings before use.

Wouldn’t it be possible to add some keyword to doctrings of those function which ARE part of the API, or at least to the few which are in the API but not exported? E.g.isAPI.

Then:

  • if it is exported, it is API
  • it it is marked with isAPI, it is API
  • otherwise not

Such a check could be built in into a linter, and would only require a limited refactoring of extant code.

Here’s a really simple tool to help with this problem: GitHub - cjdoris/ModuleAccessChecker.jl: Wraps modules to ensure only API is accessed

You get started like this:

julia> import ModuleAccessChecker

julia> ModuleAccessChecker.wrap(@__MODULE__, Base)

julia> XBase.Vector
Vector (alias for Array{T, 1} where T)

julia> XBase.modules_warned_for
ERROR: modules_warned_for is not part of the API
Stacktrace:
[...]

So it creates a wrapped version of the given module. Any accesses which are not API throw errors.

By default, anything exported is considered to be API. So if you access anything not exported, you’ll get an error.

Cool, now if you get an error and you’d actually like to use that property, just add it to the whitelist:

julia> import ModuleAccessChecker

julia> ModuleAccessChecker.wrap(@__MODULE__, Base, whitelist=[:modules_warned_for])

julia> XBase.modules_warned_for
Set{Base.PkgId}()

So assuming you (a) replace all Base.foo accesses with XBase.foo in your codebase and (b) test your code, then you will quickly discover any places you are using non-exported functionality, and you can add this to the whitelist.

Granted it doesn’t automagically discover the unexported API of a module for you, but most packages have very few unexported things in its API, so this is hardly a burden.

And if you use this mechanism to rely on undocumented internals, at least the names of these are recorded in a single place, namely the whitelist argument to the wrap function above.

10 Likes

Yes, “ridiculous” is the mot juste. And I think this is a widely-shared opinion (I’d like to know how widely, but I’ve heard it expressed privately by highly experienced rational coders). I could understand this attitude: this is what we have now and it is working sort of ok; there’s a lot of important stuff to do, but we need to add designing a better solution to the list of important things. But, “there’s no problem here, you are overreacting” is the kind of thing that drives people away from Julia.

Big projects like JuMP already do try to impose order on private/public things (among other wild west features). Of course they do; big successful projects either rely on language features or work out their own system. What these projects can’t do is impose this order on Base and stdlibs.

@essenciary is very clearly not just hot under the collar because of difficulties with one project. He is trying to generalize the issue for the good of the language and all of its users.

7 Likes

Feel like this is just the very visible exception, I can’t think of any other internal that turned into a community darling that has its own issue about being made public. Its docstring even has examples. I bet even if they do figure out a better official alternative, they’ll just update the docstring and keep it around a long time for backwards compatibility.

Very quick work! Am I right in assuming this is intended for the “public” modules, like Base or a package-named module? Because such modules could be using internals exported by internal submodules, so it seems like a bad idea to wrap those submodules.

I believe it is now you that are walking in circles. People of this thread have already showed no interest in having, and you dropped the issue previously, only to use it as argument when a solution do not also solve the enforcement of private fields (which no solution that is not considering it will, and it is not being considered because we agreed to let this go before). Again, I find anti-Julian to have this “feature” (enforced private fields), if it existed and was used it would have hindered my PhD so much that I would probably not be using Julia anymore. So I am willing to put some effort to hinder any possibilities of this being adopted by Julia.

I have wrote the state-of-the-art code for my PhD this way, and I cannot understand how you have this ease of discoverability for undocumented/unexported things. If the binding has no documentation how do you know what it is supposed to do, to be sure you are using it correctly? I am only aware of things that are documented and public, so if I remember something then it is documented and public, how I would be aware of the existence private functions and not of the fact they are private.

This is no argument against a programming check, but it seems like what you find “ridiculous” comes from a specificity of your workflow. This characteristic has never bothered me.

4 Likes

People intentionally basing code on unstable internals and being aware they need a better solution in the near future is quite different from OP’s case where he wasn’t aware and was blindsided after 2 minor revisions of Julia.
I don’t think it’s entirely fair to say that people are claiming there is “no problem here”, a package did blow up. It’s just that when it comes to determining what are stable features, the conventions around usage documentation and exports do work, and the proposed language designs to avoid those conventions were just fundamentally incompatible with the intended purpose, on top of being major revision features. There are also some feasible less-breaking proposals in this thread.
If someone really can’t work around relying on a library’s unstable internals, it seems only fair that they help develop that library’s public API for the next minor revision so people can rely on something stable. Of course, version changes happen more slowly for bigger projects with more moving parts, which incidentally is a good argument against putting more modules into base Julia to emulate something like SciPy.

5 Likes

I one package of mine I used the pattern of documenting the API functions with the standard help entry and the non-API functions with the “extended help”, while the standard help only prints a warning. IMHO if that was the default behavior for exported vs. non-exported (and perhaps a simple macro @api to mark non-exported APIs), the whole issue would be very much reduced.

Example: Help entries · CellListMap.jl

5 Likes

That’s a nice convention with the # Extended Help - I can’t believe I haven’t thought of this before!

But yeah, an explicit @api to mark something as definitely 100% intended as public API would be nice. Makes it easy to see when you’re contributing with a PR whether something is API and makes sure you’re not accidentally making something public that shouldn’t be.

3 Likes

What became of GitHub - JuliaExperiments/PublicAPI.jl: PublicAPI.jl provides a simple API for declaring API without exporting the names? Last I heard people were going to experiment with it, but AFAICT only one package has?

7 Likes

This won’t be a major breaking change like package-level access modifiers, but it still has to contend with multimethods that may be scattered across packages. If one method is marked @api, is the whole function public API? If so, do we allow people, who may not be the original developer, to mark a method outside the function’s parentmodule? If a function can have both @api and internal methods (for the record, I’m already against this), how could a caller method limit itself to only dispatching to the @api methods? This applies to macros too because they multiply dispatch on number of arguments, and there’s the additional difficulty of macro names not being accessible variables (so you can’t do something like ispublic(@eval) ). It’s good enough for types at least, we don’t have multistructs, just parametric ones. Not sure how to make it work for global variables, maybe a module-wise registry of names is needed.

3 Likes