Optional dependencies / Requires.jl


#1

I recently started thinking about conditional modules while making sure Requires.jl works on 0.6, and I thought it would be helpful to recap discussions I had with @jameson and @tkelman. This is in a similar vein to https://github.com/JuliaLang/julia/issues/6195 and https://github.com/JuliaLang/julia/issues/15705#issuecomment-254264419

Status quo: Requires.jl

In Requires.jl you can defer the loading of a piece of code with the @require macro.

using Requires

@require DataFrames begin
  println("DataFrames loaded")
end

println("Before using")
using DataFrames
println("After using")

In order to provide this feature (which is quite heavily used to provide some notion of optional dependencies), Requires.jl currently has to overwrite Base.require. That is frowned upon, and it requires jumping through some major hoops in order to call the original version of Base.require that does all the heavy lifting. The dynamic nature of this also makes it impossible to precompile these code blocks.

First alternative approach: Keeping it dynamic

The minimal change to Requires.jl would be to extend Base.require to notify through a callback that a module has finished loading. I experimented with this a little bit in https://github.com/JuliaLang/julia/tree/vc/loading_callbacks and https://github.com/vchuravy/Requires.jl/tree/vc/patchedbase, but this still runs into the problem that it is not precompile friendly, so if an optional dependency is later installed we won’t invalidate the cache file.
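To make the callback idea concrete, here is a minimal sketch of what such a hook could look like. It does not touch Base.require at all: `on_load`, `notify_loaded`, `LOADED` and `CALLBACKS` are hypothetical names made up for illustration, and `notify_loaded` only stands in for the point where the loader would report that a module is done.

```julia
# Hypothetical stand-in for a load hook; none of these names are Base or Requires.jl API.
const LOADED = Set{Symbol}()
const CALLBACKS = Dict{Symbol,Vector{Function}}()

# Run `f` once `name` is loaded: immediately if it already is, otherwise later.
function on_load(f::Function, name::Symbol)
    if name in LOADED
        f()
    else
        push!(get!(() -> Function[], CALLBACKS, name), f)
    end
    return nothing
end

# What the loader would call after a module has finished loading.
function notify_loaded(name::Symbol)
    push!(LOADED, name)
    for f in pop!(CALLBACKS, name, Function[])
        f()
    end
    return nothing
end

events = String[]
on_load(:DataFrames) do
    push!(events, "DataFrames loaded")
end
push!(events, "Before using")
notify_loaded(:DataFrames)   # stands in for `using DataFrames`
push!(events, "After using")
```

The deferred block runs between the two other events, which mirrors the ordering of the @require example above. Nothing here is visible to precompilation, which is exactly the limitation described.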

Second alternative approach: Static and precompile friendly

The second approach, which is a lot more involved in terms of required Base changes, is to replace Requires.jl’s dynamic delayed loading with a more straightforward check: if the module is loadable, compile the code; if not, don’t compile the code, but record the dependency and invalidate the cache if that dependency gets installed.

@require mod expr will evaluate to something akin to:

if Base.find_in_node($(String(mod)), nothing, 1) !== nothing
    return expr
else
    return quote
        register_optional_dependency(mod)
    end
end

The big question mark for this approach is register_optional_dependency. I started experimenting a little bit in that direction, but currently that code crashes and burns when precompiling, mostly because I tried to use uuid == 0 to mean optional in the cache file module list.
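For anyone who wants to play with the expansion logic without patching Base, here is a rough sketch of the static variant as an ordinary macro. It assumes a current Julia where `Base.find_package` is available (it is not the lookup function from the snippet above), and `OPTIONAL_DEPS`, `expand_require` and `@static_require` are hypothetical names. The registry is just an in-memory vector; it does not write anything into a cache file.

```julia
# Hypothetical registry; in the proposal this information would live in the cache file.
const OPTIONAL_DEPS = Symbol[]
register_optional_dependency(name::Symbol) = push!(OPTIONAL_DEPS, name)

# Decide at macro-expansion time which expression to emit.
# `isavailable` is injectable so the logic can be exercised without installing anything.
function expand_require(mod::Symbol, expr,
                        isavailable = name -> Base.find_package(name) !== nothing)
    if isavailable(String(mod))
        return expr
    else
        # Package not found: emit only the bookkeeping call.
        return :($register_optional_dependency($(QuoteNode(mod))))
    end
end

macro static_require(mod, expr)
    esc(expand_require(mod, expr))
end

# The package name below is assumed not to exist on the load path.
@static_require SurelyNotInstalledPkg begin
    f(x) = "Here!, Here!"
end
```

After this runs, `OPTIONAL_DEPS` contains `:SurelyNotInstalledPkg` and `f` is not defined. The hard part, making the precompile cache aware of that recorded name, is not covered by this sketch.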

I would be keen on hearing thoughts from others.


#2

Haven’t looked at the patches yet, but re:

invalidate the cache if that dependency gets installed

why would it need to be at install time of the dependency rather than at load time for the package that has conditional dependencies? If we store a separate class of dependency in the .ji file for conditional dependencies, and record at precompile time which of them were or were not available at that time, we can check when that same .ji file is next loaded, whether or not the available set has changed. If you install some conditional dependency, but then remove it before you next try to load the package that conditionally used it, the original cache would still be valid, right?
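A small sketch of the load-time check I have in mind, with the .ji file reduced to a plain struct. `OptionalDepRecord` and `cache_is_stale` are hypothetical names, and the set of installed packages is passed in by hand rather than queried from the package manager.

```julia
# Hypothetical record of what was seen at precompile time; the real .ji format is not modeled.
struct OptionalDepRecord
    available::Dict{String,Bool}   # name => was it available when the cache was written?
end

# The cache is stale if any optional dependency changed availability since precompilation.
function cache_is_stale(record::OptionalDepRecord, installed)
    for (name, was_available) in record.available
        (name in installed) == was_available || return true
    end
    return false
end

record = OptionalDepRecord(Dict("WeakRefStrings" => false))
installed = Set{String}()

stale_initially = cache_is_stale(record, installed)      # nothing changed
push!(installed, "WeakRefStrings")
stale_after_install = cache_is_stale(record, installed)  # dependency appeared
delete!(installed, "WeakRefStrings")
stale_after_removal = cache_is_stale(record, installed)  # back to the recorded state
```

The last case is the install-then-remove scenario: because the comparison happens only at load time, the original cache is treated as valid again.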


#3

Yes, absolutely. The cache would be invalidated once the package is loaded again.


#4

The version with the static approach is now usable at:

Please note that it is still prototype functionality and the code will need to be cleaned up. To test it, create a package that either calls jl_module_register_optional or that contains:

module TestRequires

__precompile__()

using Requires

@require WeakRefStrings begin
  f(x) = "Here!, Here!"
end

# package code goes here

end # module

What works?
  • Precompiling the package without the optional dependency installed.
  • Installing the optional dependency will invalidate the cache when the package is next loaded.

What doesn’t work?
  • Removing the optional dependency will throw:

julia> using TestRequires
ERROR: ArgumentError: Module WeakRefStrings not found in current path.
Run `Pkg.add("WeakRefStrings")` to install the WeakRefStrings package.

Instead it should probably invalidate the cache again.

Please give it a try and poke holes in the design.


#5

FWIW, I haven’t found this to be a problem in practice, and I don’t know of any package where being able to precompile these blocks would give a significant speedup; @require blocks are generally only a few lines of header-like code.

Being able to auto-import packages is a nice feature of the static approach which would be useful for optional backends. It wouldn’t be able to handle all cases that the dynamic approach can; for example, if you load a package then install and load a conditional dependency of that package without restarting. I think the only full solution is conditional sub-modules that are separately precompiled and handled specially by the package system.

I would personally favour the support for require hooks in base as a simple short-term solution, and then figure out something better. (I’m not sure why you think this would cause an issue with cache files though; the require blocks aren’t cached so there’s nothing to change if the set of available deps changes.)


#6

Plots.jl. It lazy loads every backend. That would likely be helped a lot by some kind of precompilation. It doesn’t use Requires.jl though: just straight @eval.


#7

I think what I call the static and dynamic approaches are orthogonal to each other: sometimes you will want to do something in reaction to a package being loaded, and other times you will want to use precompilation in order to define conditional modules.

Why do you think that dynamic conditional sub-modules are necessary? Purely for the user experience of not having to restart? One could probably do something akin to

if Base.find_in_node(String(mod), nothing, 1) !== nothing
    return expr
else
    return quote
        register_optional_dependency(mod)
        # dynamic code loading, in case module becomes available later
        listenmod(mod) do
            eval(expr)
        end
    end
end

to get the effect you were thinking about. Also none of these are changes that have to go into Requires.jl :slight_smile: I just used that as a starting point to understand the design space better.


#8

I don’t think the dynamic approach is necessary or better as such; just chiming with what you’ve said, which is that they are independent features with slightly different tradeoffs. Your static proposal is a good one and may well be better overall. In principle they could both co-exist, although I suspect the overlap is enough that it’s better to pick one.

I guess I should revise my preference to “whatever’s easiest”. Adding a hook to require is trivial to implement compared to having a cache dependency on a non-existent file, but then bikeshedding with Base is part of the implementation cost too :slight_smile:


#9

I haven’t looked at Plots in detail, but most of the time Julia’s late binding means that you can write the bulk of the heavy-lifting code without the library you’re using being loaded. The main exception is adding new methods, for which you obviously need the function and any types to be available. But then you can use @require just for that one definition that ties together the two packages, i.e.

function real_foo(x)
  Foo.bar(x)
end

@require Foo begin
  using Foo
  Foo.foo(x::MyType) = real_foo(x)
end
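The late-binding part can be checked in isolation. In this sketch a local module named `Foo` stands in for the optional library (an assumption for illustration; a real package would be loaded with `using`), and it is defined only after the function that refers to it.

```julia
# `Foo` is not defined yet; the reference is only resolved when real_foo is called.
real_foo(x) = Foo.bar(x)

# Stand-in for the optional library being loaded later.
module Foo
bar(x) = 2x
end

result = real_foo(21)
```

Defining `real_foo` succeeds even though `Foo` does not exist at that point; only a call made before `Foo` is available would fail.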

In Plots’ case, you could probably get almost all of the benefits of precompilation like this:

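# Julia 0.6 API assumed: readstring reads the file, parse turns the text into an expression.
# The begin/end wrapper lets parse accept a file with several top-level statements.
foo_code = parse("begin\n" * readstring("foo.jl") * "\nend")
load_foo() = eval(foo_code)
# instead of include("foo.jl") directly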
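The same parse-now, evaluate-later pattern can be tried without a file on disk. This sketch assumes a current Julia, where the parser is reached through `Meta.parse`; `backend_src`, `backend_code` and `load_backend` are hypothetical names, and the string stands in for the contents of a backend file.

```julia
# Source text standing in for the contents of a backend file such as foo.jl.
backend_src = """
backend_name() = "foo"
backend_ready() = true
"""

# Parse once, up front. Wrapping in begin/end lets Meta.parse accept several statements.
backend_code = Meta.parse("begin\n" * backend_src * "\nend")

# Evaluate the already-parsed expression only when the backend is requested.
load_backend() = Core.eval(@__MODULE__, backend_code)

defined_before = isdefined(@__MODULE__, :backend_name)
load_backend()
defined_after = isdefined(@__MODULE__, :backend_name)
```

Only parsing is moved up front here; the methods are still compiled after `load_backend()` runs, so this saves less than true precompilation of the backend would.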

#10

How do we express version requirements for optional dependencies?


#11

… like for required dependencies (?). Should this be different from your point of view?


#12

I’m not sure, but doesn’t that mean all the optional dependencies are actually required, from the package manager’s point of view?


#13

I have the feeling this was discussed before, but still…

Required dependencies enable the package to do something; without them the package cannot do anything. Optional dependencies enable the package to do some more.

A plotting package might require a package providing colormaps. If more colormap packages are installed, you can use them additionally.


#14

IIUC, the only way we have to express version requirements is the REQUIRE file. But that can only express required dependencies. Do we need to add optional dependency support to REQUIRE?


#15

Has there been any discussion of expanding the data available to the package manager? Pkg.add("Pkgname") doesn’t allow the user to provide more information about how to build the package.

For a point of comparison, the C/C++ libraries I use (particularly PETSc) have lots of configuration options available, of the form configure --with-feature-x --without-dependency-y.


#16

#17

Didn’t I read

...
[package.Required]
uuid = "85241492-0f92-400a-8719-bdc0424991f7"
versions = ["1.2-1.3", "!1.2.5"]

[package.Optional]
...

in the Pkg3 description? Adding this right now to REQUIRE is too much of a workaround (hack?).


#18

Would need a way of expressing “if installed, must satisfy these version constraints.” If the version constraints aren’t satisfiable, would the alternative be disallowing the optional package from being installed at all?

Or we could include the version preconditions in the “if installed [and appropriate version], enable this block of code” check.
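As a sketch of that second option, the check could look roughly like this. `optional_enabled` is a hypothetical helper, the installed versions are supplied as a plain Dict rather than read from the package manager, and the half-open range `[lower, upper)` is just one possible way to state the constraint.

```julia
# Hypothetical check: enable the optional block only if the package is installed
# AND its version falls inside [lower, upper).
function optional_enabled(installed::Dict{String,VersionNumber}, name::String,
                          lower::VersionNumber, upper::VersionNumber)
    haskey(installed, name) || return false
    v = installed[name]
    return lower <= v < upper
end

# Made-up installed set for illustration.
installed = Dict("WeakRefStrings" => v"0.4.1")

in_range  = optional_enabled(installed, "WeakRefStrings", v"0.4.0", v"0.5.0")
too_old   = optional_enabled(installed, "WeakRefStrings", v"0.5.0", v"0.6.0")
not_there = optional_enabled(installed, "DataFrames", v"0.1.0", v"1.0.0")
```

With this shape an unsatisfied constraint simply leaves the block disabled, so the optional package can still be installed for other uses.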


#19

I’d vote for the second.


#20

I think this is a very important point.

Wouldn’t Plots be much more naturally organized as follows?

using Plots #this only has abstract stubs 
using PlotsPyPlot #Pulls in the implementation of Plots
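Roughly what I mean, with two local modules standing in for the two packages. `PlotsCore`, `PlotsFakeBackend`, `AbstractBackend`, `FakeBackend` and `render` are all made-up names for illustration, not anything Plots actually defines.

```julia
# Stand-in for the stub package: only abstract types and fallback methods.
module PlotsCore
abstract type AbstractBackend end
render(b::AbstractBackend, data) = error("no implementation for $(typeof(b))")
end

# Stand-in for a backend package: depends on the stubs and adds the real methods.
module PlotsFakeBackend
import ..PlotsCore
struct FakeBackend <: PlotsCore.AbstractBackend end
PlotsCore.render(::FakeBackend, data) = "rendered $(length(data)) points"
end

output = PlotsCore.render(PlotsFakeBackend.FakeBackend(), [1, 2, 3])
```

In this layout the backend is a normal required dependency of the backend package, so each piece can be precompiled on its own and nothing needs to be conditional.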