Did the Julia community do something to improve its correctness?

For reference in this thread, the issue was that the QMC sampling methods for global sensitivity analysis proposed by Saltelli (https://www.sciencedirect.com/science/article/abs/pii/S0010465509003087) turn out not to be necessarily convergent ways of sampling a QMC process. We had originally implemented this based on other sensitivity analysis packages, such as R’s sensitivity (https://cran.r-project.org/web/packages/sensitivity/sensitivity.pdf) and Python’s SALib (see SALib's documentation), which reference this same paper as effectively the modern way to do Sobol sensitivity analysis. However, community members correctly pointed out this convergence issue, with a nice analysis in Pull Request #79 of SciML/QuasiMonteCarlo.jl (“update Documentation, readme and docstrings” by dmetivie), so we have since corrected the QuasiMonteCarlo.jl package to not use Saltelli’s method and updated GlobalSensitivity.jl around this as well (thank you for your contributions). This is our example of the Julia community not caring about improving correctness.

That’s a different issue (and one David Metivier, not me, gets huge props for fully understanding and fixing). I’m talking about how up until the fall of 2022, several functions in QMC.jl were just flat-out wrong, returning incorrect results. Moreover, the package did not include even a single test of correctness until then, relying exclusively on clearly-insufficient tests like verifying that the returned values were a Matrix{Float64} with correct dimensions. The lattice samplers were nondeterministic, and several functions returned out-of-bounds values.
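
To make the gap between a shape test and a correctness test concrete, here is a minimal sketch in plain Julia. It assumes a hand-rolled Halton sequence: the `van_der_corput` and `halton` helpers are written for this illustration and are not QuasiMonteCarlo.jl's API, and the integration tolerance is an arbitrary choice for the example.

```julia
using Test

# Radical inverse of `i` in the given base (van der Corput sequence).
function van_der_corput(i::Integer, base::Integer)
    x = 0.0
    f = 1.0 / base
    while i > 0
        x += f * (i % base)
        i ÷= base
        f /= base
    end
    return x
end

# d × n matrix of Halton points; this sketch supports at most 4 dimensions.
function halton(d::Integer, n::Integer)
    bases = (2, 3, 5, 7)
    return [van_der_corput(i, bases[k]) for k in 1:d, i in 1:n]
end

X = halton(2, 1024)

# The kind of test that passes even when the sampler is wrong.
@testset "shape only (insufficient)" begin
    @test X isa Matrix{Float64}
    @test size(X) == (2, 1024)
end

# Tests that can actually catch a broken sampler.
@testset "correctness" begin
    @test all(0 .<= X .< 1)            # every point stays inside the unit square
    @test halton(2, 1024) == X         # the sampler is deterministic
    # The integral of x*y over the unit square is exactly 1/4.
    estimate = sum(X[1, :] .* X[2, :]) / size(X, 2)
    @test abs(estimate - 0.25) < 0.02  # tolerance is an assumption for this sketch
end
```

A sampler that returned out-of-bounds or nondeterministic values would pass the first test set and fail the second.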

I don’t bring this up because it’s representative of the Julia community; it was an utter nightmare of a package (put together in a rushed GSOC project with insufficient supervision). I have never encountered a package that bad since, except maybe a few dusty corners of Distributions.jl written back in 2014. But the fact that it happened at all in a SciML package tells me we need to raise our standards.

That’s good to know! I’d love to learn more about the approaches being tested, do you have some links on hand?

Well, here’s one:

It seems the plan is for it to be an explicitly defined and tested interface, using Interfaces.jl.

Ah, I see - Interfaces.jl tries to be accommodating by allowing optionality in an interface; I don’t think that’s a good idea, because it makes requiring that “optional” part of an interface really difficult from the perspective of a consumer of that interface. That’s why I don’t have/allow that in RequiredInterfaces.jl. This is part of the issue when it comes to the informal interfaces of Base.

> because it makes requiring that “optional” part of an interface really difficult from the perspective of a consumer of that interface

It doesn’t. Interfaces.jl provides compile-time traits for all the individual options, consumers can just check what parts are implemented. That’s one of the central design goals of the package.

Most reasonably sized interfaces have optional components, there is no way around designing for it.

My point is that no interface should ever have the notion of “optionality”; as soon as you’re talking about that, you have a larger interface that ought to have its own name, as well as being able to restrict dispatch on its own. Put another informal way, a smaller interface (i.e., without the optional parts) is a subtype of the larger interface (with the optional parts). Having optionality be expressed as a proper subtyping relationship means you can suddenly make use of a large body of type theory for static analysis, and it can (theoretically, at least) slot right into the existing type system. That’s much more difficult to do if it’s all part of one “interface type with optional parts”, because that more or less requires whole-program analysis to decide which parts of the interface are actually used (avoiding that, to me at least, is part of the problem this topic set out to discuss).

I’d say most reasonably sized interfaces have some “layers” of guarantees & requirements; but that’s not the same as saying they have optionality. It’s perfectly feasible (and, in my opinion, preferable) to split such an “optional interface” up into individual parts that can be combined on their own.

Optionality is typically best expressed in terms of a derived sub-interface. I.e., you could imagine that we have ArrayInterface, and then you could have MutableArrayInterface <: ArrayInterface, and ResizeableArrayInterface <: MutableArrayInterface, etc.

If something satisfies ResizeableArrayInterface then it automatically has to satisfy ArrayInterface, but the converse is not true.
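
As a rough sketch of that idea in plain Julia (not the API of any existing package): the interface names come from the post above, while `declared_interface` and `satisfies` are hypothetical helpers invented for this illustration, and the choice of which Base types declare which interface is only for demonstration.

```julia
# Interface hierarchy expressed as abstract types.
abstract type ArrayInterface end
abstract type MutableArrayInterface <: ArrayInterface end
abstract type ResizeableArrayInterface <: MutableArrayInterface end

# Hypothetical trait function: each type declares the most specific
# interface it claims to implement.
declared_interface(::Type{<:Tuple}) = ArrayInterface
declared_interface(::Type{<:Matrix}) = MutableArrayInterface
declared_interface(::Type{<:Vector}) = ResizeableArrayInterface

# A type satisfies `I` if its declared interface is `I` or a sub-interface of `I`.
satisfies(::Type{T}, ::Type{I}) where {T, I<:ArrayInterface} =
    declared_interface(T) <: I

satisfies(Vector{Int}, ArrayInterface)               # true
satisfies(Matrix{Float64}, MutableArrayInterface)    # true
satisfies(Tuple{Int,Int}, ResizeableArrayInterface)  # false
```

Because the check is an ordinary `<:` query, satisfying the resizeable interface implies satisfying every interface above it, with no extra bookkeeping.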

Typeclasses are the typical way things like this are formalized and expressed in other languages.

Without multiple inheritance this ideal of layered interface is harder to achieve. That’s why I wanna go with Interfaces.jl for the Graphs.jl formalization.
Here are some examples of overlapping but not nested graph families:

  • weighted vs unweighted
  • directed vs undirected
  • mutable vs immutable
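
One common way to handle such overlapping families without multiple inheritance is trait-based dispatch, with one trait per axis. The following is only a sketch under that assumption: `RoadNetwork`, `FriendGraph`, `stored_edge`, and `stored_weight` are hypothetical names made up for this example and are not part of Graphs.jl or Interfaces.jl.

```julia
# Two independent trait axes; a graph type picks one value on each axis.
abstract type Directedness end
struct Directed <: Directedness end
struct Undirected <: Directedness end

abstract type Weightedness end
struct Weighted <: Weightedness end
struct Unweighted <: Weightedness end

# Hypothetical graph types for illustration.
struct RoadNetwork
    weights::Dict{Tuple{Int,Int},Float64}  # (src, dst) => weight
end
struct FriendGraph
    edges::Set{Tuple{Int,Int}}
end

# Each type opts into one value per axis.
directedness(::Type{RoadNetwork}) = Directed()
weightedness(::Type{RoadNetwork}) = Weighted()
directedness(::Type{FriendGraph}) = Undirected()
weightedness(::Type{FriendGraph}) = Unweighted()

# Low-level functions each graph type provides.
stored_edge(g::RoadNetwork, u, v) = haskey(g.weights, (u, v))
stored_weight(g::RoadNetwork, u, v) = g.weights[(u, v)]
stored_edge(g::FriendGraph, u, v) = (u, v) in g.edges

# Generic code dispatches on each axis separately.
has_edge(g, u, v) = has_edge(directedness(typeof(g)), g, u, v)
has_edge(::Directed, g, u, v) = stored_edge(g, u, v)
has_edge(::Undirected, g, u, v) = stored_edge(g, u, v) || stored_edge(g, v, u)

edge_weight(g, u, v) = edge_weight(weightedness(typeof(g)), g, u, v)
edge_weight(::Weighted, g, u, v) = stored_weight(g, u, v)
edge_weight(::Unweighted, g, u, v) = 1.0

roads = RoadNetwork(Dict((1, 2) => 3.5))
friends = FriendGraph(Set([(1, 2)]))

has_edge(roads, 2, 1)       # false: direction matters
has_edge(friends, 2, 1)     # true: undirected lookup checks both orders
edge_weight(roads, 1, 2)    # 3.5
edge_weight(friends, 1, 2)  # 1.0
```

The two axes never need a common supertype, so any combination of them can be declared on a single graph type.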

Inheritance like that would be nice in some ways. The reason Interfaces.jl works like it does is that, in practice, interfaces in Julia are heterogeneous and patchy in a pretty fine-grained way.

See: Document AbstractSet interface · Issue #34677 · JuliaLang/julia · GitHub

You’re recommending defining a lot of interfaces to say something is a Set

The question is how does a new Set type opt into the parts that it implements in a clean, simple way. I’m not at all sure Interfaces.jl has the best way, but you can do that with one line of code.

You don’t need to use actual type inheritance for this relation though, that was just to illustrate the point. The sub-interfacing can be expressed in terms of traits quite easily, and give a very useful way of expressing add-ons to the interface.

If your graph interface treats “weighted vs unweighted”, “directed vs undirected”, and “mutable vs immutable” all using the same level of optionality that Interfaces.jl provides, that sounds like a very un-expressive interface solution that’d ultimately be kinda frustrating to use.

I should also mention that inherited sub-interfaces are really the only way for a third party to express that they made something that’s related to, but a restricted subset of, the interface that you’re providing.

> Sub-interfaces will often inherit from multiple optional parts of an interface.

That’s just a further reason why you’d want to express sub-interfacing via traits.

E.g. something like (don’t take this code too seriously, it’s just illustrative)

@interface GraphInterface

@subinterface WeightedGraphInterface <: GraphInterface
@subinterface UnweightedGraphInterface <: GraphInterface

@subinterface DirectedGraphInterface <: GraphInterface
@subinterface UndirectedGraphInterface <: GraphInterface

@subinterface MutableGraphInterface <: GraphInterface
@subinterface ImmutableGraphInterface <: GraphInterface

and then some person would write

struct MyGraphType
    [...]
end
@satisfies MyGraphType (MutableGraphInterface, DirectedGraphInterface)
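
For readers who want something they can paste into a REPL, here is one way the illustrative macros above could be approximated with plain definitions. This is a sketch, not how Interfaces.jl or any other package does it: `interfaces` and `satisfies` are hypothetical helpers, and the `adjacency` field stands in for the elided struct body.

```julia
# Interfaces as abstract types; the macros are replaced by plain definitions.
abstract type GraphInterface end
abstract type DirectedGraphInterface <: GraphInterface end
abstract type UndirectedGraphInterface <: GraphInterface end
abstract type MutableGraphInterface <: GraphInterface end
abstract type ImmutableGraphInterface <: GraphInterface end

# Hypothetical registry function: by default a type declares no interfaces.
interfaces(::Type) = ()

struct MyGraphType
    adjacency::Dict{Int,Vector{Int}}  # placeholder field for this sketch
end

# Plain-function counterpart of the `@satisfies` line.
interfaces(::Type{MyGraphType}) = (MutableGraphInterface, DirectedGraphInterface)

# True if any declared interface is `I` or a sub-interface of `I`.
satisfies(::Type{T}, ::Type{I}) where {T, I<:GraphInterface} =
    any(J -> J <: I, interfaces(T))

satisfies(MyGraphType, GraphInterface)           # true
satisfies(MyGraphType, DirectedGraphInterface)   # true
satisfies(MyGraphType, ImmutableGraphInterface)  # false
satisfies(Int, GraphInterface)                   # false
```

Declaring several sub-interfaces at once is what lets a single type sit in more than one family without any of the families inheriting from each other.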

But I think that’s the misunderstanding: Interfaces.jl does provide traits, including for subinterfaces with arbitrary combinations of optional components. It is just a wrapper to define and test these traits in a more user-friendly way.

This is virtually equivalent to what Interfaces.jl allows. I’ll let @Raf do the translation into the package syntax, but it looks almost as nice.

Ah good, I just didn’t see anything about subinterfaces in the documentation, and that, together with the fact that it supported optional stuff, made it seem like it wouldn’t support this.

I’m not really sure why you’d have optional methods if you support arbitrary sub-interfacing.

It doesn’t support sub-interfacing exactly like that. The optional components aren’t inheritable, which is something I’ve been thinking about the last few weeks.

I think it can be done fairly easily by swapping out the Symbols currently used for abstract types like you define, that inherit from the parent interface.

The macro could write all the boilerplate so it stays a one-liner like it is currently. It used Symbol instead of abstract types initially just to keep things as simple as possible.

I think things like that would be especially important if an ecosystem grew up around one of these things with an interface. i.e., if we look at Number or AbstractArray, there’s a lot of packages out there that define subtypes of these that do special things, but there’s also packages out there that define abstract subtypes of them that other packages are then expected to further subtype.

These subtypes need to express different levels of sub-functionality. I.e., it’d be kinda bad IMO if someone had to submit a PR to change the interface for AbstractArray in Base to add an optional flag for expressing some niche concept that their ecosystem is concerned with, like say geospatial data or something.

Instead, they’d ideally be able to just create a subinterface or whatever in their package that inherits all the requirements of AbstractArray but also has further requirements for their use-case.
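
A small sketch of what that could look like, assuming interfaces are abstract types whose requirements are inherited by walking `supertype`. Everything here is hypothetical: `ArrayInterface` is a stand-in for an upstream interface, and `GeoArrayInterface`, `crs`, `own_requirements`, `requirements`, `implements`, and `GeoGrid` are names invented for the example.

```julia
# Stand-in for an interface owned by an upstream package.
abstract type ArrayInterface end
# Defined in a downstream package; upstream does not change.
abstract type GeoArrayInterface <: ArrayInterface end

function crs end  # downstream generic function with no methods yet

# Each interface lists only what it adds: function => extra argument types.
own_requirements(::Type{ArrayInterface}) = (size => (), getindex => (Int, Int))
own_requirements(::Type{GeoArrayInterface}) = (crs => (),)

# Requirements are inherited from every parent interface.
requirements(::Type{Any}) = ()
requirements(I::Type) = (requirements(supertype(I))..., own_requirements(I)...)

# A type implements `I` if a method exists for every inherited requirement.
implements(T::Type, I::Type) =
    all(((f, extra),) -> hasmethod(f, Tuple{T, extra...}), requirements(I))

struct GeoGrid
    data::Matrix{Float64}
    crs::String
end
Base.size(g::GeoGrid) = size(g.data)
Base.getindex(g::GeoGrid, i::Int, j::Int) = g.data[i, j]
crs(g::GeoGrid) = g.crs

implements(GeoGrid, GeoArrayInterface)          # true
implements(Matrix{Float64}, ArrayInterface)     # true
implements(Matrix{Float64}, GeoArrayInterface)  # false: no `crs` method
```

Note that `hasmethod` only checks that a matching method exists, not that it behaves correctly, so a real interface package would pair this with behavioural tests.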

That’s a very good point for modular, inheritable interfaces.