What interface bugs have you ran into in the ecosystem?

I’m currently working on the formal design of a statically-checked trait system for Julia building off of our work on subtyping and static typing for Julia. The system as a whole is sadly some ways out yet because of a number of fundamental challenges. This problem isn’t unique to Julia at all, and interestingly is particularly bad in C++.

Before I get there, however, I need some examples of bugs or problems in your experience that have arisen because of a lack of checked interfaces. I have a few “classic” examples, but these issues tend to be ephemeral and momentarily annoying when you try and fail to compose libraries. It’s hard to discover them with static analysis (or else we wouldn’t be here!) and the known ones tend to be buried in Git somewhere.

My go-to example is the 1:length antipattern as discussed here. It’s pretty common to use length on AbstractArrays even though it’s not valid in many cases - and this can go unnoticed until you pass something like a StridedArray where you can’t just go 1:length to index through it. This would be a faulty use case, a case where you’re trying to use a method that doesn’t exist for an argument vector.

I’m also interested in implementation-side bugs, where you should have implemented some method but forgot about it. Back in the 0.6.0 days, a few of the Core.Compiler implementations of AbstractArray had this by lacking methods called for in the AbstractArray documentation (iirc, size). This has been fixed by now, but I’d be very interested in other examples.

Have you ran into bugs caused by insufficient interface specification, incomplete implementation, or incorrect usage?

13 Likes

Do you also look for examples where e.g. Holy Traits are insufficient/not desirable for a hookable-by-users interface? I often run into that problem when using abstract types to delineate similar-but-subtly-distinct behaviors under a common abstract supertype, which would be much more cleanly solved with multiple abstract subtyping (I’m fairly certain you already know about this, but I’ve written about that problem here).

So to me, it’s very important that these interface specifications (whatever form they may take) compose well, in the sense that I can say “my struct Foo should dispatch both like Bar and like Baz, and I’m fine with having to clean up any ambiguities that arise because of that” (assuming you want your interface specs be dispatchable, of course :smiley: ).

I don’t have a list, but the number of things that don’t forward missing has got to be humongous. Adding the “correct” way to support that is always a sticking point whenever I make a PR to make some non-missing thing easier to work with (case in point, [1] and [2]).

I personally don’t use missing, so having that hold back (to me) useful PRs is very annoying.

On the upside, I have some (for now, private) work that could make it much easier to find these kinds of mismatches between what is expected & what is required, provided a human can code up a “this is what I think the method expects” version of some docstring.

4 Likes

I’m not exactly sure what kind of issues you’re looking for but the dates and recently created correctness bug labels on github might help.

Also aliasing issues like sum!, prod!, any!, and all! may silently return incorrect results · Issue #39385 · JuliaLang/julia · GitHub and

various edge cases like `findprev`/`findlast` with empty search string · Issue #39940 · JuliaLang/julia · GitHub

1 Like

I’m not exactly sure what kind of issues you’re looking for but the dates and recently created correctness bug labels on github might help.

Good pointers - though unfortunately there’s not a lot of examples there. Most of these interface bugs occur when you’re composing codebases that are not used heavily individually or when combined. Another archetypical example is the math ecosystem, which relies heavily on “plus should work like plus” with no consensus on what that means exactly, as illustrated quite a bit by ForwardDiff.

various edge cases like findprev/findlast with empty search string · Issue #39940 · JuliaLang/julia · GitHub

Unfortunately (for me), most of these are not interface bugs per se; they are problems, but not ones caused by a misused, under-specified, or unimplemented interface. You can’t (in most normal languages) write a type for a nonempty string, for example so consequently it’s hard to write an interface specifically for said strings.

Also aliasing issues like sum!, prod!, any!, and all! may silently return incorrect results · Issue #39385 · JuliaLang/julia · GitHub and

This is an interesting one; catching it at the level of a type/trait system would require a notion of ownership a la Rust. You could enforce something similar in Julia by very careful use of isbits types but it’s tricky and would be a very restrictive API.

1 Like

In other contexts an interface can be a more expansive concept, but it seems like your use of the term “interface” means something like “type signature, including type classes but not dependent types” - is that right?

1 Like

Do you also look for examples where e.g. Holy Traits are insufficient/not desirable for a hookable-by-users interface? I often run into that problem when using abstract types to delineate similar-but-subtly-distinct behaviors under a common abstract supertype, which would be much more cleanly solved with multiple abstract subtyping (I’m fairly certain you already know about this, but I’ve written about that problem here).

I have! Your package is very nice, tangentially. The main challenge here is that it’s hard to find design limits post facto, where a library implementer made a choice because they couldn’t do what they really wanted to do. It’s super annoying and super apparent in the moment but tricky to reverse engineer (particularly automatically.

Most of what I can point to specifically are examples where the usual notions of single interface implementation are insufficient because I can mechanically pick out counterexamples

So to me, it’s very important that these interface specifications (whatever form they may take) compose well, in the sense that I can say “my struct Foo should dispatch both like Bar and like Baz, and I’m fine with having to clean up any ambiguities that arise because of that” (assuming you want your interface specs be dispatchable, of course :smiley: ).

Yes, absolutely. There’s two critical bits to a trait system in my view:

  • Conformance/implementation-side checking, where we ensure that any Foo that is a Bar and Baz dispatch exactly as Bar and Baz require. We can then leave behind breadcrumbs of conformance success to use for dispatch.
  • Dispatch checking, where we now need to resolve dispatch given the concrete implementation, the declared type signatures, and the breadcrumbs.

Doing both well is, in my estimation, very tricky because of how rich Julia’s type grammar is. You need to be able to deal with tuples, unions, union-alls, and invariant parametric types, the latter two of which pose a whole litany of problems. It’s sort of okay if you only worry about non-parametric traits that appear in covariant position, but spirals downhill rapidly from there.

It’s tractable, most likely, but is a substantial effort.

I don’t have a list, but the number of things that don’t forward missing has got to be humongous

I love that example - that’s perfect. Thank you!

On the upside, I have some (for now, private) work that could make it much easier to find these kinds of mismatches between what is expected & what is required, provided a human can code up a “this is what I think the method expects” version of some docstring.

Do you have some examples of how this looks? It sound quite neat.

1 Like

Two examples I mentioned in a previous post:

  • Need for explicit interface specifications: I could not figure out how to implement the IO interface to solve a simple problem (counting bytes written to a stream).

  • Need for dispatch on interfaces: I wanted to implement a fallback last method for iterators. The interface is documented, but I can’t define last(::Iterator). I could not find a backward-compatible solution that dispatches without runtime penalty so I gave up .

Another classical case I think:

  • The redirect_stdio documentation says

    redirect_stdout([stream]) -> stream

    Create a pipe to which all C and Julia level stdout output will be redirected. Return a stream representing the pipe ends. Data written to
    stdout may now be read from the rd end of the pipe.

    Note: stream must be a compatible objects, such as an IOStream, TTY, Pipe, socket, or devnull.

    What does “a compatible object” mean? Impossible to say since there is no definition of a “stream interface”!

    This has caused problems when the user wants to capture the standard output to a buffer. The IOBuffer documentation says:

    Create an in-memory I/O stream […]

    So you’d think IOBuffer works with redirect_stdout, but it doesn’t. It’s been reported since 2015 but it’s still an open issue.

    Probably if there was an explicit specification of the stream interface, this issue would have been easier to fix.

3 Likes

Right; I apologize, I should have been clearer in the original question.

What I mean by interface here is essentially the notion that’s talked about here; a set of methods that should be usable on or with some value or family of values that implementers should provide and users can use.

To “type signature, including type classes but not dependent types,” I think that’s an interesting question in and of itself. When I think of an interface I’m usually thinking about it in a “Julian” context where I’m allowed to do stuff to instances of some type and ideally if I do something stupid to that type that depends on the value rather than the type then I should get a sensible error message back. However, that’s entirely my interpretation: beyond that documentation there’s no reified concept of what an interface is in the ecosystem so in a sense there’s no wrong answer. There’s no reason why you couldn’t implement a sort of dynamically-checked dependent typing in Julia.

2 Likes

One thing about type classes that doesn’t get as much airtime but is still very important imho is that they’re supposed to have properties (“laws”), like

fmap id = id
fmap (g . h) = (fmap g) . (fmap h)

which I’d want to specify and ideally computer-verify in an ideal trait system. These “kinda acts like +” interfaces aren’t as helpful as they could be if we had a way of writing down what properties are actually expected.

For instance I think it’s currently unclear whether map(f, xs::T{A}) is supposed to return T{B} or if any Container{B} is allowed.

1 Like

StarWarsArrays.jl violates the AbstractArray interface because it does not follow the Bauman rule.

6 Likes

Yes, RequiredInterfaces.jl is more about designing with the API surface in mind ahead of development, than about fitting something after the fact (it can be used for that too, but it’s more difficult).

Don’t get too excited - it’s more in the direction of fuzzing than hard, static inference.

I don’t really understand this one for a few reasons:

  • Dispatching a method on interfaces as if they were types (the supertype of all iterators being a not very helpful Any) will cause brittle ambiguity issues like multiple supertypes would. Multiple Holy traits evade such issues by dedicating a trait function to check each trait set, and you can only have 1 trait per set. The trait function call follows type-based dispatch, so traits are not independent of types or dispatch limitations.
  • Assuming you’re not talking about the iteration-reliant last(x, n) methods, last(x) depends on lastindex, which is part of the indexing interface, not the iteration interface. Those often come together but you can have either alone.
  • How would a fallback last even work for iterators or indexables? Neither necessarily have a last element and could need to error upon a last(x) call, which will just work if lastindex isn’t implemented for the concrete type. The only change I could see interface/trait dispatch make is turning a MethodError for lastindex into a MethodError for last.
  • Upon rereading, it is clearer that you intended to implement a last method for iterables in addition to the method for indexables, and neither needs to be always successful. Unfortunately, dispatching on multiple orthogonal interfaces would have the same ambiguity issues as multiple supertypes, for example does an iterable and indexable input dispatch the call to the iterable method or the indexable method? The approach that comes to mind is to modify the existing method to check for the quicker indexing sense of last element then the slower iteration sense of last element (which has the HasLength and HasShape{N} traits of the IteratorSize trait set). I think this could be done by checking hasmethod on lastindex+getindex then length+iterate, and if Tricks.jl is right, this can occur at compile-time and be invalidated by method changes in v1.10. I have not verified this behavior or applicable’s, nor do I know exactly what builtins, intrinsics, and tfuncs are.

What do you mean, few (lasting) problems? I suppose that’s a good thing, if you need to ask people for trouble!

You mean a static-checker, similar to a linter, or Aqua.jl? It’s intiguing C++ is “particularly bad” for this despite a static language! Can you take examples of those classic ones, and for C++, and d you know if better in other languages than C++, e.g. D or Rust or Zig? 1:length, and at least C++ might have similar (o-based) issue, for strided arrays (one angle I didn’t think of), eachindex is also there the solution. I’m not sure but it might though not work in all cases, e.g. for lazy arrays/iterators?

I agree there’s great potential for ambiguities, but exactly what such issues would be depend on the actual design. In the last example, the Indexable interface could be defined as “including” the Iterable interface. The index-based method would then be treated as more specific which would resolve the ambiguity.

You can’t impose such a rule over a specific case if it’s not true in general. Indexing and iteration are orthogonal interfaces; their method sets do not subset each other. By contrast, you could say that the AbstractArray interface contains Indexable and contains Iterable.

Another problem is that a named interface often implies a rigid set of methods was implemented. However, as much as we want an easier, more formal way to determine what methods we need to make some existing method work, we don’t usually need to implement 1 particular set of methods. If a method only needs to read values from an AbstractArray, we only need size and getindex for it; immutable arrays (and other indexables) would and should never apply to a method that needs setindex!. Obviously we shouldn’t name an arbitrary method subset of an interface as another interface, that would get extremely complicated. We certainly don’t name a large variety of Union subtypes for dispatch; in practice we’d dispatch on (and alias) a few important ones at most. Type dispatch does have this ambiguity that methods that need mutable arrays and methods that accept immutable arrays both dispatch on the same AbstractArray; this ambiguity would be bad to replicate in interfaces that are all about what methods are supported.

Ordered checks in the same method was supposed to get around these issues: 1) you check if you can get a last index because that’s often faster, then you check if it is a finite iterable, and you want these checks at compile-time, 2) you only check the methods you need, not the entire interface. Putting compile-time checks (at the mercy of the compiler) early in method bodies isn’t regarded as satisfactory, but this design shortcoming is genuinely difficult to resolve. Personally I would like a reflection tool that takes a method and tells me the minimal method set I need to implemented per argument given some type constraints.

It’s hard to say for sure with the wishy-washy interfaces we have currently. But since indexing requires getindex, firstindex and lastindex I think conceptually at least it includes iteration. It’s definitely not orthogonal. And indeed AbstractArray provides a default iterate based on these methods. My guess is that if we did have dispatch on interfaces, the same default iterate would be defined for all indexables.

Now for the sake of argument, let’s imagine we really want an “indexable” concept that is not “iterable”. In that case we could define last for Iterable but not for Indexable since the semantics of last are related to iteration.

So I think to discuss the issues of ambiguity we need a better example than last.

Or maybe it’s exactly what we should do? (except for the “arbitrary” part.) It’s done in Go and works very well, but how well it works probably depends on the whole design. For example in Go the io package defines the following interfaces (among others):

type Reader interface {
	Read(p []byte) (n int, err error)
}

type Writer interface {
	Write(p []byte) (n int, err error)
}

type Closer interface {
	Close() error
}

type ReadWriter interface {
	Reader
	Writer
}

type WriteCloser interface {
	Writer
	Closer
}

type ReadWriteCloser interface {
	Reader
	Writer
	Closer
}

These are defined mostly for convenience and so that other people’s code looks familiar. For example they could have stopped at Reader and Writer, then I could still accept a “ReaderWriter” as function parameter by defining the combination myself, with a name or even without a name: func f(x interface { Reader; Writer }) { function body... }.

Maybe the same design wouldn’t work well for Julia, but whatever we do it should probably include different interfaces for mutable and immutable arrays.

Ouch! We already have imperative code organization with include that people have tried (unsuccessfully so far) to replace with a declarative solution. Let’s do our best to make declarative dispatch as powerful as possible, and keep “imperative dispatch based on reflection” as a workaround of last resort.

This is false. You can implement methods in the indexable interface without implementing any in the iterable interface, and vice versa. An indexable can also lack firstindex and lastindex because it has no semantic beginning and end, e.g. LinearAlgebra.I. Obviously without a start, I is not iterable. Immutable indexables like I also lack setindex!. The AbstractArray interface specifically includes both indexing and iteration; that doesn’t imply indexing and iteration are intertwined in general.

This is also somewhat false, there is a fallback last(coll) method based on lastindex and a fallback last(itr, n::Integer) method based on iteration. More specific methods of either arity can be defined however you like. For example, the non-indexable and infinite iterator from Iterators.cycle implements last(it::Cycle) to fall back to last of the wrapped instance.

An aside, last(itr, n::Integer) depends on Iterators.reverse, which isn’t in the leading table of methods but is mentioned at the very end of the iteration interface docs. That’ll be the faster way to do last for iterables than iterating from the start, and evidently from Cycle does not require a finite length.

Go eschews method overloading and multimethods to allow its type system and dispatching to even work. This is a fundamental design tradeoff.

I have no earthly clue what this could mean. The reflection I mentioned has nothing to do with how I would dispatch my methods, it would, for example, check last(::MyType) and tell me to implement lastindex(::MyType), or check last(::MyType(), ::Int) and tell me to implement Iterators.reverse(::MyType). Ideally it’d save me the trouble of looking up interface documentation and source code.

Note that I said “conceptually”. I’m mostly interested in the conceptual part since we don’t have a concrete design for interfaces yet. If we define formal Indexable and Iterable they don’t have to be a perfect match for the current informal interfaces (these will have to stay anyway until 2.0). But even restricting ourselves to the current informal interfaces I think indexable implies interable, see below:

The manual says that getindex, setindex!, firstindex and lastindex must be implemented. That’s all you need for iteration?

I’m just saying the semantics of last are related to iteration. Same for last(itr, n::Integer), it’s even in the docstring: “Get the last n elements of the iterable collection itr, or fewer elements if itr is not long enough.”

People should respect the function semantics when defining new methods. I think the Cycle case is a bug since it doesn’t respect the semantics of last (reported here, we’ll see what others think).

I gave this Go example to show that having many interfaces with bigger ones including smaller ones is a design that can work well, in the sense that the result is pleasant to work with. Maybe it would also work well in Julia? At least it’s not clear to me that multiple dispatch would be the crucial difference that makes it fail in Julia while it works in Go.

Sorry, I read that last sentence too quickly. I still had the fist part of your paragraph in mind. FWIW by “imperative dispatch based on reflection” I meant imperative code (with if, else, etc.) to handle dispatch, using reflection to make decisions. I think that’s a good characterization of the “ordered checks” you mention. Anyway I agee, I’d like that reflection tool too!

Can expand on this please? IMO the interfaces in Base are reasonably well-specified. In your own packages, you can be as precise as you like.