Structural subtyping

marvinvanaalst · July 9, 2021, 1:53pm

Hey guys,

I just ran into a problem where I had a runtime issue where some code tried to iterate over a non-iterable type and then I ran into the definition in the standard library.

struct Enumerate{I}
    itr::I
end

enumerate(iter) = Enumerate(iter)

function iterate(e::Enumerate, state=(1,))
    i, rest = state[1], tail(state)
    n = iterate(e.itr, rest...)
    n === nothing && return n
    (i, n[1]), (i+1, n[2])
end

If I understand that correctly Enumerate is defined for every type (even though the name I implies that the type should be iterable), which would be mathematically incorrect.

It only fails at runtime when the iterate function isn’t defined for that type. I know that in Rust iterate is only defined for types that implement the iterable trait and even in python + mypy you can check if a class has defined __iter__, otherwise you will get an error even before compile time. So why is Enumerate defined in such a way and how can I define a method to only accept types that implement certain methods?

Thanks and cheers!

findmyway · July 9, 2021, 3:33pm

You’re correct here.

Yes

I guess this is quite common in dynamic languages?

I’d recommend you read this blog, especially part 3 and part 4:

https://www.oxinabox.net/2018/10/03/Dispatch,-Traits-and-Metaprogramming-Over-Reflection.html

marvinvanaalst · July 10, 2021, 6:22am

Hey, thanks for the quick reply!

Having incorrectly typed functions in the standard library does seem like a major issue to me though, especially when it comes to people who may want to proof the correctness of their programs.

I checked your link, but most of the given solutions just seem like hacks for a problem that should usually be rather straightforward. Structural subtyping for me is about automatically checking whether functions exist for a given type, so the only solution that really fits that description is the hasmethod example. But if that this only works at runtime it seems a major drawback for tooling, as I would like to do this statically.

The solution

scalarness(::Type{<:AbstractArray}) = NonScalar()

for example to me is just monkey-patching a nominal type system to give me the behaviour of a structural type system, and then only works for the types I’ve actually defined this for, so it might give wrong behaviour for a lot of other types. Sure, its easy to extend it if the user knows on which function to dispatch, but that doesn’t change the fact that it just shouldn’t be necessary for the user to write any extra code for this in the first place. Also the example doesn’t differentiate on methods of a type (quacks like a duck), but just on whether a type is array-like (looks like a duck), so it doesn’t fit the usual definition that a trait is a set of methods.

I wonder if the strategy works at all for more complex traits, if for example you need to check for types which need to have 5+ methods defined. Then you would either need to manually check yourself if all of them are defined and only then include your type in the dispatch, and at that stage we have come a very far way from where we wanted to go, or live with the runtime cost of the hasmethod check and runtime dispatch, which depending on the task might be problematic.

Cheers and thanks again for the fast reply!

findmyway · July 10, 2021, 7:24am

I remember there used to be an issue discussing this. But I can’t find it right now. Maybe someone else could answer it.

(While searching for the related issue, I found one that may be interested to you: Abstract iterator type? · Issue #23429 · JuliaLang/julia · GitHub)

tisztamo · July 10, 2021, 1:36pm

I think that the term “incorrectly typed function” is not very useful in Julia. AFAIK it is considered best practice to use function signatures only for dispatch, and not for safety or other motivations of restricting.

The language is dynamic, thus certain problems manifest only at runtime. This is by design. Moreover, there is no separate compile time in Julia, code generation is just intermezzo of execution. “Only fails at runtime” is something most of us stop saying after a few months of working with Julia.

Yes, static analysis of the language is relatively hard, especially when you compare it to less dynamic languages. This is a sad result of runtime code generation, which is one major selling point of Julia. I hope that tooling will help more in the future.

Yes, proving correctness in Julia is even harder than in other languages. I personally think that proving correctness of nontrivial programs requires special languages and tooling, and is futile in any general-purpose language, so I don’t feel this a problem.

Please also note that structural subtyping is not a feature of Julia, at least in my understanding. Duck typing is.

marius311 · July 10, 2021, 8:28pm

Tricks.jl has a static_hasmethod which is compile-time. Composed with WhereTraits.jl, you can already do exactly what you want decently succincinly (and surely even better with a little more syntatic sugar on top):

using Tricks, WhereTraits

struct Foo end
struct Bar end

baz(::Foo) = 1

@traits f(x) where {static_hasmethod(baz, Tuple{typeof(x)})} = baz(x) + 1

f(Foo()) # 2
f(Bar()) # method error

and note the Foo path is completely inferred / optimized away (unlike it would be with just regular hasmethod):

julia> @code_llvm f(Foo())
;  @ Untitled-2:9 within `f'
define i64 @julia_f_3296() {
top:
  ret i64 2
}

jling · July 10, 2021, 9:27pm

it’s not incorrectly typed, it’s not typed at all. enumerate(iter) takes Anything.

But consider this is like saying for x in BLAH shouldn’t call iterate(BLAH) because Julia’s base can’t prove BLAH is iterable. Well Julia is a dynamic language after all and that’s an interface for developer/user to extend for their own types. (if BLAH is designed for such a thing at all)

jzr · July 10, 2021, 9:52pm

Afaik the reason you can do

julia> struct Foo end

julia> enumerate(Foo())
enumerate(Foo())

isn’t based on principle since it doesn’t have much (any?) benefit. It’s more just due to the technical reason that Julia hasn’t figured out its trait system yet.

I think Julia’s sparse use of type signatures is a significant problem for correctness, composability, and documentation.

I don’t agree with using

IMO, the solution is likely to be a trait system (like WhereTraits.jl), but there are some significant issues that need to be worked out before such a system can be widely deployed.

A related issue that Julia has is that it can’t currently represent higher order functions in the type system. For example, it’s not currently possible to write a signature for map. This makes it hard to write more sophisticated functional structures because there’s no help from the compiler and no signatures to guide the author.

jling · July 10, 2021, 10:26pm

IIRC Julia never intended to have typed function if you’re thinking about OCaml or a like, not just “currently”. Julia’s typing system is built for multi-dispatch, not for proof etc. in the realm of type theory. It doesn’t mean compiler doesn’t use the information to speed things up, it’s just that we’re a dynamic language with emphasize on technical computing after all.

jzr · July 10, 2021, 10:57pm

For future readers: some folks have been looking for (and working on) stuff like that

marvinvanaalst · July 11, 2021, 7:49am

Wow, thanks for all the input and the lively discussion!

I’m sorry, but math doesn’t change because of language design. Also the function fails for certain inputs, so even if you use it only for dispatch its still broken. So the type of that function just is problematic.

But if I’m writing libraries for other people that’s a situation I would like to avoid at all cost! Just because you can dynamically extend things doesn’t mean that they have to be broken statically. Especially since the latency in julia is so high compared to other dynamic languages, having the (code → JIT compile entire package → fail) cycle repeat just takes too much time. Especially for simple errors like name or type mismatches, in which case a type checker can give you immediate feedback for.

Cool, we are getting in the right direction! And now at static analysis
I quickly checked the source and it seems to be only defined for a single method. But it seems like it could be extended.

No, it is implicitly typed with Any and that type is wrong.

Base doesn’t need to prove BLAH is iterable, but base in my opinion should provide an interface of what “is iterable” means and then types have to follow that interface. In that way type checkers or the compiler can actually check whether the action you want to do is legal in the first place.

Ok, I’ve read the dynamic language part now multiple times, but there just isn’t any reason for a dynamic language not to provide static analysis at a certain point. If you are writing a library, static analysis is just incredible useful and in my experience greatly improves my productivity. In Python for example you can type-check your library with tools like mypy to iron out the bugs and at runtime keep on dynamically playing around with everything in an unchecked fashion. So the dynamic part just isn’t at fault. My gut instinct is that the reason all this static analysis is so hard in julia is not because its a dynamic language, but because of using multiple dispatch everywhere, even when its not necessary to do so, especially in combination with the include statement. But I might be completely wrong about that, that’s why I started this thread. Also some people here tend to get a bit allergic when someone says that multiple dispatch might not always be the best idea, so I tried to keep it out of this conversation so far to keep everything civil
Also please don’t read this as me disliking multiple dispatch. I think its a great tool for the tasks where you actually need it. But for the times where you don’t, I feel like it would make sense to evaluate whether the costs associated with doesn’t justify choosing less powerful alternatives for the task at hand. Just like you wouldn’t use a sledgehammer for a small nail
But please let’s not get too sidetracked on this discussion, the main one is interesting enough

Again, thanks for all the input guys!

jzr · July 11, 2021, 8:15am

Static analysis is newly available with JET.jl.

But I don’t think that’s the whole story. I want to have a well typed system (with clear interfaces), not just static analysis.

I also don’t think include or multiple dispatch harms much – or is even really related. Nor is the dynamism at fault.

marius311 · July 11, 2021, 8:17am

Just in case it’s not obvious, you can put any expressions in that where clause, so you can already do where {static_hasmethod(a,...) && static_hasmethod(b,...)} etc… With some helper functions, willing to bet it can be done pretty succinctly.

Not sure your familiarity with Julia but just since it’s your first post here (welcome to the community!) and since you mention this, make sure to check out Revise.jl which removes this cycle entirely except when you change the layout of types.

Also, fwiw, you might find the discussion in https://github.com/JuliaLang/julia/issues/6975 and links therein interesting.

tisztamo · July 11, 2021, 9:08am

A lot of very interesting topics came out here, thank you all for the links and thoughts!

Interestingly, I start to have the feeling that our vocabulary differs so much that it is hard to understand each other.

e.g. This great article talks about static typing, which is roughly using function signatures to restrict dispatch in the hope that errors can be found without running the program:

On the other hand static_hasmethod is semantically equivalent with hasmethod, it is only called static because it is compiled away thus has no run-time overhead. That is an entirely different type of staticism.

The retval of static_hasmethod cannot be inferred without executing the program completely, because adding new methods to the queried function will trigger recompilation of the call site and change the returned value.

So I double down on that the main problem is dynamism in the sense of dynamic code generation and code loading, but it would be great if you could persuade me that’s not the case.

jzr · July 11, 2021, 9:21am

It would be interesting to know if JET can reason about static_hasmethod.

marvinvanaalst · July 11, 2021, 9:44am

Silly me, I completely missed that. Thanks for pointing that out!

Yes, this is exactly what I am looking for. I don’t care much about restricting doing stupid stuff at runtime, as I like to have freedom there, but I would like to be able to do those easy checks quickly while writing libraries.

jling · July 11, 2021, 10:42am

next thing you gonna tell me print() shouldn’t take Any either because we don’t know something is showable because we don’t have an interface. It’s just Julia is not Rust/C#/Swift.

not sure what you mean, our typing system is dynamic, period. Static analysis can be provided by external tool (see JET.jlmaybe?), but the typing system of the language is jut NOT static so you don't want to lock down things likeiteratebecauseBase` is not a compiled and statically linked library.

marvinvanaalst · July 11, 2021, 11:19am

No, print is perfectly ok with Any as the fallback show or more specifically _show_default is defined for Any. I just dislike having Any for those functions where it simply isn’t true.

CameronBieganek · July 11, 2021, 3:39pm

@marvinvanaalst Welcome to the Julia Discourse!

Iterability is a trait, not a type. There are many unrelated types that are iterable.

Iteration is a programming concept, not a math concept. The closest thing to iteration in math is recursive definitions, but they’re not quite the same thing as the programming concept of iteration.

If you want to fully prove the correctness of your code, you will need to use a language with dependent types, like Lean, Coq, Agda, or F*. None of the mainstream programming languages provide this capability.

As @tisztamo noted, semantically there is no compile time, only runtime, because Julia is a dynamic language. The JIT is an implementation detail. So it makes no difference if we move errors from runtime to “compile time”—you will still only see those errors if you run your code (or run unit tests).

If we run the example from @jzr, we get a runtime MethodError from the iterate function:

julia> struct Foo end

julia> for (i, x) in enumerate(Foo())
           @show i
           @show x
       end
ERROR: MethodError: no method matching iterate(::Foo)

Now suppose I add my own isiterable trait, as follows:

isiterable(::T) where {T} = isiterable(T)
# Default to non-iterable:
isiterable(::Type{T}) where {T} = false

function traitful_enumerate(iter)
	if !isiterable(iter)
		throw(ArgumentError("The object provided is not iterable."))
	end
	enumerate(iter)
end

julia> traitful_enumerate(Foo())
ERROR: ArgumentError: The object provided is not iterable.

I’ve traded a runtime MethodError for a runtime ArgumentError. So I’ve added code complexity for only a very minor gain in the readability of the runtime error. It’s already pretty clear that the error

ERROR: MethodError: no method matching iterate(::Foo)

means that Foo is not iterable, so changing it to an ArgumentError with a slightly different error message doesn’t really change much. I think this is the reason why Base Julia does not provide an isiterable method—because it adds one more method to the interface for iterables but doesn’t provide any real value.

CameronBieganek · July 11, 2021, 4:42pm

In general, Julia takes a “let it fail” approach to generic programming. I think it would be pretty onerous if I had to write functions like this:

function foo(x, y) where {isaddable(x), isaddable(y)}
    x + y
end

This would quickly get out of hand:

function foo(u, v) where {
		isiterable(u),
		isiterable(v),
		isaddable(eltype(u)),
		isaddable(eltype(v))
	}
	map(u, v) do a, b
		a + b
	end
end

Topic		Replies	Views
Using types to enumerate New to Julia	5	553	February 13, 2023
How do I define a function whose argument is iterable? New to Julia question	5	905	July 3, 2020
Using iterator in Julia 0.7 General Usage	2	634	June 8, 2018
Define iterator New to Julia iterators	11	1025	February 18, 2022
Super type for all things iterable? General Usage	5	681	November 20, 2018

Structural subtyping

Related topics