How can we maintain easy generics in the face of demand for a “safe” language?

Julia is moving in the direction of having a static sub-language and even things like enforceable interface/etc. However, if we are not VERY careful, we can lose one big advantage that Julia has, easy generic.

Duck typing allows easy, intuitive generic code, for example, if you want to print foo, you do

function print_foo(x)
println(foo(x))
end

However, with strict interface, here goes something like…

function print_foo<fooable T>(T x)
printable foo_of_x = foo(x)
println(foo_of_x)

And it errors out anyway because how do you know that foo(x) is printable?
If you specify that foo must return a printable type… well… issues again, because what if you want to foo other types? Or you wanna do something like…

fooable<printable> 

And then it gets complicated. What if the foo of the object is another type which is itself also fooable and so on?
And then there are all sorts of abstract algebra that can solve problems, but introduce lots of complexity or unfamiliarity.
We already have safe languages like Rust/etc.

What Julia offers is fast, generic, easy code. That is something I believe we should hold on for dear life. A fully type-inferred Julia is a lot like GC-ed, convenient C++. Both languages have lots of supports for fast generic. C++ has sfinae/decltype/etc. Julia has dynamically typed semantic that compiles down to static code with multiple dispatch. C++ has template. Julia has parametric types.

So, with that in mind, how do we address the safety of the language then?

P.S. This topic might be too broad. Once we get the directions going, we can split this thread into multiple threads and close this thread down. I’m just concerned about the future of Julia generic.

4 Likes

I don’t know enough about the ongoing interfaces debate, but as an avid end user I pray that Julia doesn’t wind up requiring the kind of clunky syntax I often see in modern C++ and other similar languages.

5 Likes

There’s not too much sense in preemptive hand-wringing here. There’s not even a concrete pull-request or proposal for what interfaces might look like.

19 Likes

Static interfaces are not at odds with generic code. In fact, the opposite is true: interfaces are how you make generic code safe.

Consider

function mynorm(v::Vector{Float64})
    r = 0.0
    for i = 1:length(v)
        r += v[i]^2
    end
    return sqrt(r)
end

This is not generic, but perfectly safe, as you know all the properties of a Vector{Float64} (not that I’m saying this is elegant code). The generic

function mynorm(v)
    r = 0.0
    for i = 1:length(v)
        r += v[i]^2
    end
    return sqrt(r)
end

on the other hand is very very buggy, as it makes unspecified assumptions about v (that it is a one-based AbstractVector of numbers).

The point of interfaces is to allow you to specify all the assumptions that you make about the inputs to your function and about the return values of any function you might call. Ideally, in a way that the compiler or a static code analyzer can enforce.

Without this, you are left with putting your assumptions in your documentation, or worse, not being aware of your assumption. Such “generic” code has an unavoidable potential for correctness bugs. In fact, that hidden assumption for one-based AbstractVectors has been one of the major criticisms of Julia whenever the issue of correctness has come up.

I would note that your example with println is not a good one, as the Julia language guarantees the printable for all types.

11 Likes

That idea could be true in theory. In practice, interface often means you make an overly conservative assumption, or that you need some sort of concepts to capture all sorts of complex assumptions, which require abstract algebra. For instance, there are some things whose assumptions are the associativity of X over operator op where X and op can be anything.

That idea could be true in theory.

Very much true in practice, and one of the top things people complain about whenver Julia makes it to Hacker News or some similar forum where people like to complain a lot.

In practice, interface often means you make an overly conservative assumption.

No, you declare exactly the assumptions you make in your code. If anything, I find people tend to have difficulty with making implicit assumptions explicit, not the other way around. Including myself: I had to edit my post to add that the eltype of v has to be a number.

For instance, there are some things whose assumptions are the associativity of X over operator op where X and op can be anything.

If your code assumes associativity, then you better well make sure that your inputs have associativity. If you don’t, you have a correctness bug. Whether some future traits system in Julia can express that kind of condition is another question (although those algebraic traits don’t seem like they would be very far out). Meanwhile, you have to rely on tests and explicit verification, and a judicious use of @assert statements.

2 Likes

They do often design things too conservatively sometimes, even if they don’t know it. For example, does the Java type have the information that the foo of bar of baz of something can be multiplied by an integer if baz of something is not always the case? Then you have monad/etc pattern to capture concepts.

Regarding people making assumptions too conservative. Here’s the thing. They often do. For example, one not knowing abstract algebra might think it is generic enough and assume all who use it are those who use numbers, even though the algorithm may apply to all algebraic rings.

It’s a reasonable guess, but strictly speaking it actually doesn’t make those assumptions. What mynorm(v) presumes is 1) length(v) is supported, 2) getindex(v, i) is supported for i in 1:length(v), and 3) ^(v[i], 2) is a number that can be added to r::Float64. v does not need to be an AbstractArray, let alone 1-based. For example, you could easily replace v with an AbstractDict. getindex could give a default value for absent keys. ^ could take any type as the first argument as long as it processes it to most real numbers.

That also isn’t the typical way to generic-ize mynorm(v::Vector{Float64}). You would do:

function mynorm(v::AbstractVector)
    r = zero(eltype(v))
    for i = eachindex(v)
        r += v[i]^2
    end
    return sqrt(r)
end

No need to assume 1-based indexing or numerical element types, and the dispatch will exclude non-vector types. You still can’t infer much from the method alone statically, you really need a call signature.

3 Likes

Yeah… though you actually didn’t need an abstract vector type either. A dict would also do.

You could, but in my opinion it’s not sensible to expand to non-vectors from Vector{Float64}. Then again, it may not be sensible to expand to nonnumerical element types from Float64. That can be taken care of with function mynorm(v::AbstractVector{T}) where T<:Number, and that can replace the eltype(v) call with T.

I think it’s sensible to expand to tuples for example, in case someone uses tuple to represent a fixed size vector. (or dicts to represent vectors whose dimensions are named.)

Julia should aspire to look like Swift, not C++.
https://docs.swift.org/swift-book/documentation/the-swift-programming-language/generics

Swift has had a very basic numerical package for 5 years, but it hasn’t improved much since its release.

The swift language still requires you to write the allowed operation in advance, which can get clunky in complicated cases. C++ template allows you to generate as many instances of functions as you need.

Ok, fine, people both over- and under-estimate the implicit assumptions in their code :slight_smile: I don’t want to get too hung up on the (contrived) norm example, and I’m aware of eachindex and other ways to safely write more generic code. The point I was trying to make is that naive “generic” code has the potential for pretty catastrophic correctness bugs.

When you say

For example, one not knowing abstract algebra might think it is generic enough and assume all who use it are those who use numbers, even though the algorithm may apply to all algebraic rings.

it makes me a little nervous. Being “conservative” is just good defensive coding practice. It’s better to be safe than sorry: don’t assume code is more generic than what you’ve tested it against. If I was mentoring a Bachelor student with little coding experience, I would absolutely tell them to annotate all their inputs as e.g., Vector{Float64}, rather than AbstractArray or leaving the code un-annotated. Even for myself, if I’m prototyping or writing one-off code (with no tests), I’d use the narrowest possible type annotation and try to add @assert statements for any implicit assumption.

At the time that code makes it into a library for more general use, hopefully it’ll have been reviewed and tested more thoroughly and can be more general. But even then: being a little conservative is still good. Don’t promise AbstractArray if you haven’t thought about and tested the code with OffsetArrays.

If there’s some library you want to use in your project, and it turns out that they were too conservative in annotating their inputs, that’s not really a problem: Open a PR that adds tests for the types you want to call their functions with and relax the type annotation. That works out fine for the ecosystem. On the other hand, a widely used library that overpromised AbstractArray but then silently returns wrong results when passed a zero-based array is a major problem and has given Julia a hard-to-shake bad reputation.

Nobody wants excessive boilerplate code like C++ and Java (and I don’t think anything like that would ever get off the ground to make it into a future Julia 2.0). But a good traits-system with automatic inference, whatever that may end up looking like, would be a major help in safely writing more generic code. In fact, my gut feeling about a traits system is that it would result in less boilerplate, not more. Traits are potentially a lot more expressive than types, and open the door to a whole new level of static code analysis. And Julia desperately needs more and better static code analysis tools.

What Julia offers is fast, generic, easy code. That is something I believe we should hold on for dear life.

Nobody would disagree with that, but enforceable interfaces are what makes such code safe. What you’re worried about is boilerplate, but traits/interfaces don’t necessarily imply boilerplate code, and Matt’s point about preemptive hand-wringing applies until we have something more concrete on how a specific traits implementation might work.

5 Likes

In your examples you didn’t annotate return types and I know the manual discourages it in favor of type stable code. Isn’t that a quick and easy way to add more safety and for static analysis?

Yes, I’m actually not sure why Julia discourages return type annotations as much as it does. Not that I’ve used them myself much. I guess partly because the syntax is cumbersome (hard to format within a maximum line width, whether that’s 92 or my preferred 80). I do share the gut feeling that more return type annotations would probably be better. :person_shrugging:

2 Likes

That’s true, generic code isn’t as easy as just widening argument annotations. If you can reasonably determine that you won’t need your code to be generic, then it shouldn’t be. Annotate Vector{Float64}, do all the 1-based indexing math.

Lots of base Julia code like in LinearAlgebra require 1-based indexing and throws an error otherwise, it’s done by a call in the function body that can be resolved at compile-time. So that’s an extra limitation to account for or incorporate.

A method doesn’t have a fixed return type by default, its various specializations often depend on the argument types. Annotating a return type usually ends up as a convert step for most of those specializations, but it’s more often desired to avoid such overhead and to vary return types, hence type stability. The good thing is convert steps are optimized away if the compiler proves it unnecessary, so you can annotate return types just for documentation if you don’t design the method to be type-generic, at most having the return type match some type parameter. Downside of that is edits to the method that changes types or introduces instability can be masked by the convert step, so tests would need to include internal calls as well. That is safety in a sense because the edited code keeps working with some extra overhead.

Ah, right, I forgot that return type annotations are convert statements, not type checks, making them much less useful.

2 Likes

Yep, type checks are done in tests of calls. To reiterate, method signatures have insufficient information, we need call signatures to check type correctness.

I should also mention it introduces a typeassert step too because convert isn’t guaranteed to result in the target type (usually fails with an error). typeassert is what informs the compiler.

1 Like

In a sense, Julia already has interfaces. Its just that they are informal (ie duck-typing) and it is hard to know when you have satisfied enough of the informal interface to be a plug in replacement for the canonical implementation. This can be frustrating and buggy when attempting to build a drop-in alternative. Some of the informal interfaces (like Arrays and Tables) are well documented and the community has built many alternative implementations. It would be great to have a formal way to specify interfaces in Julia if just simply to define what it really means to be a drop-in replacement.

I could imagine a simple interface definition that didn’t do more than codify what is already being done informally. This would be a great bonus for the language and make it much more predictable (ie safer).

2 Likes