recent broadcast changes (iterate by default), scalar struct, and `@.`

As Matt said above, yes we could use applicable for this. There are multiple possible designs for more formal interfaces, and yes some of them correspond to applicable checks, but I’m not sure how likely we are to implement one of those designs.

Somewhat orthogonally, it would be nice to have a test suite for interfaces: e.g., test whether an object that is supposed to support an interface (<: AbstractArray, iterate, broadcasting, …) actually does, using some simple checks. Some interfaces are nontrivial to define and test.

Fully agreed.
But as I said in “Broadcast had one job (e.g. broadcasting over iterators and generator)” (Issue #18618, JuliaLang/julia on GitHub):
why not a linter warning/error for f.(x, y, z), such as “all arguments of f broadcast as scalars”,
if x, y and z are actually all scalars?
Wouldn’t this solve your example and provide a sensible rule?

No, a warning does not solve anything. Once you see the warning, how do you change your code to make the warning go away, and select either scalar or iteration behavior? Warnings are an indecisive and unsatisfying approach to design problems. We’d basically be saying that we picked a certain design, but don’t actually like it, so there’s a warning when you use it. I also believe our current approach is entirely sensible, consistent for example with multi-argument map.

Please believe that I’m not contesting the design.
Julia is amazing; I trust the devs’ intuition and design skills enough to accept
that there are other good reasons for the chosen behavior.

I wrote “warning/error”, thinking of:

  • a warning if more dangerous broadcasting were permitted (allowing generic code),
    so the warning would be off by default, with the option to turn it on for debugging.
    I’m not arguing for that at all; it is just a possibility (I’m well aware of the strong arguments against it).
    Sorry if that made my point unclear.
  • an error if we want to be strict and catch fully no-op broadcasting.
    This would, in my opinion, solve your sample case by catching mistakes,
    while still allowing some scalars in the broadcast call
    (provided at least one argument is broadcastable).
    To answer your question: in case of an error,
    I would just fix the code, for instance by removing the dot,
    or I would know that something is wrong with at least one argument.

I know it’s just one corner case, but broadcasting over all scalars is well-defined and might as well be allowed.

That handles half the problem, but nobody is likely to write f.(x) when they actually just meant f(x). If you want to iterate over x, you’d need to switch to map(f, x), which is unfortunate since we want to be able to use dot syntax for that.

Boiling it down, I think I’m just dissatisfied that we have to explicitly opt into the broadcast mechanism one way or the other. Every object that I pass into a broadcasted function must be explicitly opted in as a container or a non-container, so it is no longer a binary decision; it’s a ternary decision with the states “container”, “non-container”, and “error”.

I agree that we want “the right thing” to happen when we use the .( syntax. I agree that there are tangential reasons why numbers, chars, etc. work with iteration. This actually makes me doubt that iteration and “is a container” are congruent ideas, however: I would not normally think that f.(1.0) should work, because I don’t understand why a single number should be iterable in the sense of being a container. (This comes from a programming perspective, not a mathematical one; I can see mathematical arguments that numbers are points within a space and hence zero-dimensional containers.)

The motivating examples I have can, of course, always be worked around by wrapping things in Ref() if they are an Expr, so this is not stopping work from happening. It’s more that things work pretty well in the numeric world because, as Matt Bauman states, “An error seems sensible until someone makes a strong case for it to behave one way or another and either explicitly enables iteration or a scalar-like broadcastable definition,” and these cases have been made for a blessed subset of types.

I personally would rather that we make the strong case for something to be iterable, and everything else gets treated like a scalar.

Matt, I’m afraid I don’t understand this. In my proposal, everything that is a container (e.g. has a Container trait, or implements the Iterable interface or something like that) would get broadcast over, and if you didn’t want that you would have to explicitly disable it via Ref() or something similar to that. There would be no way to take a scalar object and force it to be container-like, because that doesn’t make any sense.

The point is, if “scalar” is the default, it is quite possible to have an iterable object that didn’t opt-in to broadcasting (neglected to define the trait) and hence is treated as a “scalar”. It doesn’t seem feasible that any conceivable container type could be detected automatically. Then the only way to “force it to be container-like” for broadcasting would be to call collect to copy it to an array.

The converse, using Ref to force an object to be treated as a “scalar” (i.e. not a container), doesn’t involve copying large objects. If we allow &x to create a Ref in the future then the syntax for “escaping” broadcasting will be even cleaner.
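
As a minimal sketch of that call-site escape, assuming Julia 1.x (the names `allowed` and `xs` are made up for illustration):

```julia
# `allowed` is a container, but here it should act as a single value
# on the right-hand side of `in`, so it is wrapped in a Ref.
allowed = [1, 2, 3]
xs = [1, 2, 5]

# Ref(allowed) only wraps a reference; `allowed` itself is not copied.
mask = in.(xs, Ref(allowed))
```

Without the `Ref`, broadcasting would try to pair the elements of `xs` and `allowed` one by one instead of testing each `x` against the whole collection.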

Of course, in the current implementation, the default broadcast implementation for iterables does call collect, but it’s possible that more cases could be handled without a copy in the future. (For example, it seems to me that the single-argument case f.(x) could probably call something like map?)
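
A small sketch of what that fallback looks like from the outside, assuming Julia 1.x, where the generic method is roughly `Base.broadcastable(x) = collect(x)` (the variable names are illustrative only):

```julia
# A lazy iterator that is not an AbstractArray and defines no broadcast rules.
it = Iterators.take(1:10, 3)

# The generic fallback copies the iterator into an array.
collected = Base.broadcastable(it)

# Broadcasting over the iterator goes through that same fallback.
doubled = (x -> 2x).(it)

# For a single argument, map over the iterator yields the same values.
mapped = map(x -> 2x, it)
```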

(All this being said, I was long an advocate for defaulting to scalar, so I sympathize with that argument: it’s basically a balance between the number of types that need to opt-in vs. the number that need to opt-out, and it seems like there are more of the latter. I recognize the arguments on the other side, however, and in any case there is no way to pick a default that will please everyone.)

I don’t understand this, is being broadcastable and being iterable not synonymous? I.e., if it’s a collection of some sort I should be able to broadcast over it if I can iterate over it? Sure, there might be some collections where you intentionally can’t iterate over them (although I can’t think of an example at the moment), but why is assuming that you want to iterate over them better than saying “ok, they don’t implement the iterable interface and are thus treated like a scalar”? As I understand it, being iterable is already opt-in by implementing that interface…

If we had a trait that collections/iterables had to implement, we could detect all of them, and require them to implement broadcast.

It currently is synonymous by default — but it’s leading to these behaviors that folks object to. The best way to see if something is iterable is to try to iterate it (and that was really the only possible way on 0.7). Thus there’s a proposal to use a trait or some extra definition in addition to iterate that iterables should define to declare their iterableness, allowing everything else to be “scalar”. That’s where you get into trouble, though, because these two definitions can be out of sync. And they were effectively out of sync quite a bit on 0.6.

How and when? Have we ever done this before?

How: with a trait (as I said). We already use traits to indicate whether iterators have a shape and whether they have an element type. So we could have a trait indicating whether something is iterable at all. We would treat all other types (those that don’t already implement broadcast explicitly) as scalars, which is non-breaking, and keep the existing behavior for those that implement the trait.
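
To make the idea concrete, here is a rough sketch of how such a trait could look. Everything in it (`Iterability`, `IsIterable`, `NotIterable`, `iterability`, `my_broadcastable`, `Point`) is hypothetical and only modelled on the existing `Base.IteratorSize` / `Base.IteratorEltype` pattern; none of it exists in Base:

```julia
# Hypothetical trait types; not part of Base.
abstract type Iterability end
struct IsIterable <: Iterability end
struct NotIterable <: Iterability end

# Default: a type that has not opted in is treated as a scalar.
iterability(::Type) = NotIterable()
# Containers opt in by adding a method for their type.
iterability(::Type{<:AbstractArray}) = IsIterable()
iterability(::Type{<:Tuple}) = IsIterable()

# Hypothetical broadcast hook that dispatches on the trait.
my_broadcastable(x) = my_broadcastable(iterability(typeof(x)), x)
my_broadcastable(::IsIterable, x) = x        # broadcast over the elements
my_broadcastable(::NotIterable, x) = Ref(x)  # treat as a single value

# A bare struct that never mentions the trait.
struct Point
    x::Int
end
```

With these definitions, `my_broadcastable(Point(1))` wraps the struct in a `Ref`, while `my_broadcastable([1, 2])` returns the array unchanged. An iterable type that forgets to define `iterability` would silently be treated as a scalar.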

Maybe we should wait for 2.0 to actually require defining the trait as part of an official collection/iterable interface, but some parts can be introduced in 1.x.

This would be similar to having enforceable interfaces, but without actually enforcing the implementation, right?

@nalimilan Oh, I misunderstood what you were detecting. I thought you were detecting if the trait had been defined consistently with the existence of the iterate methods. That’s the crux of the problem. Such a trait needs a default value.

I think the only realistic way to do this is to make it totally automatic, either by calling applicable or adding a new trait system that’s significantly better than what we do now (which would either not require manually adding definitions, or would add enough value that it would be worth changing code).
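
For reference, a sketch of what the `applicable`-based variant might look like. `Bare` and `auto_broadcastable` are hypothetical names used only for illustration, and this is not how Base behaves today:

```julia
# Hypothetical user type with no iterate method.
struct Bare
    x::Int
end

# Hypothetical helper: pick container or scalar treatment automatically.
auto_broadcastable(x) = applicable(iterate, x) ? x : Ref(x)

has_iter_array  = applicable(iterate, [1, 2, 3])  # arrays define iterate
has_iter_bare   = applicable(iterate, Bare(1))    # no iterate method for Bare
has_iter_number = applicable(iterate, 1.0)        # numbers iterate as one element
```

Note that this check only asks whether a method exists, so it can give the wrong answer for wrapper types that forward `iterate` to a field.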

I don’t think this is comparable to the existing iterator traits, since those are basically tweaks. Things generally work without them.

Adding a vote for having bare structs be treated as scalars.

This should just work without having to explicitly wrap the struct in anything:

struct Foo
    x::Int
end
a = Vector{Foo}(undef, 2)
a .= Foo(2)

I understand from this thread that there are big design choices and that it may take some time to find the right solution. But the current behavior is non-intuitive.
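
For comparison, a minimal sketch of the opt-in that currently makes this work in Julia 1.x, where the type's author extends `Base.broadcastable` to declare it scalar-like:

```julia
struct Foo
    x::Int
end

# Opt Foo in to scalar-like broadcasting by wrapping it in a Ref.
Base.broadcastable(f::Foo) = Ref(f)

a = Vector{Foo}(undef, 2)
a .= Foo(2)   # fills every slot of `a` with Foo(2)
```

Without the `Base.broadcastable` method, the last line throws an error because broadcast tries to treat `Foo(2)` as a container.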

As a newbie to Julia, this was a surprise to me too. I often want to apply a function to every element of an array, and pass a second argument at the same time. In Python I’d use

[foo(x, y) for x in xs]

Now foo.(xs, y) works great when y is a builtin type, but for my own types I need to either wrap them in Ref or define broadcastable for all of them. Both feel like a pain for such a common operation. Am I missing some Julia idiom?

There was discussion at some point about making &x syntax sugar for Ref(x), but I think right now an explicit Ref is the best you can do (or a one-element tuple, (x,)).
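
A short sketch of those call-site options side by side, assuming Julia 1.x. `Config` and `foo` are hypothetical stand-ins for a user-defined type and function:

```julia
# Hypothetical user type and function.
struct Config
    scale::Int
end
foo(x, cfg::Config) = x * cfg.scale

xs = [1, 2, 3]
cfg = Config(10)

via_ref   = foo.(xs, Ref(cfg))          # wrap the struct in a Ref
via_tuple = foo.(xs, (cfg,))            # or in a one-element tuple
via_comp  = [foo(x, cfg) for x in xs]   # or use a comprehension, as in Python
```

All three produce the same vector; the comprehension avoids broadcasting altogether.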

Thanks for confirming @yurivish!

Reading the various GitHub threads on this, @mbauman’s post here explains most clearly why things are this way:

In short, there are four options that avoid the incorrect fallback:

  1. Require everything to implement some method that describes how they broadcast.
  2. Default to treating things as containers and error/deprecate for non-containers.
    • We will just try to iterate unknown objects and that will error for scalars
    • There are two escape hatches for scalars — users can wrap them at a call site and library authors can opt-in to unwrapped scalar-like broadcasting.
  3. Default to treating things as scalars and error for unknown containers
    • Given that there are no relevant methods only defined for scalars, we’d have to assert that iterate throws a method error. That is slow and circuitous.
    • There would only be one escape hatch available for custom containers to not error: their library authors to explicitly opt-in to broadcast. This seems quite backwards for a function whose primary purpose is to work with containers.
  4. Check applicable(iterate, …) and switch behaviors accordingly
    • This currently doesn’t work due to the deprecation mechanism from start/next/done, and in general could be wrong for wrapper types that defer methods to a member.

I think I understand the reasoning against (3): when scalar is the default, to know that an argument should be treated as a scalar Julia needs to test if iterate exists, and that’s slow. And any amount of slowness in broadcasting outweighs a whole lot of syntactic pain.

But man is that syntactic pain frustrating. Knowing nothing about Julia’s internals, I also like the default-structs-are-scalars idea, but I assume there’s some trickiness about doing that which has stopped it happening.

Yup, I’ll also note that the broadcast redesign occurred relatively late in the 0.7/1.0 release cycle. This design was a conservative “grand bargain” that was the culmination of lots of discussion. Erroring here — while annoying — was a strong feature in coming to that consensus. It bought us time and reserved space so that it’s possible to potentially implement a better default (in place of the error) as a non-breaking feature in 1.x.
