The strangeness (or not) of * as string concatenation

Operators maybe shouldn’t themselves be considered methods, and shouldn’t be expected to have the same behaviors on all types that use them. They feel more like syntax, and could operate that way.

Base even takes advantage of the same syntax doing wildly different things depending on type: e.g., A[1, 2] is both getindex and typed concatenation, depending on whether A is a type or not. Which means there’s actually a getindex method in Base that does not get at an index. :upside_down_face:
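
That method is easy to confirm in the REPL (output abbreviated):

julia> @which Int[1, 2]
getindex(::Type{T}, vals...) where T
     @ Base array.jl

julia> Int[1, 2]    # no indexing happens; a Vector{Int} is built
2-element Vector{Int64}:
 1
 2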

IMO the lowering should not be to getindex, but rather to something like prefixed_bracket which then dispatches to getindex for the right types. And there’s not anything wrong with that, because there is no real meaning attached to what prefixing a bracket should do.

There’s no inherent meaning to asterisking something or slashing it, either. So in principle, I don’t see a problem with operators/syntax doing different things for different types. They should absolutely have the same meaning within a type tree, but the guarantees Julia implies could end there.

So I say let / split strings and join path objects! If you want accuracy with a consistent contract when you call it, use an appropriately named method and pay the cost of extra typing.

4 Likes

I see a hierarchy of overloading

  • no polymorphism: f(x) and f(y) can be defined only if both use the same assembly code
  • parametric polymorphism: f(::A) and f(::B) can be implemented only if both use the same Julia code
  • different code implements f(::A) and f(::B) but both methods satisfy a shared contract cf(_)
  • YOLO overloading: f(x) and f(y) can be implemented if they have similar vibes or use the same English word

Needing different operators for float vs int multiplication is an option for any language (ML does that) but it’s not a good choice for Julia imo. But I’d draw the line before YOLO overloading. (Related literature: “How to make ad-hoc polymorphism less ad hoc”.)

What cases do you have in mind? I’m skeptical because “is an element of” and “is a contiguous-subsequence of” don’t share any specification afaict—that they both use “in” is just an imprecision of natural language, not reflected in any formal description of the functionality.

is_element_of(y, xs) = any(==(y), xs)

is_substring_of(y, xs) = any(==(y), substrings(xs))
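
(substrings here is not a Base function; a minimal sketch that makes the second definition runnable, generating every contiguous substring:)

# Hypothetical helper, not in Base: every contiguous substring of s.
function substrings(s::AbstractString)
    idxs = collect(eachindex(s))    # valid character indices
    (SubString(s, i, j) for i in idxs, j in idxs if i <= j)
end

is_substring_of("el", "hello")    # true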

I agree. Multiple dispatch requires taking semantics seriously.

Regarding DW’s previous comment:

Other languages don’t use multiple dispatch and therefore do not need to take semantics seriously.

Without knowing the types it is applied to, + means something. For multiple dispatch to scale, all functions that call + on arbitrary types must rely on that meaning – and not on an arbitrary implementation of + for some particular types.
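
A tiny illustration with a generic reducer written against the arithmetic meaning of + (hypothetical example):

total(xs) = reduce(+, xs)

total([1, 2, 3])         # 6
total(["a", "b", "c"])   # MethodError today; would silently return "abc" if + meant concatenation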

If Julia had chosen + instead of * for string concatenation, then one of the most basic operations in the language would silently violate the assumptions generic code makes about +, even though multiple dispatch is the biggest design choice of the language.

In my mind, a string behaves like an ordered set of characters, so I would use a ⊆ b for occursin(a::String, b::String), and reserve a ∈ b for the case a::Char and b::String.
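
A sketch of that convention (note that Base’s generic issubset already treats strings as character iterables, so redefining ⊆ for Base types like this would be type piracy; strictly a thought experiment):

# Sketch only; c::Char in s::String already works in Base.
Base.:⊆(a::AbstractString, b::AbstractString) = occursin(a, b)

'e' ∈ "hello"     # true (Base behavior today)
"el" ⊆ "hello"    # true under this sketch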

2 Likes

For multiple dispatch to scale, such functions cannot make overly strong assumptions about adherence to an unspecified and unenforced contract, such as Base.:+ always being commutative. I think we established above that it’s fine for reducers to assume this for SIMD-able number types, but not in general.

No, that’s just not true. It makes absolutely no difference for this kind of argument whether a language is single-dispatch (like Python) or multiple dispatch. In single-dispatch languages the only difference is that for a + b, the first argument a controls the implementation of +, by default (in Python, via the __add__ dunder method). The expression a + b is exactly as generic in Python as it is in Julia, and you cannot say anything about what the result of a + b is unless you know the type of a and b. This does not lead to any kind of problems.

It means pretty little: “addition operator”, according to the docstring in Julia. There’s really not much of a contract implied there for what “addition operator” means. As discussed previously, you can’t even blindly assume that a + b is commutative. I mean sure, you can have a strong convention that + should be limited to mathematical summation of number-like objects (the details of which still entirely depend on the exact type of the objects). I just find that languages that use more “operator punning” (use + for more than just numbers, with meanings different from just “mathematical summation”) end up being more pleasant to use. But that’s a matter of opinion. Either way, the situation is exactly the same for Julia and Python. If a + b being implemented for strings doesn’t cause problems in Python, it wouldn’t cause problems in Julia, either.

And + is a trivial function. For less trivial functions, I would say it is impossible to come up with a generic function docstring that encompasses all the methods for all possible types of arguments in the entire ecosystem. It is completely normal (also in core Julia and the standard library) for method docstrings to augment the function docstring, specifying additional properties, or even deviating significantly from the generic docstrings. That’s why methods can have docstrings, not just functions. Of course, it’s a good guideline to write methods that don’t need docstrings (that just implement the behavior specified in the generic function docstring).

Arguably, Julia has to worry about this issue less than Python, though: In Python, type annotations are still completely optional. So, if you’re not using a type checker, and you’re writing a function f(a, b) that contains a + b somewhere inside of it, you cannot make any assumptions whatsoever on what a + b evaluates to. In contrast, in Julia, you at least have the option to declare that as a method, function f(a::Float64, b::Float64), and then you do have a very clear contract about a + b. Or, you could say f(a::Number, b::Number), and you’d still have a reasonably good contract (it’s a mathematical sum), but you wouldn’t be able to assume, e.g., commutativity. And, of course, you could have f(a::AbstractString, b::AbstractString), and that would be completely safe if +(a::AbstractString, b::AbstractString) was defined, even if that method was completely different from the generic +.

If we had a type system that was more powerful and included trait-like features, f(a::Number<:CommutativePlusTrait, b::Number<:CommutativePlusTrait), the entire problem would be pretty much solved.
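
Short of that, the usual workaround today is the Holy-traits pattern. A minimal sketch (all names hypothetical):

# Trait declaring whether + may be treated as commutative for a type.
abstract type AddCommutativity end
struct CommutativeAdd <: AddCommutativity end
struct UnknownAdd <: AddCommutativity end

AddCommutativity(::Type) = UnknownAdd()              # conservative default
AddCommutativity(::Type{<:Real}) = CommutativeAdd()  # Reals opt in

# A reducer that only reorders terms when commutativity has been declared:
mysum(xs) = mysum(AddCommutativity(eltype(xs)), xs)
mysum(::CommutativeAdd, xs) = sum(xs)    # free to reorder / SIMD
mysum(::UnknownAdd, xs) = foldl(+, xs)   # strict left-to-right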

Also, maybe we’re forgetting to state the obvious: Using * for string concatenation is just as much a violation of the generic multiplication operator as + would be a violation of the generic addition. And indeed, * has a whole bunch of different method docstrings that deviate from the generic “Multiplication operator”. Going back through the ten-year-old discussions on GitHub about whether or not to change * for string concatenation, that was exactly Stefan’s position at the time. I think that’s a perfectly respectable and consistent position to have. I don’t share it, but it’s a question of mentality, not of any objective truth.

So, basically, if the argument is, “Let’s be conservative about operator punning, and not overload * or + for strings, but let’s use ++ or .., or whatever”, I’d say “Sounds good!”

If the argument is, “We’re going to use * for string concatenation, no reason”, then my response is “Fine. That might take a little getting used to, and I’m not sure it’s great that Julia is the only language using that operator. Maybe it’s better to borrow an operator from any of the n most popular languages. But it doesn’t really matter.”

If the argument is “We’re going to use * for string concatenation; footnote: because of non-commutative monoid” (the situation we’re in), then that’s going to raise an eyebrow. The healthy response would be the same as for “no reason”: accept it, move on with your life (the answer to the original topic).

If the argument is “* is the only appropriate operator for string concatenation, because of abstract string algebra in theoretical computer science”, then I’m starting to have serious problems. Because then we’re in the territory of “I need to understand abstract algebra concepts to work with Julia / Julia is only for PhDs”, which is not a good place to be in.

And if you then go further and say “because we have *, abstract algebra demands that we also have / (despite there being no multiplicative inverse for strings)”, then I think we’ve totally lost the plot, and we end up with “Oh, then we must also have the same algebra with * and / for Path objects”, and then p"folder/sub" / p"sub" ends up meaning the exact opposite of what it means in literally every other programming language that has Path objects.

3 Likes

I can be the first, and it’s an opinion I’ve heard often. If people learn the substring test for Python’s in before the membership test in the general case, they tend to assume it’s a subsequence test in general, because Python is often misdescribed as a language where users don’t have to think about types. The very common question “how do I find a sublist” often starts by running into something like this:

>>> "el" in "hello"
True

>>> list("el") in list("hello")
False

I personally would prefer these to be different functions, and it would be more in line with other string processing methods like the similar str.find. To be fair, different purposes for more specific types aren’t that unusual in general, and they’re well documented. Julia can and does separate different purposes into more specific methods, e.g. * for string concatenation, matrix multiplication, general multiplication, etc., but that could be justified by precedent in other contexts, and I don’t necessarily agree with any and all method specialization choices in Julia either.

Ultimately, these things are usually documented well, and users have a responsibility to properly learn things, not expect everything to be intuitive. We should try to make things reasonably intuitive, but intuition is fundamentally something not everyone agrees on and may not be good practice. x is even is intuitive syntax. GOTOs are intuitive control flow. We’ve mostly moved on from both English-like programming languages (not counting genAI prompts) and GOTOs for many good reasons.

I keep seeing traits/interfaces get mentioned as a handwaved solution to unenforced properties, and they’re just not that powerful. They can’t guarantee those properties at runtime because it’s trivial for a developer to declare an interface then implement it incorrectly; an Add interface could guarantee that a + method exists, but I could write throw("not implemented yet") as a stand-in to interactively try out other methods and completely forget about it. Interfaces do help communicate the presumption of such properties to programmers that implement them and to the program that dispatches to certain methods, and docstrings do the former. Even if it’s not intuitive, a clearly stated property being implemented incorrectly or ignored is ultimately the fault of the programmer, the same way using in as a general subsequence test in Python is the fault of the user.
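
To make that concrete (Addable is hypothetical):

# Declaring an interface can't enforce its semantics.
abstract type Addable end    # hypothetical "must support +" interface
struct Stub <: Addable end
# Satisfies "a + method exists" while breaking the contract entirely:
Base.:+(::Stub, ::Stub) = throw("not implemented yet")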

To address + commutativity, there’s probably nothing that can be done besides proper implementation. If we had different input types, then maybe some sort of multiple-dispatch version of an interface could pair every +(x::T1, y::T2) with a +(y::T2, x::T1) = x+y. But we usually implement addition for the same types and use promotion to reach those, and there’s nothing stopping us from implementing non-commutative addition and just declaring Commutativity(::typeof(+), ::T1, ::T1) = Commutative().
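
For illustration, the hand-written version of that pairing, with hypothetical unit types:

struct Meters; val::Float64; end
struct Feet; val::Float64; end

Base.:+(a::Meters, b::Feet) = Meters(a.val + 0.3048 * b.val)
Base.:+(a::Feet, b::Meters) = b + a    # the paired +(y::T2, x::T1) = x + y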

My very personal opinion is that I’m reading an already very chunky manual to learn how to use the language, and any historical explanation or apologia is ideally limited to a couple of sentences. I’d actually be fine with that sentence; it’s the following paragraphs that seem like math-major filler. And as we’ve already demonstrated, it’s easy to just argue with more reasons. I’d rather see that one throwaway sentence in the FAQ or “Differences from other languages”, or maybe just use + or a plain function name so it’s not even a question in my mind.

Bit of an aside, but something similar is needing an entire section to justify the existence of hard/soft scope. It adequately explains why it turned out this way, but I think it would be a lot better for newcomers to just bite the bullet and accept writing global in for loops because not polluting the global scope is a good thing, especially in a language that grants us the convenience of splitting global scopes into several files.

3 Likes

The key thing about InterfaceSpecs.jl is that commutativity is not just declared but proved:

struct Commutativity{T, Op}; end
function (::Commutativity{T, Op})() where {T, Op}
    forall(NTuple{2, T}) do (a, b)
        forall(Op) do op
            op(a, b) == op(b, a)
        end
    end
end

julia> exists(Fact{Commutativity{Int, +}}) # Request the system to attempt to prove commutativity of +(::Int, ::Int)
#= Fact object with optional proof or error =#

It’s something you use at test time rather than run time but still.

2 Likes

I was distinguishing compile time vs run time, so test time would lean towards run time, but then again, compile time is relatively less distinct from run time for Julia, and we don’t even compile everything that is defined. I’m sure I mean something reasonable, but I don’t know the word. Static?

It’s worth pointing out that exhaustive proof (related to formal verification?) is not possible or feasible for many types. It’s possible to test a critical subset of a type that is proven to generalize to all other instances, but that’s not easy to determine. It’s also possible to design the test poorly e.g. misimplementing ==(::T, ::T). It’s not always even necessary; we can’t exhaustively test ==(::Vector{T}, ::Vector{T}) for every instance, but static analysis could successfully simplify the situation to an exhaustive test of ==(::T, ::T), which could be feasible. Among other reasons, testing properties isn’t often a builtin part of interfaces, unless we count the existence of method signatures as properties.
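
For a sense of scale, a quick Base-only check:

# Feasible: exhaustively verify commutativity of + over all Int8 pairs (2^16 cases).
all(a + b == b + a for a in typemin(Int8):typemax(Int8),
                       b in typemin(Int8):typemax(Int8))    # true
# Infeasible: the same for Int64 pairs (2^128 cases), let alone Vector{Int64}.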

1 Like

In Julia, operators are functions, and the same rules apply. That is, if you want to write generic code using some operators, they should follow a consistent interface within the class of types you are considering.

This means that using * as a product on numbers, matrices, and friends is a good design choice. This does not mean that you cannot use * for other stuff, like string concatenation. You absolutely can, if either

  1. you are pretty sure that you will never be writing a method that is expected to work in both type spaces, or

  2. they follow the same abstraction, so it does not matter.

For strings, I think that the reality is (1) for most people, and they have a beef with the monoid story because it tries to make it sound like (2) applies. I think that Julia code where * (or prod) etc could apply to either strings or numbers is highly unusual.
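
The one place the two type spaces do meet is generic reductions like prod, which quietly handles both:

prod([2, 3, 4])          # 24 (multiplication)
prod(["ju", "li", "a"])  # "julia" (concatenation, through the same generic code)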

Is this actually working code, or a desired outcome? I could not even get InterfaceSpecs.jl (https://github.com/Keno/InterfaceSpecs.jl) to load on 1.10 or 1.11, so I could not try.

3 Likes

It’s worth mentioning that the developers of a function, including operators, get to decide what the purpose of that function is, and they alone can add different purposes for specific types without violating their design. getindex was mentioned to weirdly instantiate arrays by concatenating input values, and while that is a very unfitting name, it is documented. Types aren’t supposed to be containers with indices, so I guess there’s no risk of the two very different purposes overlapping. On the other hand, if I were to extend Base.getindex with a method that implemented neither purpose (maybe I’m even concatenating custom strings), I’d be bucking the design in a very risky way that approaches type piracy. It’d be proper for me to just define my own function MyModule.getindex.

This does illustrate the risk of making functions have multiple purposes: their callers should make the distinction e.g. sum(list("python"), start="") errors in Python and instructs users to call "".join(list("python")) instead. prod is documented in Julia for products/multiplication, and concatenation isn’t really considered either of those things. prod(split("julia", "")) also only works because one(::AbstractString) is implemented, and one is only documented as the identity element for multiplication, not concatenation. It appears as if someone thinks concatenation is multiplication, or that they just plain forgot to mention it in functions that use *. Granted, it may be difficult to make this distinction across *-callers, which would’ve been a point in favor of a different function, which we do have in Base.string. At least one person in this thread was on Team Named Functions.
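
Both behaviors are easy to check:

julia> one("anything")    # documented only as the multiplicative identity
""

julia> prod(String[])     # the empty product falls back to one(String)
""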

Yes, I know that is the case in Julia now. I’m arguing lightly here, since we are in silly speculation and theory land :slightly_smiling_face:, that it doesn’t really make sense to treat operators that way. As I showed, Julia itself repurposes a syntax to do wildly different things.

My argument is that operators aren’t a good choice for code that must be fully generic, because symbols as humans use them are often highly contextual in their meaning, and that could be, or could have been, designed for.

1 Like

I think that applies to any function. If it’s strange that prod uses * even when it means concatenation instead of multiplication, then it’s just as strange if I make an alias const multiply = * and use it in a function product that doesn’t disallow concatenation either.

If you’re referring to prefixed brackets A[i], then that practice predates Julia. In C, T A[size]; declares an uninitialized, inline, and statically sized array, and it’s indexed with A[i] for either getting or setting. Still, it’d be simpler to properly describe caller functions if one syntax or name had one purpose.

It’s idiomatic to write A[i] instead of getindex(A, i) for the purpose of indexing, so lowering* to prefixed_bracket(A, i...) = A isa Type ? typed_vect(A, i...) : getindex(A, i...) wouldn’t actually help narrow down the purpose. For example, getindex3(A) = A[3] would still do the wrong thing for A::Type; the saving grace is that other indexing functions like eachindex would fail for A::Type. The only way to syntactically distinguish purposes would be the unidiomatic usage of typed_vect and getindex. Using the same syntax for ElementType[elements...] and Indexable[indices...] is the irreversible cause of syntactic ambiguity in v1, so the only solution is significantly different syntax in v2. A possibility is reintroducing curly braces for what would become a type parameter anyway {ElementType}[elements...].
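
A runnable version of that sketch (prefixed_bracket and typed_vect are both made up; typed_vect just reuses Base’s getindex(::Type, vals...)):

typed_vect(::Type{T}, vals...) where {T} = T[vals...]
prefixed_bracket(A, i...) = A isa Type ? typed_vect(A, i...) : getindex(A, i...)

prefixed_bracket(Int, 1, 2)         # 2-element Vector{Int64}: [1, 2]
prefixed_bracket([10, 20, 30], 2)   # 20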

*Also worth pointing out that this branch is complicated by the several ways bracket syntax actually lowers for the sake of array literals. At the moment (and hopefully forever), indexing is restricted to comma-delimited sequences to reach getindex, and typed_hcat(::AbstractArray,...) is actually implemented to throw an ArgumentError for people who unintentionally reach it with parser-unfriendly spacing like A[offset +2]. Again, {ElementType}[elements...] could’ve done all these separately from Indexable[indices...], which could strictly lower to getindex.

julia> Meta.@lower A[1,2]
:($(Expr(:thunk, CodeInfo(
    @ none within `top-level scope`
1 ─ %1 = A
│   %2 = Base.getindex(%1, 1, 2)
└──      return %2
))))

julia> Meta.@lower A[1 2]
:($(Expr(:thunk, CodeInfo(
    @ none within `top-level scope`
1 ─ %1 = A
│   %2 = Base.typed_hcat(%1, 1, 2)
└──      return %2
))))

julia> Meta.@lower A[1
                     2]
:($(Expr(:thunk, CodeInfo(
    @ none within `top-level scope`
1 ─ %1 = A
│   %2 = Base.typed_vcat(%1, 1, 2)
└──      return %2
))))

julia> Meta.@lower A[1 0
                     0 2]
:($(Expr(:thunk, CodeInfo(
    @ none within `top-level scope`
1 ─ %1 = A
│   %2 = Core.tuple(2, 2)
│   %3 = Base.typed_hvcat(%1, %2, 1, 0, 0, 2)
└──      return %3
))))

In trying to answer your question, I realized there was imprecision in my thinking. To say things more correctly:

The same reason that string concatenation and matrix multiplication overloading the same function doesn’t cause problems is the extremely low probability that string types and AbstractArray{<:Number} go through the same code paths.

Likewise, substrings and strings are unlikely to flow through the same code paths as elements and containers. So using in for both kinds of containment check would introduce an extremely low chance of generic code hitting issues of unexpected behavior.

This is exactly the beauty of multiple dispatch to me.

in(y, xs) = any(==(y), xs)

in(y::AbstractString, xs::AbstractString) = any(==(y), substrings(xs))

Now, is “extremely low” good enough? Maybe not. But, to my eye, it would then follow that we should not have any infix operator for string concatenation since this is not actually the same operation as *.

2 Likes

Yes, it is. Especially for Julia, which is a practical programming language, not an exercise in theoretical purity.

2 Likes

I anticipate a diversity of opinion on whether or not theoretical inconsistency for the sake of intuitive (for some definition of intuitive) syntax is a practical choice :man_shrugging:

Folk theorem: you can strive for consistency, but beyond a certain level of complexity, it is going to be so costly as to be impractical.

Consider string concatenation. Julia could have

  1. gone for +, which, as explained many times, is not necessarily more consistent than * from a theoretical perspective,
  2. have a completely separate operator for string concatenation, which just does that; the problem is that there are very few basic ASCII options, and you would have to sacrifice one, or make a unicode operator which is harder to type,
  3. have a general concatenation operator, eg for iterables which satisfy some kind of compatibility criterion, same issue arises with naming it,
  4. not have an operator for string concatenation whatsoever, just a function,
  5. the same as (4), but for the general concatenation in (3).

I probably missed something. But none of these is perfect either, and for each one there would be people who have an issue with it.

FWIW, I think that in these kind of situations it is best to just pick one and move on.

There are no perfect languages, just good ones.

8 Likes

In my view, the reason it’s valuable for f(_) to have a generic interface is so that when you write or read f(x) you can infer the properties of the computation that are relevant for what you’re doing. The low collision risk is a positive side effect of having semantics, not itself the reason that having semantics is good.

I think of programs as implicitly passing proofs around. ys = sort(xs) implicitly produces a proof that ys has the same multiset of values as xs and that ys is nondecreasing with respect to its indices. That proof is not checked computationally but in the author’s head, and the caller of sort(xs) relies on sort(_)'s author to have a good proof.
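
Those implicit obligations can at least be spot-checked; a quick sketch with the Test stdlib:

using Test
xs = rand(Int, 100)
ys = sort(xs)
@test issorted(ys)                                           # nondecreasing in index
@test all(count(==(v), ys) == count(==(v), xs) for v in xs)  # same multiset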

In Julia, types are dynamic objects and we don’t generally have static proofs about what the typeof of an object is. Proofs that depend on the typeof of objects would generally be too complex and onerous for humans to manage—see the lengthy example below.

A lengthy demonstration:

If f(x::T) and f(y::U) have inconsistent contracts, then in order to know what f(x::T) does you need to know both (1) that x has type T and (2) the contract for f(::T). That’s fine, you might say, I do know that x has type T, so it’s obvious that I’m going to get the f(::T) behavior.

Unfortunately now we have eliminated generic programming: f(_) is undefined. What if I want to get the f(::T) behavior but on a value z::Z, where Z is not a subtype of T? We can’t get that, because that behavior is only defined for values of type T, and we have a value of type Z.

What we want of course is an unrestricted signature: f(_) that has the good behavior. So we define that. Now we can get our desired behavior by calling f(z), and indeed we can get that behavior on any value just by calling f(_), unless our value has type U, since f(::U) is defined inconsistently with respect to f(_).

Now suppose we’re writing a library function g(_). g(_) wants to use the f(_) behavior. Specifically, g(z) calls f(z), and relies on the property Pf that f(z) produces for generic non-U-type values of z. What are the responsibilities of g(_) for producing its own contract properties Pg? In order to be sure that g(z) satisfies Pg, it needs to be sure that f(z) satisfies Pf. But the caller of g(_) has no idea that g(_) requires Pf which in turn requires that f(_) take a non-U argument.

g can't just exclude U.

So for g(_) to ensure its own contract Pg, it has to communicate to its caller that g(_) must not be called with a U-type argument. Otherwise it can’t count on f(z) to produce the necessary correctness condition Pf.

That’s not ideal, but it’s manageable—U is just one type, and all other types will produce the necessary condition and ensure g’s correctness. So all g has to do is put in its own documentation that it shouldn’t be called with arguments of type U. The potential caller of g(_) will see in the documentation that they shouldn’t call it with a U-typed value.

Now along comes someone else who defines a new type H and customizes f(::H) to another meaning inconsistent with f(_). Now g(h) has no documentation indicating this will be a problem, since h doesn’t have type U, so the caller of g(h) expects the property Pg. But Pg depends on Pf, and Pf is only produced by the generic f(_) contract of f, which f(::H) is inconsistent with. Since f(h) doesn’t produce its correctness property Pf, g(h) doesn’t produce its correctness property Pg, and g(h) cannot satisfy the generic contract g(_), even though g(_) only documented that it requires non-U arguments—what g(_) actually requires is arguments for which f(_) produces Pf, not just non-U arguments.

In order for g(_) to ensure its own correctness property Pg, it needs to ensure its internal f(_) call produces Pf, like the generic f(_) does. There is no test that g(z) can use to determine whether f(z) is in that category or not. g(z) cannot test that statically with a type check, because the set of types on which f(_) produces Pf can shrink at run time if someone defines a new inconsistent f(::H). g(z) cannot test for Pf at run time because f(_)'s generic contract of producing Pf is only documented, not encoded computationally.

So what can g(_) do to ensure it produces Pg? All it can do is document that its callers should only call it with values z for which f(z) produces the property Pf.

But this problem is recursive: now functions in general have to document their callees’ contracts, or they can’t ensure their own correctness. If q(z) calls g(z), then q(z) also has to document that f(z) must produce Pf.

So in my view the issue with inconsistent overloads is not about the probability of a collision—it’s about what it takes to make a correctness argument. Unless generic functions have meanings, it’s too hard to say what a program means, let alone that it does it correctly.

Instead of depending on the specific types of our objects, then, Julia correctness arguments depend on the (sometimes imagined, but ideally shared) generic specifications of the functions.

In 2020, there was a discussion about adding, to the existing merge(dict1, dict2, dict3, ...), another signature merge(::Function, dict1, dict2, dict3, ...) that would use the function argument to merge the dicts into a single dict. Julia came to what is, in my view, the right decision: not to do that, but instead to introduce a new function mergewith(f, dicts...), so that each function’s semantics is invariant to the types of its arguments.
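
Both functions exist in Base today:

d1 = Dict("a" => 1, "b" => 2)
d2 = Dict("b" => 10, "c" => 3)

merge(d1, d2)          # Dict("a" => 1, "b" => 10, "c" => 3): right-most value wins
mergewith(+, d1, d2)   # Dict("a" => 1, "b" => 12, "c" => 3): values combined with +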

@jeff.bezanson explained his reasoning:

With multiple dispatch, just as the meaning of a function is important to get consistent, it’s also important to get the meanings of the argument slots consistent when possible. It’s not ideal for a function to treat an argument in a fundamentally different way based on its type: e.g. “if this is a Function, then call it on pairs of values, otherwise assume it’s a container providing the values to use”. Given the meaning of merge, that’s not something it would naturally do when passed a function (it would probably be an error, which is a very useful default for catching bugs).

Bingo. For me, the point of multiple dispatch is so that each type can implement the same contract specification in a way specialized to that type. It is not so that a single function can implement different contracts for different types. It’s far better to use a different function—it can even use the same name (symbol), as long as it’s in a different namespace. That way each function f(_) can keep a single meaning. That meaning, or contract, can be specialized to add more guarantees, more performance, or a custom implementation for specific argument types (fancy combinations of arrays, say)—but never to take away guarantees. This way we can understand the meaning and correctness arguments of our programs.

5 Likes

That would be very nice! I’m actually surprised Julia still doesn’t have a concatenation operator/function, whatever it’s called.

3 Likes