Explicit denotation of variability and immutability

Please don’t take this the wrong way, but being unable to implement a fundamental change to a language is usually a good indicator that you may need more experience as a programmer before redesigning said programming language.

A necessary (but not sufficient) condition for devs to entertain such a fundamental change would be a pull request implementing it. This would make the author work through the parser, then the 100KLOC of Base, and experience the costs firsthand. This is much more convincing than arguing in the abstract that the extra syntax burden imposed on other programmers is negligible or worthwhile.


Thanks. Very good point and fair.

However, I believe abstract reflection/argument about the fundamentals of Julia should be promoted, not quelled.
IMO, some comments here only push for a conformist approach of get-used-to-status-quo or shut-up-and-code. There is always a tendency among core developers to stick to their own laid-down “traditions”, and that is certainly understandable. Yet sometimes there is a need for a change, like what was done about user-defined types (the keyword also changed to struct), and such a change is always difficult due to the inherent inertia (e.g., in 100KLOC of Base).

Luckily, so far, this proposal has obtained some attention. I hope it will catch the eyes of a developer who could help me produce a proper pull request.


FWIW (not a core dev), I think the way julia handles this is extremely clear and explicit, and I don’t understand your problem at all; I would dislike your proposal even if there were a blank slate to start over.

Mutable means object on the heap; always passed by value-of-pointer; obviously modifications are visible. Immutable means [*] object on the stack / in registers, passed by value, i.e. a logical slot for so and so many bytes of local state (where obviously mutable counts as a pointer aka 8 bytes). Assignment always changes which slot (SSA value) your name is bound to; a.z=b means setfield!, which is a built-in function call for modifying memory in the heap via a mutable (aka pointer), as is setindex! aka x[i]=b. This is all the mutation that exists in julia, and these functions can be called on mutable objects only [**].

The compiler re-uses local memory for SSA values that it can prove will never be used again (e.g. because all bindings to them were re-bound or went out of scope) or sometimes does not need a stack slot at all (use a register). The gc cleans heap objects that it can prove will never be used again (e.g. because no tracked pointer to them is still living in a living stack slot, nor in any heap object).
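To make the distinction between rebinding and mutation concrete, here is a minimal sketch. The type names Box and Frozen are hypothetical and exist only for illustration; the behaviour shown is what the language semantics guarantee, regardless of where the compiler actually places the objects.

```julia
# Hypothetical types, used only for illustration.
mutable struct Box
    z::Int
end

struct Frozen
    z::Int
end

a = Box(1)
alias = a            # a second name bound to the same object
a.z = 2              # mutation: the change is visible through every reference
@assert alias.z == 2

a = Box(10)          # rebinding: only changes what the name `a` refers to
@assert alias.z == 2 # the old object is untouched

v = [1, 2, 3]
v[1] = 99            # indexing assignment is a call to setindex!(v, 99, 1)
setindex!(v, 7, 2)   # the same operation written as an explicit call
@assert v == [99, 7, 3]

fz = Frozen(1)
mutation_rejected = try
    setfield!(fz, :z, 2)   # fields of an immutable struct cannot be set
    false
catch
    true
end
@assert mutation_rejected
```

Note that a plain `a = ...` never touches the object that `a` used to name; only field and index assignment do.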

[*] Due to compiler limitations many immutables are still heap-allocated. Also I only understand the julia-native ABI up to the rather short documentation and a little experimentation, not meticulous source-code reading (and I may have failed at RTFM on the way!). Also, it might be that 0.7 sometimes manages to put mutables (aka syntactic sugar for Ref{immutable_struct}) on the stack.

I think the native ABI is not intended as a spec; that is, native-ABI breaking changes are not considered breaking, and the goal is “every julia session produces machine-code that is interoperable within the same session”; people are just not supposed to code against the native ABI. (If my read on this is wrong, I’d be happy to be corrected on this! Also, this is not a critique; this is a perfectly sensible position to take for an internal-only ABI.)

[**] Of course you can define dispatches for immutables as well, like e.g. Base.setindex!(ptr::Ptr{T}, val::T)=unsafe_store!(ptr, val), and in 0.7 you could define Base.setfield! for pointers to structs.
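A sketch of what such a dispatch on an immutable can look like. The one-liner above adds a method for Ptr itself; to avoid adding methods to Base functions for Base types, this sketch instead wraps the pointer in a hypothetical immutable struct called RawRef (an assumption made for illustration, not an existing type).

```julia
# Hypothetical immutable wrapper around a raw pointer.
struct RawRef{T}
    ptr::Ptr{T}
end

# r[] reads through the pointer, r[] = val writes through it.
Base.getindex(r::RawRef) = unsafe_load(r.ptr)
function Base.setindex!(r::RawRef{T}, val) where {T}
    unsafe_store!(r.ptr, convert(T, val))
    return r
end

buf = [0, 0]
GC.@preserve buf begin          # keep buf alive while raw pointers are in use
    r = RawRef(pointer(buf))
    r[] = 42                    # lowers to setindex!(r, 42)
    @assert r[] == 42
end
@assert buf == [42, 0]
```

The wrapper is immutable, yet `r[] = 42` changes memory: the mutation happens to whatever the pointer refers to, not to the wrapper value.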

As far as I understood, it is a deliberate decision by the core devs to not provide such a convenient syntax for pointers, in order to discourage their use-- part of their job is not just developing a language but also nudging towards a certain “idiomatic” coding style in the ecosystem (obviously I personally disagree, I’d prefer more convenient pointers; will maybe find (or write) a little package once I switch to 0.7).

I agree with them in that, as long as the functionality is still available (in a performant way).

I ended up making a set of functions for my packages, to access strings and iterate over strings without overhead, get_codeunit and set_codeunit!, that take either a pointer, vector or string, and optionally an offset (1-based, 'cause Julia just prefers things that way :grinning:). That way, I was also able to extend it to handle efficiently “wrapped” pointer types, that indicate if code unit pointed to is possibly unaligned, or byte swapped, so I can directly handle things like UTF-16BE encoding on a little-endian system, without having to allocate a buffer and byteswap the whole thing first.
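A rough sketch of how accessors along these lines could be written. Only the names get_codeunit and set_codeunit! come from the post; the argument order, the default offset, and the SwappedPtr wrapper are assumptions for illustration and are not the package's actual API.

```julia
# Assumed signatures: (container, 1-based offset). Not the package's real API.
get_codeunit(v::AbstractVector, i::Integer = 1) = v[i]
get_codeunit(s::AbstractString, i::Integer = 1) = codeunit(s, i)
get_codeunit(p::Ptr, i::Integer = 1) = unsafe_load(p, i)

set_codeunit!(v::AbstractVector, x, i::Integer = 1) = (v[i] = x; v)
set_codeunit!(p::Ptr, x, i::Integer = 1) = (unsafe_store!(p, x, i); p)

# Hypothetical wrapper marking the pointed-to code units as byte-swapped.
struct SwappedPtr{T}
    ptr::Ptr{T}
end
get_codeunit(p::SwappedPtr, i::Integer = 1) = bswap(unsafe_load(p.ptr, i))

@assert get_codeunit("abc", 2) == 0x62

bytes = UInt8[0x61, 0x62]
set_codeunit!(bytes, 0x7a, 2)
@assert bytes == UInt8[0x61, 0x7a]

units = UInt16[0x4100]   # U+0041 ('A') with its two bytes swapped
GC.@preserve units begin
    @assert get_codeunit(pointer(units), 1) == 0x4100
    @assert get_codeunit(SwappedPtr(pointer(units)), 1) == 0x0041
end
```

Because the swap happens per code unit at load time, no temporary buffer is needed for the whole string.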

I think this point was mentioned also by @StefanKarpinski here. This is where we clearly disagree.

As I said, I am pushing for a syntax which explicitly denotes the programmer’s intention for mutation of bindings/values in any scope, and reserving = for definition/equality as in mathematical notation. My proposal could be deemed simply a generalization of the current convention of adding a ! to the names of the functions which mutate their arguments, or of the decision to make immutable structs the default: no attempt to change the underlying principles of Julia, but generalizing them. What happens under the hood is a secondary issue here, though your explanation is indeed appreciated.

It’s not just a generalization though. Just a generalization would imply everything is the same, but now there’s some extra. This imposes a syntactic constraint on existing use cases, so current codes could fail (and in fact, many generic codes would be almost unfixable without dispatch on mutability).


Again an important point; thanks for the note.
Could you give a simple example of unfixable code?

Note that I am happy with warnings, not to break previous code, so that we just promote a better explicit style.

There have been examples in this thread of things which would need dispatch on mutability.

If you mean this, then let me first rewrite the proposed code with more explicit names:

struct imutType{T}
    x::T
end

mutable struct mutType{T}
    x::T
end

# pure g
g(x::imutType) = imutType(3x.x)

# impure g
function g(x::mutType)
    x.x *= 3
    x
end

# f can be pure or impure depending on the type of x
function f(x)
    g(x)
end


imut1 = imutType(9.2)
# imutType{Float64}(9.2)

mut1 = mutType(9.2)
# mutType{Float64}(9.2)

f(imut1)
# imutType{Float64}(27.599999999999998)

f(mut1)
# mutType{Float64}(27.599999999999998)

println(imut1)
# imutType{Float64}(9.2)

println(mut1)
# mutType{Float64}(27.599999999999998)

This is a bad style of coding, imo, since reasoning about the behaviour of f is rather difficult, as I said, especially when g(x) is buried in a long body of f. For instance, should we add ! to the name of f or not, according to the current convention? Honestly, I cannot see the purpose of such code.

In any case, according to my proposal, the previous code produces warnings (e.g., deprecation warnings), and the recommended definitions for f and g are

# pure g
g(x::imutType) = imutType(3x.x)

# impure g
function g(mut x::mutType)
    x.x *= 3
    x
end

# f can be pure or impure depending on the type of
# x with which it is called; so, in general, it can
# mutate x and therefore, it is impure
function f(mut x)
    g(x)
end

So, it becomes explicit that one intends to mutate x in some cases.
Note that warnings are enough to promote a better style for the future. No previous code will break in this way.
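For comparison, here is a sketch of how the same intent can be spelled out with today's syntax, using only the existing ! naming convention. The names ImmutVal, MutVal, triple and triple! are hypothetical; this is one possible arrangement, not a recommendation from the thread.

```julia
struct ImmutVal{T}
    x::T
end

mutable struct MutVal{T}
    x::T
end

# Non-mutating version: always returns a fresh value, for either type.
triple(v::ImmutVal) = ImmutVal(3 * v.x)
triple(v::MutVal) = MutVal(3 * v.x)

# Mutating version: the ! in the name signals that the argument is changed.
function triple!(v::MutVal)
    v.x *= 3
    return v
end

m = MutVal(2)
fresh = triple(m)
@assert m.x == 2 && fresh.x == 6   # m was left alone
triple!(m)
@assert m.x == 6                   # m was changed in place
@assert triple(ImmutVal(2)).x == 6
```

With this split, a generic caller chooses between triple and triple! by name, so whether a call mutates its argument no longer depends on the argument's type.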

I think you’re failing to gain as much traction here as you expected because mutable is already explicitly denoted everywhere (With a type declared as mutable, all uses of that type are mutable. With a type not declared mutable, uses must be wrapped in a mutable container. As Stefan mentioned initially above, Ref with #11902 / #21912 would be a straightforward way to do this).
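A small sketch of the "wrap it in a mutable container" route using Ref as it exists today (independent of the linked PRs). The Point type is hypothetical.

```julia
# Hypothetical immutable type.
struct Point
    x::Float64
    y::Float64
end

cell = Ref(Point(1.0, 2.0))   # a mutable container holding an immutable value
shared = cell                 # another reference to the same container

# "Changing" the point means storing a new immutable value in the container.
cell[] = Point(cell[].x, 5.0)

@assert shared[].y == 5.0              # visible through every reference
@assert cell[] === Point(1.0, 5.0)     # the stored value is still immutable
```

The mutability lives in the Ref, which is visible at the point of use through the `[]` syntax.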

But you’re also mixing a lot of different concepts here. And importantly, these are all independent choices.

  • const bindings (e.g. enforce that each variable appears at most once on the left-hand side)
  • subtype relation between the mutable and immutable version of a type
  • pass-by-binding calling convention / first-class bindings
  • distinction between mutation of a struct vs “contents” (e.g. Set vs. ImmutableSet)
  • syntax for above

(not intended as a complete list, but I hope I managed to summarize the highlights of the OP)

  • const declaration: this is simply a syntactic property of a binding. You’re welcome to write a syntax error for this (it would live in src/julia-syntax.scm); currently it’s reserved syntax for this purpose. We have no intention of making this the default, as it does not seem to be a common mistake, but can greatly increase the inconvenience of the language (the cost / benefit ratio does not appear favorable). We don’t have this now primarily because nobody has made a PR to implement it.
  • subtyping: there’s no clear direction here. There’s fewer restrictions on mutable, so that would imply immutable <: mutable, but anywhere that’s valid to pass an immutable, it’s also valid to pass a mutable, so that would imply mutable <: immutable. This is actually precisely the reason that Julia does not have a covariant type system: S{T} is simply disjoint from S{Any}.
  • pass-by-binding: Julia does not have first-class bindings (e.g. & in C++). Bindings are different than values in Julia, and will remain so. Proposals to change this, or to confuse the two, will get nowhere.
  • mutation of a type vs. the struct: In some cases, the value of a and the contents of a are indistinguishable concepts (perhaps Complex, for example). However, more commonly, you might have an immutable wrapper around a mutable object. Thus, the contents of many types are mutable, while the type itself is immutable. Indeed, the implementation of the immutable type might also look very different from the implementation of the mutable version of that type (consider perhaps Set, File, IOBuffer, Adjoint, for some examples). Thus it is often simply not useful to have syntax for generically declaring a wrapper as being mutable-or-not, since it would still fail to express the intended meaning (“can’t be changed”). Indeed, that meaning (what “can’t be changed”?) itself may not be well-defined (anti-dwim: “don’t do what I don’t mean”).
  • syntax: adding syntax is a trade-off between making code require more/less key presses to type and making the code harder/easier to read and more/less obvious to an inexperienced user. But it’s typically not a simple relation in either direction. I think the optimal strategy is something along the lines of “use the minimal amount of operators and syntax, but not less”.
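Two of the points in the list above can be checked directly. This sketch shows the invariance of parametric types and an immutable wrapper whose contents are still mutable; the Inventory type is hypothetical.

```julia
# Parametric types are invariant: Int <: Any does not carry over to Vector.
@assert Int <: Any
@assert !(Vector{Int} <: Vector{Any})
@assert Vector{Int} <: Vector{<:Any}   # an explicit bound is needed instead

# Hypothetical immutable wrapper around a mutable object.
struct Inventory
    items::Vector{Int}
end

stock = Inventory([1, 2])
push!(stock.items, 3)                  # the contents can still change
@assert stock.items == [1, 2, 3]

field_rebind_rejected = try
    stock.items = [9]                  # but the field itself cannot be rebound
    false
catch
    true
end
@assert field_rebind_rejected
```

So "Inventory is immutable" only says that the field cannot point at a different vector; it says nothing about whether the vector's elements change, which is why a generic "can't be changed" marker is hard to define.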

Mutable/immutable has no such meaning. Those observations are merely examples of the possible results of applying various optimizations. The words themselves just mean exactly what they say: “mutable” objects can be changed (mutated), and those changes can be observed through any other reference to them; “immutable” objects cannot be changed, and thus any copy is indistinguishable from any other copy.
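The semantic definition can be observed with ===, which tests whether two values are programmatically distinguishable. A minimal sketch with hypothetical types Value and Cell:

```julia
struct Value          # immutable
    n::Int
end

mutable struct Cell   # mutable
    n::Int
end

# Two immutable values with the same contents cannot be told apart.
@assert Value(1) === Value(1)

# Two mutable objects with the same contents are still distinct objects.
@assert Cell(1) !== Cell(1)

# A change made through one reference is observed through the other.
c = Cell(1)
other = c
other.n = 5
@assert c.n == 5
@assert c === other
```

Nothing here depends on heap or stack placement; those are optimizations the compiler may apply because of these semantics.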


It would generally be even more useless to our compiler than the const declaration is in C / C++ (which, awkwardly, can dispatch on it, even while it can’t optimize based on it).

Yes, the use-case would be obvious, but the definition (and usage) of “does not alias” may not be so simple (ref RFC: Macro for expression noalias hints · Issue #19658 · JuliaLang/julia · GitHub and https://github.com/JuliaLang/julia/pull/25890).

It seems like this might require defining and preserving “scope” in Julia at a much lower level. I agree that it would be nice to improve the compiler in this regard. I don’t think we want to be disabling gc-safety though, unless there’s a very strong case for it.

That’s always true (and already made use of by the compiler), so unnecessary to annotate.

I’d like that also!

It appears that you mentally view julia as a high-level interface to a “julia virtual machine”, whereas I (not coming from the programming languages community) always look at it as a high-level interface to x86 (to be honest; still working at better reading llvm; and I have no clue about GPU internals).

Your viewpoint appears useful; can you give me tips where to look in order to better understand the “julia virtual machine”? Is this the viewpoint exactly reproduced by @code_lowered? Is there a language spec for this? In the sense: I would not care about java language specs, I’d care about JVM specs and try to think in bytecode.
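For readers who have not used it, this is roughly how @code_lowered is invoked. The Slot type and store! function are hypothetical, and the printed form differs between Julia versions, so no output is reproduced here.

```julia
using InteractiveUtils   # provides @code_lowered outside the REPL

# Hypothetical type and function, used only to have something to inspect.
mutable struct Slot
    z::Int
end

store!(a, b) = (a.z = b; a)

# The lowered form of the method that this call would dispatch to.
lowered = @code_lowered store!(Slot(1), 2)
println(lowered)
```

On Julia 0.7 and later the field assignment typically shows up as an explicit call (to setproperty!, which by default forwards to setfield!), which matches the "assignment to a field is a function call" reading given earlier in the thread.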

Or is this something so informal, implementation-dependent and fast-changing that it makes no sense to learn it, unless I want to contribute to the compiler?


Thanks for the comprehensive answer.
Does your critique apply also to my Codicil or just the original post?
I have tried to correct some of my misunderstandings according to the earlier comments by Stefan.

I was thinking about marking the argument as readonly, as suggested in https://llvm.org/docs/LangRef.html#id840; alas, https://llvm.org/docs/LangRef.html#id834 suggests that this feature currently doesn’t exist and either I or the llvm doc is confused. I was never suggesting to dispatch on it (it’s an attribute on an argument set by the callee). Regardless, either it doesn’t work or you know better anyway :wink:

Simplest case: Function boundary? I ask julia to give me a non-bitstype, but I promise that it will be unreachable by the time I return (which also means that I cannot return it)? I mark a non-bitstype and promise to not leak additional references that survive my return? And yes, if I pass it down to a function I call then I better be sure that this one doesn’t store a reference at some place that survives my return (not sure whether this is compatible with the generational gc; does it get grumpy if it finds bad memory that is reachable from an old object that is semantically-unreachable but not reclaimed yet?).

For cases where the alloc or indirection (for a small composite holding a reference to a mutable) is too expensive, the current alternatives are not nice: Either pointers or a really ugly API and code. You could rightfully argue that bad gc-safety should be ugly, though, and an explicit pointer or convoluted memory-safe code is better than some easily overlooked @no_escape; on the other hand, @no_escape has a very attractive upgrade path: simply remove it once the escape analysis improves or something like the all-immutables-on-stack project attempted by carnaval lands.

Thanks for the link; at some point of time I need to get better at finding issues in github.

I found an article related to my proposal:
A. Bauer, “On programming language design” (2009), on the Mathematics and Computation blog.

Especially relevant is the section, “Confusing definitions and variables”.

Very interesting blog, I agree with a lot of what he said, however, as far as Julia is concerned,
I don’t think there is an issue with definitions vs. variables.
const foo = value is a definition, IIUC.
Yes, the default is for a binding not to be a definition (const), but I don’t think that’s a big issue, and for a language with a REPL, it makes more sense.
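A minimal illustration of the difference at global scope. The names LIMIT and counter are arbitrary.

```julia
const LIMIT = 10          # a definition: the binding is declared constant
counter = 0               # an ordinary variable: the binding may be reassigned
counter = counter + 1

@assert isconst(@__MODULE__, :LIMIT)
@assert !isconst(@__MODULE__, :counter)
@assert counter == 1
```

At the REPL the non-const default is what lets you redefine things freely while experimenting.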

I really think that if you like most of Julia except for these issues, your best bet would be to use something like JuliaParser.jl as a base to make your own XXXParser.jl, which parses the syntax you want and produces Exprs.
If other people decide that they prefer your approach, then they could use that instead of the standard Julia syntax.
There are a number of languages that work in a similar fashion, for example by transpiling from CoffeeScript or TypeScript to JavaScript, and there is a Clojure syntax package already for Julia.
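As a very small sketch of the "produce Exprs from your own syntax" idea, without writing a parser at all: the standard parser already accepts := as an operator with no built-in meaning, so an experimental front end can rewrite it. Treating := as "rebind an existing name" is an assumption made for this illustration, and the rewrite function is hypothetical.

```julia
# Walk an expression tree and turn every `lhs := rhs` into ordinary `lhs = rhs`.
rewrite(x) = x                       # leaves (symbols, literals, line nodes)
function rewrite(ex::Expr)
    head = ex.head === Symbol(":=") ? Symbol("=") : ex.head
    return Expr(head, map(rewrite, ex.args)...)
end

# Source text in the experimental syntax.
source = """
let
    total = 0
    total := total + 5
    total
end
"""

ast = Meta.parse(source)        # parsed by the standard parser
result = eval(rewrite(ast))     # evaluated after the rewrite
@assert result == 5
```

A real experiment would add checks at the rewrite step (for example, warning when = is applied twice to the same name), which is where the proposed rules could be prototyped without touching Base.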


Although it’s not entirely clear, that section is talking about object state, not local variables. This is made evident by the reference to Java final variables which apply to class members. Immutable object state already is the default in Julia because struct is immutable and one needs to write mutable struct to explicitly request mutability. Nobody – not Java, not C++ – is particularly worried about mutation of local variables… because mutation of local variables is trivial to detect statically (I’m not sure how many times I can say this).
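To illustrate why reassignment of local variables is easy to detect statically, here is a rough sketch that counts plain assignments per name in a block of code. It is purely syntactic and deliberately simplified: it ignores scopes, update operators such as +=, and destructuring. The function name count_assignments! is hypothetical.

```julia
# Count how often each name appears on the left-hand side of a plain `=`.
function count_assignments!(counts::Dict{Symbol,Int}, ex)
    if ex isa Expr
        if ex.head === :(=) && ex.args[1] isa Symbol
            name = ex.args[1]
            counts[name] = get(counts, name, 0) + 1
        end
        foreach(arg -> count_assignments!(counts, arg), ex.args)
    end
    return counts
end

body = quote
    a = 1
    b = 2
    b = b + a
end

counts = count_assignments!(Dict{Symbol,Int}(), body)
@assert counts[:a] == 1   # assigned once: effectively a definition
@assert counts[:b] == 2   # assigned twice: reassigned
```

Everything needed is in the syntax tree of a single function body, which is why tools (and the compiler) can answer this question without any annotation from the programmer.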

On the whole, this blog post seems to be an essay about how much this guy likes Haskell. Which is cool, but not all that helpful when designing other languages with very different target audiences and design constraints.


The point is that I do not have sufficient know-how to devise a new parser (or modify the current one).
Is there a place/person to get help to build up such an experimental Julia parser?
I guess the changes required to implement this proposal are not that extensive.