Assignment and argument passing semantics

This is a short summary based on what I understood from the documentation and this long thread, without mixing in (much of) the scope rules or any implementation details.
I would appreciate it if somebody could check its validity.

Given that expr is either an expression evaluating to (returning) an object, or a literal object, either mutable or immutable, and not just a name;
and
f(y) = do_with(y)
then:

  1. x = expr makes the name/variable x bind to a new object (with a new “reference”, hence a new identity), even though the “reference” can’t be retrieved in Julia for immutable objects.
  2. Both y = x and f(x) make y bind to the same object that x is bound to (with the same “reference”, hence the same identity).
    • “Reference” is a semantics-level concept which ensures that when two variables have the same reference to an object (and thus “share” the same object), and the object is mutable, mutations made via one variable are seen when the object is retrieved via the other variable.
  3. The above about y = x is true not only when x and y are in the same scope, but also when x’s scope is the parent scope of y’s scope, regardless of whether x’s scope is local or global.
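
A minimal sketch of points 1 and 2 for a mutable object. The helper name append_one! is hypothetical and exists only for this illustration:

```julia
# Hypothetical helper, named only for this illustration.
append_one!(y) = push!(y, 1)   # mutates the object that y is bound to

x = [0, 0]        # x binds to a newly created array
y = x             # y binds to the same array; nothing is copied
y[1] = 42         # a mutation made through y ...
println(x)        # ... is visible through x: [42, 0]

append_one!(x)    # the function parameter is bound to the same array
println(x)        # [42, 0, 1]

y = [7, 8]        # rebinding y does not touch the object x is bound to
println(x)        # still [42, 0, 1]
```

Only the mutable case can be demonstrated this way, since an immutable object offers no mutation through which sharing could be observed.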

Comment: For the sake of ease of learning/understanding, I wish Julia offered a way to check the “reference” of even immutable objects.

===


Core.:=== — Function.

===(x,y) -> Bool
≡(x,y) -> Bool

Determine whether x and y are identical, in the sense that no program could distinguish them. First the types of x and y are compared. If those are identical, mutable objects are compared by address in memory and immutable objects (such as numbers) are compared by contents at the bit level.

BTW, the above reference to “address in memory” is another reason that led me to conclude, in that very long thread you know of, that, at least in the case of the 2 mutable objects created when x=y=[0 0], they would reside at the same location in memory.

Due to a current compiler limitation that I acknowledged in that very reply, yes, they will. For generic mutable objects, if they actually have addresses (which they don’t have to), then they likely will, but they also don’t have to.

Edit: and yes, that doc isn’t really accurate, since an address doesn’t really exist at this level and optimization can easily break what it says. It is, though, relatively unambiguous when limiting the discussion to mutable objects only. Trying to extend it to immutable objects is just plain wrong.

And I don’t understand why you are quoting the doc.


Oh, because you suggested I could use === to check the reference of even immutable objects. But docs say (to me) that it would not check the reference, but just the contents

The content IS the reference.

It IS checking if the objects are identical, i.e. if the inputs are the same object.

If you mean whether they end up being at the same address, that’s an invalid question to ask, so there’s no way to answer.


I don’t understand what kind of concept of reference is that, if “the content is the reference”. And I don’t mean here whether the 2 immutables are at same address in memory or not – I got that, thanks to you. (Same for mutables).

I mean, in case of mutables, “reference” seems to be a sort of an ID for the object. Even if the object is mutated, the reference stays the same, hence it’s still “same” object; while the contents of the object obviously change.
Now if you say that “content” is the reference – I guess you mean that’s only true for immutables – that breaks the concept of “reference” I describe above.

Also, it seems to break this mental model described in that long thread.
(And it’s a mental model, so no worry, I don’t interpret it literally.)

EDIT: when I said

I meant that, for mutable objects, we have:
pointer_from_objref(x)
to check the “reference” of the object (correct me if that’s not the reference talked about in this post)
While for immutables, we don’t have such a thing.
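
A small sketch of that check for mutable arrays, assuming Julia 1.5 or later for ismutable. The variable z is an extra name introduced only for this illustration, and the reported addresses are an implementation detail rather than part of the semantics:

```julia
x = [0 0]
y = x              # same object as x
z = [0 0]          # equal contents, but a separate array alive at the same time

println(pointer_from_objref(x) == pointer_from_objref(y))   # true
println(pointer_from_objref(x) == pointer_from_objref(z))   # false
println(ismutable(x), " ", ismutable(0))                    # true false
```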

MORE TO THE POINT:
Suppose x = [0 0]
And suppose we add to the Julia language a special keyword that makes this mutable object impossible to mutate in the current scope, while it otherwise remains an array, like:
immut x
Then I will still be able to check and confirm its unique reference with pointer_from_objref(x), and the concept of “reference” as an ID still holds.
However, now this x would for all purposes behave as an immutable object!

That’s why I wish we had, for immutables, a way to check an ID to confirm uniqueness, so that the semantics would seem more consistent (and easier to understand).

It’s not the concept (or definition) of reference, it’s the property of the reference or if you want, the implementation of reference.

Same for immutable.

… Well, yes, because I’m talking in the same context as yours.

That “mental model” is simply replacing reference with pointer and object with memory, even though the pointer and memory don’t correspond to any physical memory or pointers. If you are talking in the scope of that “mental model”, then yes, === is your answer. If you are talking about it in the scope of the docstring of ===, i.e. the runtime implementation level, then it’s just incompatible with the “mental model”. That’s why I don’t want to talk about memory unless you are talking about implementation, since that leads to invalid questions like this.

Again, that’s not a valid question to ask. That function doesn’t even actually return what you may think it does for mutable objects. It returns an address that can be read or mutated with the same effect as if you mutated the object directly. It’s by no means where the “object” is (some other part of the object might be optimized out, i.e. it’s invalid to use that result as a jl_value_t*). With this definition, it is possible to ask for the same for immutable objects, as long as you don’t want to write to the pointer. However, that doesn’t tell you at all where the immutable object is “originally” stored, since that’s still just an invalid question to ask.

I added an important EDIT to my above reply, explaining my point better.

Well, the assumption is just invalid. You are significantly changing the language by adding something so fundamental like that.

So we don’t have what you assumed and it’s an invalid question to ask for the address of an immutable.

Can we please not have another long, contentious Vic versus Yichao showdown?

The x === y operator checks object identity, i.e. whether or not there is a legal Julia program that could distinguish between x and y. For mutable objects this checks that they are in fact the same object at the same location in memory; for immutable objects, it checks that they have the same value.

The objectid function does not give a memory address for all objects but it does give an opaque UInt hash value that is compatible with ===: i.e. if objectid(x) == objectid(y) then with very high probability x === y. This hash value is based on memory address of mutable objects and content of immutable objects.
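
As a rough illustration of how === and objectid line up, here is a short sketch. The variable names are arbitrary and the printed objectid values themselves are deliberately not shown, since they are opaque:

```julia
a = (1, 2)            # immutable tuple
b = (1, 2)            # built separately, same contents
println(a === b)                        # true: same type, same contents
println(objectid(a) == objectid(b))     # true: the hash agrees with ===

v = [1, 2]            # mutable array
w = [1, 2]            # a second array with equal contents
println(v == w)                         # true: equal contents
println(v === w)                        # false: two distinct mutable objects
println(objectid(v) == objectid(v))     # true: same object, same hash

println(1 === 1.0)                      # false: the types differ
```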

You really don’t want that, no matter how much you may think you do. It’s also unclear how it would even be implemented. What is the memory address of an immutable value that never exists at all? If you can ask for the memory address of any object, then all objects are forced to live in memory, which is terrible for performance.

There is one immutable type whose memory location you can ask for—String. Example:

julia> a = "Hello"
"Hello"

julia> b = "Hello"
"Hello"

julia> a === b
true

julia> pointer(a)
Ptr{UInt8} @0x000000010f65ac58

julia> pointer(b)
Ptr{UInt8} @0x000000010f669ef8

However, the API must be understood this way: when you say pointer(s) you are asking for some pointer to the data in the string s but you cannot be guaranteed which one you will get. We could change the implementation in the future so that short strings are interned and these two string instances would have the same pointer value.


Thank you a lot for your clarifications.
For the sake of other users who will read this topic – is the original post correct in all details?

Yes, I understand now that asking for an implementation for a true reference of an immutable is just too prohibitive - thanks.

My last question to clarify the semantics, and mental model, in this topic would be this:
1.

x = [0 0]
y = x
x = [0 0]

2.

x = 0
y = x
x = 0

Question: Are both of these cases understood (semantically) as y binding to the same object (same reference and same identity) as the 1st x, and the 2nd x binding to a different object (different reference and identity) than the 1st x?
Note that in both cases, the contents of the 1st and 2nd x are the same.

Yes, apart from the last one that I replied to.

No. The second x = 0 binds to the same object as the first x = 0.

You can think about it that way if it makes sense to you and in practice two different uses of the same integer value may or may not be stored in different places. Since integers are immutable you can’t tell.

The literal syntax [0, 1] creates a different array object each time it is evaluated—in a way that is observable and guaranteed. The literal syntax 1 conjures an indistinguishable integer object every time. Internally, it may be a different copy of 1 but there’s no way to tell them apart so they are all “the same object” for all meaningful purposes.
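
The two cases from the question can be checked directly. This is only a sketch; the names p and q in the second case are substituted here so both cases can run in one session:

```julia
# Case 1: array literals
x = [0 0]
y = x
x = [0 0]            # evaluates the literal again: a new array, x is rebound
println(x == y)      # true: equal contents
println(x === y)     # false: distinguishable, e.g. by mutating one of them

# Case 2: integer literals
p = 0
q = p
p = 0                # evaluates the literal again
println(p === q)     # true: no program can tell these apart
```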


Please don’t bring up implementation again now that he can finally talk correctly and strictly about semantics. It’s wrong to think that there are multiple 0 or 1 objects, not just because you can’t tell them apart, but because that’s the definition. The definition is based on the property that you can’t tell the difference, of course, but the final logic is still based on the definition.

FWIW, if the implementation or at least the ABI is included, you can absolutely tell that the 1 in x = Ref(1) and the 1 in y = [1] on the next line are actually stored at different memory addresses. There are many well-defined ways to observe that, but that doesn’t make the two 1s different objects.
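
A short sketch of that point. It shows one way to observe that the two containers have separate storage, while the values they hold remain identical:

```julia
x = Ref(1)              # a mutable container holding the value 1
y = [1]                 # a different mutable container holding the value 1

println(x[] === y[1])   # true: the two 1s are the same object semantically

y[1] = 2                # overwrite the slot in y ...
println(x[])            # ... the slot in x is untouched: 1
```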


If it’s easier for someone to think about it as if literal expressions always create new objects, that’s a perfectly valid point of view—as long as it goes with the understanding that Julia does not consider different copies of the same immutable value to be “different objects”. In practice, they may or may not be the same actual object—and it doesn’t matter since by definition objects that are === are not distinguishable.


I don’t agree; but:

Can we please not have another long, contentious Vic versus Yichao showdown?

Thanks for checking.

The only reason for that is that I understood something about implementation…

The definition could be changed… (I’ll explain what I mean in another reply or post)

No. It’s just a view that doesn’t conflict with reality in some limited situations. But as all the mistakes in the discussion clearly showed, thinking this way and taking it further leads to invalid conclusions and questions. In other words, if you just think like this in this very example, that’s fine. Don’t apply it to anything else.

Yes, sure. You are clearly thinking about a different language, in which case everything can be changed, of course. If you actually want to understand the current version of the language, though, no.

Limited how? It’s a valid viewpoint that’s consistent with Julia’s semantics. The reality is somewhere in between “all 1s are the same” and “all 1s are equivalent copies”: some instances of the expression 1 will refer to a shared copy of the object 1 (via the Int box cache), others will refer to separate copies (in registers or arrays), still others will never refer to any object at all (optimized out). The whole point of having semantics as an abstraction is that it doesn’t matter, so if someone wants to think of things in terms of one potential implementation of the semantics, they’re free to do so.

Because if it’s a new object but you can’t tell with ===, then it’ll be a valid question to ask, for example, where/what that object is, and what you even mean by saying it’s a new one. But these questions just don’t make sense. It also makes indistinguishability under === sound like an implementation limitation, but it’s not. Both of these are mentioned in this exact thread, and I fully understand and agree that with your version of the definition they are very legit questions. The questions are not valid, though, and that’s because the understanding they are based on is wrong.

FWIW, it’s not even a consistent view to say something is different but indistinguishable. I use “different” here since I assume that’s what “create new objects” means, and I don’t see how you can “create new objects that are actually the same one”. It’s limited in this sense because you cannot apply it to anything. The “new object” has zero significance. You actually make me wonder what exactly you mean by “new object”, and what the point is of introducing a concept that has all the properties indicating that it doesn’t exist.

It also really gets in the way of understanding anything deeper. The potential “new object” does exist in the implementation, but it exists as much in literals (i.e. x = 0) as in any assignment, including y = x, so for consistency you should really say that y = x creates a copy for immutables too. That can easily lead to questions of what that “copy” is and why it is not overloadable. This hasn’t been mentioned in the thread yet, but I can fully understand why someone would ask about these with your interpretation, especially if they come from a C++ background.

Python is not helpful here either, and I remember someone asking exactly about this (I don’t remember the keyword to search). Python has this exact problem that literals actually might create new objects, so saying that literals create new objects will be really confusing for people coming from Python. Specifying that the same immutable objects are always the same one is a much clearer distinction from Python’s behavior.

And if they actually don’t want to understand anything of the real implementation, sure. Trying to understand one implementation from another is much harder and more confusing, as proven by the number of wrong statements in these threads.
