Assignment and argument passing semantics

The point of the semantics is to provide a simple definition that is always true and that you can always come back to in order to understand the implementation. That’s why the definition of object identity is important, and it must be agreed that all the Int(1)s are the same object (they have the same identity). If your “new object” does not mean a different object identity, though, then I see that as either not talking on the right level or just playing with words (I could be wrong, but I really can’t think of any other possibilities). I don’t have any problem talking about literals or assignments creating copies/new objects when we talk about implementation-related matters, as long as that’s not the abstraction everything is based on.

In short, by

I mean the two have the same object identity, and this should be the definition.

It is important; however, we must not think that the current definition is the best possible one, unless you have a proof of that. I don’t think a proof is possible, since this is about concept and convention. Let’s keep an open mind.

Let’s look again at that definition of === (and not the interpretation):

First the types of x and y are compared. If those are identical, mutable objects are compared by address in memory and immutable objects (such as numbers) are compared by contents at the bit level.

This is all that is clear and unambiguous about it. There is actually no requirement to interpret it as giving the definition of “object identity”.
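As a rough illustration of the quoted rule, here is a small sketch that I would expect to behave as commented on a current Julia 1.x. The variable names a and b are only for illustration.

```julia
# Types are compared first: same numeric value, different types.
@show 1 === 1        # true: same type, same bits
@show 1 === 1.0      # false: Int vs Float64
@show 1 == 1.0       # true: == compares values, not identity

# Immutables are compared by contents at the bit level.
@show 0.0 === -0.0   # false: the sign bit differs
@show 0.0 == -0.0    # true
@show NaN === NaN    # true: same bit pattern
@show NaN == NaN     # false

# Mutables are compared by address.
a = [0, 0]
b = [0, 0]
@show a === b        # false: two separate arrays
@show a == b         # true: equal contents
@show a === a        # true
```

The floating-point cases show that === is neither weaker nor stronger than ==; it answers a different question.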

The docs say that if it returns true, then it should be interpreted as “no program could distinguish them”, i.e. “programmatic indistinguishability”.

Even this I find a somewhat inaccurate characterization because, for example, for strings (immutables) Julia offers not one but two ways to tell "this" and "this" apart: pointer(s) and pointer_from_objref(s).

But even assuming that the above two ways for strings are to be regarded as inaccurate or unreliable, “programmatic indistinguishability” is still all that === gives us.

The key word here is “programmatic”, meaning that Julia, currently, with the methods it has, cannot distinguish the two objects. If you teach it later (add proper methods), it may be able to. That’s okay!

But that does not prove or mean that they are impossible to distinguish at all. Even less does it prove anything about “object identity”.
The latter is a concept that we are free to choose, and then see whether it matches that “programmatic indistinguishability” or not.

I’m talking about the current language. If you want to make a change to the language, sure. Please do clarify which one you are talking about. AFAICT you started by asking about the language in its current state. The definitions are set in stone for the current version and there’s no way that’ll change (if it’s changed, that’s a different version). I’m certainly open to discussion about proposed changes to the language, if that is really what you are talking about.

You are quoting the property/implementation. The definition of === is,

Determine whether x and y are identical

And I think by all means this is referring to the identity of objects. The next half of the sentence,

in the sense that no program could distinguish them.

Defines object identity for you.

And that’s actually not distinguishing them. As I already said (acknowledging that the doc for pointer_from_objref should probably clarify this better for immutables and that String is the only strange immutable currently), pointer_from_objref (and pointer) don’t actually return the “address of the object”; they give you an address that represents the object. The returned value (for immutables) carries no significance for object identity, which isn’t defined based on the address (again, for immutables).

And that’s exactly the wrong conclusion from the wrong definition I mentioned above. No, such a method can never be added without a major redesign of the language. Such a function conflicts fundamentally with the current design; stated formally, it conflicts with the definition of object identity.


People are certainly free to think what they want, but is it a useful model of how Julia works? Especially for someone who seems to be struggling with understanding the semantics.

One could also imagine that mutable structs exist in 13 memory locations simultaneously, and the implementation takes care to keep them in sync and pretend that there is a single one (when interfacing with C, you always get the 7th, etc). Is this useful for anything, other than an intellectual exercise?

It depends on if it’s easier for you to imagine that each evaluation creates a new object but they are considered equal or if it’s easier for you to imagine that every evaluation of an expression which produces an immutable value somehow conjures up the same platonic instance of it. The beauty of immutables is that it doesn’t matter.


@vic please do read Henry Baker’s “Egal” paper that I have linked to before continuing this line of discussion. I don’t think there’s much point of carrying on until you have.

As I explained before, pointer(s) must be understood as giving a pointer to some copy of a string in memory; there is no guarantee about which one. It could always give the same pointer for independent instances of the same string, or it could give a different pointer each time you call it on the same string. In other words, pointer must be understood as an impure function which gives different answers for the same string due to its impurity, rather than as distinguishing distinct objects.
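A minimal sketch of that point, assuming a current Julia 1.x. The names s and t are illustrative, and the printed addresses are deliberately left without an expected value because nothing is promised about them.

```julia
s = "this"
t = string("th", "is")   # built at run time, same contents

@show s === t            # true: equal type and equal contents

# The addresses below are printed only for inspection. Whether they match is
# unspecified, so nothing should be concluded from comparing them.
@show pointer(s)
@show pointer(t)
```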


@StefanKarpinski, yes, I did read that paper: yesterday. Without going into details (that would be too long here and is better suited to a different topic, let me know):
while the authors have a very strong argument for the usefulness of “egal” and its superiority over any other object comparison method they looked at, they fail to argue that this “egal” does indeed match the intuitive and even philosophical notion of “object identity” (and I think they did try to argue that, in some of the first parts of the paper).

Me too; I don’t think changing a few words in how we interpret things changes the language. I’ll explain in a bit.
The way I see it, for ===,

First the types of x and y are compared. If those are identical, mutable objects are compared by address in memory and immutable objects (such as numbers) are compared by contents at the bit level.

I said “definition” because it gives its “specification” (guidelines for implementation). It is the only thing that precisely defines what === does.

Next, to associate any other name or description (like “same-ness”, “object identity”, “programmatic indistinguishability”, etc.) with this ===, there are 2 options:

  1. Argue that the meaning of that name is equivalent to, or follows from, the specification of === above.
    • For that, of course, that name/description must have an intrinsic meaning that would exist even if === did not exist. Otherwise, it’s circular reasoning.
  2. Define that name/description based on the specification of ===.
    • Now here the authors may want to be a bit careful: if users already have a strong meaning associated with that name/description, there will always be some resistance, friction, or confusion if the intrinsic meaning of that name/description conflicts with the meaning of === as given by the specification.

“Programmatic indistinguishability” is a case of 1. above: provided that Julia programs indeed cannot (in reliable ways) distinguish 2 objects for which === reports true, I don’t see any contradiction or problems.

I think “object identity” and “same-ness” of 2 objects, on the other hand, are a case of 2. above. And they fall prey to the issue of conflict of meaning.
I can argue that the specification of === actually agrees well with the intrinsic meaning of “object identity” for mutables, but not for immutables (and this half-agreement is what adds to the confusion a bit). That argument again would require a separate topic; for now, take it as just an opinion. (If people want to see that argument, let me know, again in a different topic.)

So, the authors have a choice:

  1. go ahead, like most/all other programming languages, and say: “this is how WE define object identity or same-ness, and how we are going to talk about it (like it or not).”
  2. don’t define “object identity” based on the specification of ===. As I explained earlier, Julia (or other languages) doesn’t actually have to define “object identity” or “sameness” at all. If another name for === is needed, pick one as innocent as possible (almost lacking other meanings), like “more_equal”, “deep_equal”, or “equal2”. Your choice.
    Ex: “"this" and "this" are deep_equal. [0 0] and [0 0] are not deep_equal”. If users ask what that means, tell them “programs cannot distinguish them” or just send them to the specification of === I quoted above.

With option 2., it’s not a new programming language, IMO. It’s just a clearer interpretation of the language: it avoids confusion while still staying precise.

That’s not an excuse to ignore the “Determine whether x and y are identical” part. It tells you that === is testing object identity. What follows (the part you quote) is simply the property of === and of testing object identity.

No. As I said above, object identity is what === is defined to test. The logic is reversed here.

It was not, it’s the other way around.

This is not right. Unfortunately, the doc isn’t a formal specification of the language, in that it’s impossible to implement a compatible version of julia purely based on the doc. This means that not all the details of the spec are included in the doc explicitly and formally, even though most can be gathered or inferred from various places. In this case, the definition for immutables is explicitly mentioned here,

that value is the identity of a bits type

And a generic version is included in the doc of ===

Determine whether x and y are identical, in the sense that no program could distinguish them.

Update to the doc to be more explicit about anything that you find unclear is always welcome of course.

I will also add that I really don’t know what’s the difference between “Programmatic indistinguishability” and “identity” you are talking about. We are talking about the definition/spec of a programming language and in this context both are describing the identity. I can’t think of any example/reason for the two to be separate concepts. (i.e. I’ve never seen a language where the two are defined differently and I can’t think of a reason to do so either.)


From what I am starting to understand, there can be several semantics (or semantic models) of a language, all supported/satisfied by the same implementation. Such semantics could be called equivalent to each other.
In my first post in this thread, I tried to come up with such an equivalent semantics, one that seemed to be consistent across both mutables and immutables and thus requires fewer rules to remember.

The simpler the semantics, and the more consistent with our everyday intuition of things, the fewer questions there will be.

If you mean where in memory, then it’s an invalid question to ask: it’s semantically a new object, not (necessarily) implementationally.
Semantically though, one way to answer it: you see it in the code that it’s a new object, and you can distinguish the objects by their position in the code. Just like on paper, this 2 is different in identity from this 2, although in contents they are equal.

FWIW, it’s not even a consistent view to say something is different but indistinguishable.

It is actually.
Here is an example when indistinguishable does not imply same identity:
At 9am you see a car at some coordinates (x0, y0, z0). Then you go away, come back, and at 10am you see a car at the same coordinates that, to you, looks as if it has the same properties as the one at 9am. It’s indistinguishable to you, but you can’t say whether it’s the same car as the first one or a different car. The identity of the car is not among the properties you look at when you decide it’s indistinguishable.
To say for sure, you need one or both of the following:

  1. to have kept track of the location in space of that car continuously from 9am to 10am
  2. to have access to a special, reliable ID of the car that is guaranteed to belong only to that car and no other car, even while all the other visible properties are exactly the same.

And note that those things don’t require discussing mutability or immutability of the car.

for consistency you should really say that y = x creates a copy for immutable too.

No, because that would break the symmetry with the case of mutables, and make the mutables’ y=x appear as an exception to the rules (if we had a different operator to use instead of = in the case of mutables, it would be a different story). However, your version of the semantic model would still be equivalent, in the sense of being supported by the implementation.

That’s not an excuse to ignore the “Determine whether x and y are identical” part. It tells you that === is testing object identity.[…]

I did not ignore that part at all, but I did not take it as a precise definition, because it doesn’t say what exactly “object identity” is.
On the other hand, the specification that follows (which I quoted many times) is very self-contained, and does not need any reference to “object identity” to see precisely what === does.
=== absolutely makes sense as a useful comparison operator even if you remove any mention of “object identity”.

Now,

I meant to say: based on the specification of ===, so it doesn’t actually matter.
The following 2 are logically equivalent:

  • object identity is defined, for immutables, by contents at the bit level and, for mutables, by the address in memory. And === is defined as the equality test of object identity
  • === is defined as the test, for immutables, of contents at the bit level and, for mutables, of the address in memory. And object identity is defined as the property whose equality === tests.

And this definition of object identity is confirmed in the doc page you mentioned ( Types · The Julia Language )
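For user-defined types, the same split can be sketched as follows, assuming a current Julia 1.x. Celsius and Counter are hypothetical types made up for this illustration.

```julia
struct Celsius            # immutable (hypothetical example type)
    deg::Float64
end

mutable struct Counter    # mutable (hypothetical example type)
    n::Int
end

@show Celsius(1.0) === Celsius(1.0)   # true: same type, same bits
@show Counter(1) === Counter(1)       # false: two separate allocations

c = Counter(1)
d = c
@show c === d                         # true: both names refer to one object

# objectid is documented to agree whenever === holds.
@show objectid(Celsius(1.0)) == objectid(Celsius(1.0))   # true
@show objectid(c) == objectid(d)                         # true
```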

Regardless of whether, historically, the definition of === or the definition of “object identity” came first,
you end up with the same definition for “object identity”, and that’s what I meant in my reply above, where I said “object identity” and “sameness” are defined based on the specification of ===.

Thus, with this understanding, everything in my reply above still holds.

No, this just doesn’t make sense. It’s the other way around. The language has a spec and there can be multiple (versions of) implementations of it. The spec is what determines which changes count as implementation details and which do not. What you are describing is reverse engineering. Again, if your goal is not to learn about julia but to come up with a description of it based on current implementation details, then please be clear if that’s what you want. The distinction between the two is that breaking the official spec of the language is a breaking change, while breaking an implementation detail, and therefore your description of it, isn’t.

Edit: And I think I’m repeating myself again, but I do agree that there can be multiple descriptions of an implementation. As long as the unofficial ones are not used to determine what could be done/what’s breaking, or to suggest that they are as good as/equivalent to the official one, I don’t have any problem with that. Just be careful: if there’s any discrepancy between the two, the unofficial one is wrong.

That’s why I said “what/where”. If it’s semantically a new object, it must have semantic significance. Otherwise, as mentioned above, you are just playing with words.

That’s not answering the question at all. You are repeating your definition. You still didn’t give the “new object” any significance.

This example doesn’t apply. Again, as I already mentioned above, this is not a discussion of real-world object identification; this is about object identification in a very specific context, the definition of the language. Real-world definitions are hopelessly imprecise and full of loopholes, which is what your example is based on. That is not and cannot be the case for a programming language. “Indistinguishable to you” but unable to say whether it’s the same does not exist in programming languages AFAIK. (And again, I just don’t know any example and can’t think of any reason to do so.)

That part defines === (testing object identity), the next half defines identity (“no program could distinguish them.”). You never quoted “programmatic indistinguishability” as the definition of object identity; rather, you keep differentiating them.

Yes? So I take it as you finally agree that all 1s have the same identity then, since that’s exactly what I quoted on that page.


I admit that I don’t know for sure (yet) what exactly the definition of the semantics of a language is: @StefanKarpinski’s “semantics as an abstraction” here, detached from implementation, sounds to me like a “semantic model” for the language, and maybe there is such a thing as “abstract semantics” vs “implementation semantics”. I’m not sure; I don’t have a formal CS background. Or maybe the semantics of different languages can just be characterized as one being more abstract than another.

It’s like when Smalltalk people say that their objects “send” and “receive” messages: a metaphor making their objects look like living things (which probably appealed to their initially targeted audience, children, IIUC). I see that as either a semantic model or an abstract semantics.

No, I did not repeat the definition, because I had not given any definition of that “new object” before. Instead, I gave it, informally, with:

It might seem, at first sight, like a childish model (and it’s okay if you don’t like it), but its significance is that it brings a language to a really high-level (highest?) semantics, where when programming you think just as a mathematician or physicist thinks when solving a problem. That is, you only think at the level of code, not implementation, and this idea may be a key to clearly separating the thinking about the problem (reflected in the visible code) from the implementation details.

It’s true that when we, outside programming, solve problems with pen and paper, with
u = [0, 0] and v = [0, 0], we don’t explicitly think that “v binds to a NEW object/value, different from that of u”; however, that is implicit in the fact that when we mutate an element of “u” by hand (erase and write in place), we understand that the value of “v” stays unchanged.
Same with u = 0 and v = 0.
Another step in this direction would be making = only mean binding to a “new” (in my interpretation above) object, even in the case of y=x for mutables x, y, and using a different operator, like <- or whatever, for the aliasing use. But this is for a different topic.
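For comparison, here is roughly how the pen-and-paper example plays out in current Julia. This is a sketch assuming Julia 1.x; w, p and q are extra names introduced only for this illustration.

```julia
u = [0, 0]
v = [0, 0]
u[1] = 1
@show v          # [0, 0]: v was bound to a separate array

w = u            # no copy: w and u name the same array
w[2] = 5
@show u          # [1, 5]
@show w === u    # true

p = 0
q = 0
@show p === q    # true: nothing observable separates the two zeros
```

With immutables there is no in-place "erase and write" step available, which is why the question of whether q holds a new 0 never shows up in a program's behaviour.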

I agree that this model feels somewhat detached from what’s really happening under the hood (though that may be a point of it).


Another possible model (probably closer to implementation, but which however requires more rules):

Given that expr is either an expression evaluating to (returning) an object, or a literal object, either mutable or immutable, and not just a name;
and
f(y) = do_with(y)
then:

  1. In the case of a mutable (result of) expr:
    • x = expr makes the name/variable x bind to a new object
    • both y = x and f(x) make y bind to the same object that x is bound to
  2. In the case of an immutable (result of) expr:
    • x = expr makes the name/variable x bind to some object (without requiring it to be the same as or different from a previous instance of that object)
    • both y = x and f(x) make y bind to some object that is equal in contents to the object that x is bound to.
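The rules above can be tried out with a small sketch, assuming a current Julia 1.x. Here do_with is given a hypothetical body that mutates its argument, and rebind, z, n and m are names introduced only for this example.

```julia
# Hypothetical stand-in for `do_with`: it mutates whatever it is given.
do_with(y) = push!(y, 99)
f(y) = do_with(y)

x = [1, 2]
f(x)
@show x                  # [1, 2, 99]: y was bound to the same array as x

# Rebinding the parameter inside a function leaves the caller's binding alone.
function rebind(y)
    y = [0]              # y now names a different array
    return y
end
z = [1, 2]
rebind(z)
@show z                  # [1, 2]

# Immutable case: no program can tell whether m holds "the same" 1 or a copy.
n = 1
m = n
@show m === n            # true
```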

This example doesn’t apply

I wrote it because I thought you wanted to understand how indistinguishability is different, as a general concept, from “same identity”.
Of course it doesn’t apply to programming, with the accepted official definition of object identity, BUT it illustrates some of the intuitive aspects of “distinguishability” and “object identity” that we have in the world outside programming.

One other example, to be more complete:
whenever you consider (say, look at) 2 material objects existing at the same time, that already implies the 2 have different identities, regardless of whether all their other attributes are exactly equal, and again regardless of (im)mutability. Their occupying different positions in space at the same time implies different identities.

True, these examples are not as precise as in a programming language context (and never will be): their point is to show the intuition people might already have about general objects, in order to explain why it will conflict with the current definition of “object identity”,
if you care to understand why many might get confused or feel a bit uneasy about it. (And this is not specifically about Julia, of course.)
More specifically, only the part of the definition concerning immutable objects comes into conflict.

That part defines === (testing object identity), the next half defines identity (“no program could distinguish them.”). You never quoted “programmatic indistinguishability” as the definition of object identity; rather, you keep differentiating them

To define something, you must rely on something already defined.
The definition of object identity (which I had thought you agreed on), and which is confirmed by the doc page you mentioned, is the one I gave in my previous reply:

If, on the other hand, you say obj identity is defined by programmatic indistinguishability, then you need to give a precise definition for the latter. Else, you’re just running in circles here.



Anyway: I think I made myself clear what ideas I suggested, and I’m not going to argue endlessly here.
I’m not naive to think that developers will just jump and embrace them. It’s just ideas, that I personally find good, that’s it. Make of it whatever you want.

It’s the specification that the implementation of the language must obey. This is particularly important when you want to make any modifications (including optimizations), since this is the ground truth for what’s allowed and what’s not. There will always be some areas left unspecified, and those areas are implementation details that can be changed at any time.

I think I should state it another way. What I mean is that you didn’t bring any new significance for the program.
As I said above, as long as your understanding of “new object” doesn’t have any significance at all, I’m totally fine with it. This means that, for example, it’s impossible to define a different kind of comparison (now or in the future) to distinguish between them (without a significant/breaking change to the language). This is all that I told you was invalid from the very beginning. Accept that and, sure, you can add as many semantically insignificant (or programmatically insignificant) concepts as you want. You can also give it as much syntactic significance as you want, if that somehow helps your understanding, as long as you don’t push it too far.

I thought I put it pretty clearly that I’m talking about programming languages here.

Yes? I agree that what follows is a more descriptive definition of object identity and therefore of ===, but I’m saying that it is a definition of object identity. I should probably have made it clearer what I was replying to.

If you agree that “object identity” is defined both here and on the types page, I have nothing else to argue about the === doc here. (I also remember you mentioning somewhere that julia doesn’t define “object identity”, but I couldn’t find it. There are too many edits to track down, so sorry if that was just me imagining it.)