Does Julia have efficient move semantics?

It’s simpler to think of it in terms of the actual implementation.

It behaves the same way as Python, so I assume the implementation is roughly the same.

In Python, every object is a heap-allocated struct which looks a bit like this.

struct PyObject {
    PyType* type;
    void* data;
};

I don’t recall the full details or exact specification of this struct, but if you know this, you know enough to understand how everything works. Variables are PyObject* my_var, effectively.

Julia also has an optimization for immutable structs, but it is transparent to the user.

There is an example of a move in julia.

v = UInt8[65,66,67]
s = String(v)
# Now, the content of v is gone

Originally, I think it did not copy. I’m not sure how it works now, but I think it copies.
It’s not too hard to implement such moving things via the Vector/Memory interface, but there is no standard syntax for it in julia.

Ah, you assume. And the developers of julia you’re discussing such details with don’t really have the right understanding of julia?

No, a variable in julia is not like in python. It’s not typically some memory chunk which is filled with a pointer to a vector for some time, and a pointer to a string, or int or whatever later. While this is approximately correct for top level variables (i.e. in Main or whatever module), that’s not the typical julia variable. Think of them as labels, entities referencing some value. The compiler will do all sorts of transformations, like creating SSA IR, and it’s not assumed that there is some memory chunk corresponding to the variable.


It copies unless v was allocated specially (by Base.StringVector(n)).
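A small sketch of what this looks like from the user side, assuming a recent Julia 1.x. Whether the bytes are reused or copied internally is an implementation detail, but a Vector{UInt8} passed to String ends up empty either way, so converting a copy is the way to keep the original:

```julia
v = UInt8[65, 66, 67]
s = String(v)         # may reuse or copy the buffer; v is truncated either way
@show s isempty(v)

w = UInt8[65, 66, 67]
t = String(copy(w))   # convert a copy so that w keeps its contents
@show t w
```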

The abstract concepts of variables in Julia and Python are definitely in the same ballpark, in a different town from that of C/C++/Rust; there are differences in scoping behavior and one big difference I’ll explain later. They are definitely not implemented the same way. Allocating every object on the heap is one of the reasons that Python is considered “bad” for performance, though it’s worth noting that practical Python has good ways to evade it by handing the work to code written in other languages. Julia generally stack allocates technically immutable instances (primitives and Tuples or structs thereof) and generally copies them around, but the compiler has leeway on what really happens because none of these are language-level details. That doesn’t necessarily line up with practical mutability; Strings cannot be mutated but they are technically a mutable type, while we can use mutating methods on a technically immutable SubArray to change the data stored in its contained mutable parent array.
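To make the gap between technical and practical mutability concrete, here is a minimal sketch using only Base reflection functions. The variable names are made up for illustration, and ismutable needs Julia 1.5 or later:

```julia
# Plain-data immutables are the ones the compiler can keep off the heap
@show isbitstype(Tuple{Int, Float64})   # true
@show isbitstype(Vector{Int})           # false

# A SubArray is an immutable struct, yet writing through it changes the parent
parent_vec = [1, 2, 3]
sv = view(parent_vec, 1:2)
sv[1] = 10
@show ismutable(sv)                     # false
@show parent_vec                        # [10, 2, 3]

# A String reports as mutable even though no public API mutates it
@show ismutable("abc")                  # true
```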

I’ve found these concepts confusing and needlessly explained relative to other languages with vastly different concepts of a variable, though part of the documented description of pass-by-sharing gets to the point: the variables in the method are assigned to the same instances in the method call. The confusing part is the assertion that “values are not copied when they are passed to functions”, which is only true in the trivial sense that an instance is itself. C/C++/Rust users probably see that phrase and think of data being copied into the stack frame, and that happens all the time. It’s just that on a language level, equal immutable instances (==) are the same instance (===); you could instantiate or copy the same immutable value 1000 times, you only ever had 1 instance, even if they’re reasonably implemented as scattered data copies. This only sorta happens in Python with object caching; otherwise, its heap-allocated objects with separate addresses must be distinguished as different instances. That is also generally the case for mutables in Julia, though String is a notable exception for reasons I still don’t know.
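A quick sketch of the == versus === distinction described here, with throwaway variable names chosen for illustration:

```julia
a = (1, 2.0)
b = (1, 2.0)            # constructed separately
@show a === b           # true: equal immutables are indistinguishable

x = [1, 2]
y = [1, 2]
@show x == y            # true: same contents
@show x === y           # false: two distinct mutable objects

s = "abc"
t = string("ab", "c")   # built at run time from two pieces
@show s === t           # true: String is the exception, compared by content
```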

Strings should have been immutable, they behave as immutable, but for technical reasons they are not, though there is no way to mutate them. Egal comparisons (===) between them therefore end up in a memcmp here: julia/src/builtins.c at master · JuliaLang/julia · GitHub
Ideally, two strings with the same content should have the same address, but that would incur a search through existing strings whenever a string is created. (I think R does that, but I haven’t been in there for some years).


Yeah with the === implementation, I don’t know why they aren’t also just exposed as immutable (according to ismutabletype anyway) on a language level, and we just have to know that it’s heap allocated due to its unfixed size. I just assume that if I do bother to dig into it, I’ll fail to come up with a good way to do that.

Is this the same as interning? Symbols do that, though I don’t know how those are implemented either considering there’s so many of them in parsing.

We should probably disambiguate two things:

  • What you show here is more like a “consume” operation. The data from the input is indeed gone, just like a move semantic, but because you have converted from one data type to another, this isn’t quite the same as the highly efficient “move” where some pointers/references/whatever you want to call them are swapped.
  • The “efficient move” semantic I really intended was closer to C++’s std::move(x). It’s a bit tricky to explain what that does in a sentence or two. On a technical level, it tells the compiler x is an “xvalue expression”, which means expiring value. A simpler explanation is that it tells the compiler you won’t be using the data referred to by x anymore. This can cause the compiler to use things like move constructors in expressions like y = SomeType(std::move(x)), if they exist.
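Julia has no counterpart to std::move, so the second bullet cannot be shown directly. For the first bullet, the closest built-in pattern I know of is the take! plus String pipeline, sketched below. Each step hands the contents on and leaves the source empty:

```julia
io = IOBuffer()
write(io, "hello")

bytes = take!(io)   # hands over the buffer contents and resets io
s = String(bytes)   # documented to avoid a copy for vectors coming from take!

@show s                   # "hello"
@show isempty(bytes)      # true: the vector was consumed
@show isempty(take!(io))  # true: the buffer was reset by the first take!
```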

If you read what I said:

The point being, it does not matter what the exact implementation is, because the behavior is the same.

If you can demonstrate an example of how two types (eg a list/array or something) behave differently between Julia and Python, please go ahead… If there are relevant behavioral differences, then it would be interesting to hear about them.

Yes, Julia and Python are different because Julia is compiled using LLVM and Python is compiled to bytecode which is then interpreted. But this is a totally pointless statement and irrelevant to the whole thread, because it doesn’t say anything about how they differ from the point of view of a user.

My point being: Don’t just say “no you’re wrong” and then proceed to either agree or just go on an irrelevant tangent.

Edit: I re-read your comments and I think your point is that Julia is able to optimize away some of the things you can’t optimize away in Python, such as everything being allocated on the heap. Therefore there is a behavioral difference in terms of performance. Fair enough, but that wasn’t the point of this discussion. We already know the performance of Python is bad compared to compiled languages.

You may have missed this earlier comment. That’s what I’m referring to here…

Your whole point was that you view julia variables like python variables because they behave in the same way. I.e. a variable is a type descriptor and a data pointer.

This is not good advice. Such a mental model for julia variables will make the performance hints in the manual very hard to understand. And, after all, performance is one of the strong properties of julia. The whole point of making your functions type stable is that the types then exist only in the compiler, not at run time. This is also the reason why one should avoid abstract field types in structs, in particular abstract element types in Vectors.
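A minimal sketch of the point about abstract element and field types, using only Base reflection. Base.RefValue is used here merely as a convenient stand-in for a struct with a single field:

```julia
loose = Real[1.0, 2.0, 3.0]      # abstract element type: each element is boxed
tight = Float64[1.0, 2.0, 3.0]   # concrete element type: raw doubles stored inline

@show isconcretetype(eltype(loose))   # false
@show isconcretetype(eltype(tight))   # true
@show isbitstype(eltype(tight))       # true
@show sum(loose) == sum(tight)        # true: same values, different storage

# The same reasoning applies to struct fields
@show isconcretetype(fieldtype(Base.RefValue{Real}, :x))      # false
@show isconcretetype(fieldtype(Base.RefValue{Float64}, :x))   # true
```

With the concrete versions the compiler knows the type statically, so no type tag has to be consulted at run time.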

Have you received a satisfactory answer to your question? I still can’t tell what exactly it is that you are asking.

Is there a std::move in Julia? No, there isn’t. Julia’s memory model is different to C++ (@foobar_lv2’s answer had many good points). Is there a way to mimic std::move? Yes, there is to some extent (see the Ref examples).

Could you post a snippet in C++ (or any language) that shows what you want to do? That might be the best way to get a good, definite answer. [Unless you look for countless arguments about the (dis)similarities between Julia and Python and C/C++, which rarely end in meaningful conclusions. We’ve had many such threads here.]


This was my original question.

This has nothing to do with what is being discussed here.

Ok, I think I understand your misconception now. In julia, local variables don’t have addresses. Instead they are “bindings” that refer to an object.

So, how would you do the swap?

Since local variables have no address, you must explicitly allocate the memory / slot for them, and then you pass this slot to the swap function:

a=Ref(4)
b=Ref(5)
swap(a,b)

In many cases this will boil down to the same machine-code as the C variant. In some cases it will be more like:

uint64_t* a = malloc(8);
uint64_t* b = malloc(8);
*a = 4;
*b = 5;
swap(a,b);

The reason is that the heap alloc can only be elided if a and b don’t escape due to the swap call. In C it doesn’t escape on pain of UB; in Rust, there are all the complex borrowing rules to ensure that it doesn’t escape; in julia, the compiler makes a best-effort to figure out whether the references escape, and when in doubt then a heap allocation is made.
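The swap called above is not defined in the post. One possible definition, named swap! here by assumption to follow the usual convention for mutating functions, could look like this:

```julia
# Hypothetical helper: exchange the contents of two Ref slots
function swap!(a::Ref, b::Ref)
    a[], b[] = b[], a[]   # both slots are read first, then written back crossed
    return nothing
end

a = Ref(4)
b = Ref(5)
swap!(a, b)
@show a[] b[]   # a[] = 5, b[] = 4
```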

Unfortunately, the String(v) conversion is quite buggy:

julia> function foo()
       v=UInt8[0x41 for i=1:801];
       popfirst!(v)
       vv=reshape(v, (800, 1))
       vvv = reshape(vv, (800,))
       s = String(vvv)
       Ref(s)
       end
foo (generic function with 1 method)

julia> foo()
Base.RefValue{String}(UInt8[0x41, 0x41, 0x41, 0x41, 0x41, 0x41, 0x41, 0x41, 0x41, 0x41  …  0x41, 0x41, 0x00, 0x41, 0x41, 0x41, 0x41, 0x41, 0x41, 0x41])

This clearly leads to a type confusion which will blow up everything.

I couldn’t see a way to finagle the confused writes in julia/src/genericmemory.c at 9a77240f6f81b9d5999d40fdf5e2b6bb84e36783 · JuliaLang/julia · GitHub into a good primitive for exploitation, though.

PS. Type confusion with jl_genericmemory_to_string · Issue #56435 · JuliaLang/julia · GitHub


Yes, of course. Approximately this (please excuse any syntax errors, I’m doing this entirely from memory from a machine which does not have a C++ compiler)

#include <cstddef>
#include <vector>

// Swaps the two pointers passed by reference; no vector elements are copied.
void moveVector(std::vector<double>* &v1, std::vector<double>* &v2) {
    std::vector<double> *tmp = v1;
    v1 = v2;
    v2 = tmp;
}

int main() {
    std::vector<double> v1 = std::vector<double>();
    std::vector<double> v2 = std::vector<double>();
    for(std::size_t i = 0; i < 1000; ++ i) {
        v2.push_back(0.0);
    }
    // Take the addresses of the locals so that the pointers can be swapped.
    std::vector<double> *v1p = &v1;
    std::vector<double> *v2p = &v2;
    moveVector(v1p, v2p);
}

Assuming I didn’t mess that up, it is probably the simplest example. I avoided writing an example in terms of std::move, although this would have been easier. I didn’t choose that route because it is opaque if you don’t have a detailed understanding of how std::move works.

Let me know if I made a mistake, it’s been a number of years since I wrote C++ day to day.


I suggest the following:

v1 = fill(0.0, 1000)
v2 = fill(1.0, 1000)

v1, v2 = v2, v1
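
To check that this swap only rebinds the names and copies no elements, one can compare the buffer addresses before and after. A small sketch, with p1 and p2 as names made up for the check:

```julia
v1 = fill(0.0, 1000)
v2 = fill(1.0, 1000)

p1, p2 = pointer(v1), pointer(v2)   # addresses of the underlying buffers
v1, v2 = v2, v1                     # rebinds the two names only

@show pointer(v1) == p2   # true: v1 now names the buffer v2 used to name
@show pointer(v2) == p1   # true
@show v1[1] v2[1]         # v1[1] = 1.0, v2[1] = 0.0
```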

Is that second line supposed to be v2 = tmp?

Good spot - edited

You really are having a hard time with this…

julia> v1=[1,2,3]
3-element Vector{Int64}:
 1
 2
 3

julia> v2=[4,5,6]
3-element Vector{Int64}:
 4
 5
 6

julia> function testIt(v1,v2)
       v2,v1=v1,v2
       end
testIt (generic function with 1 method)

julia> testIt(v1,v2)
([1, 2, 3], [4, 5, 6])

julia> println(v1)
[1, 2, 3]

You can do

julia> function testIt(v1,v2)
       tmp_r = v1.ref
       tmp_sz = v1.size
       setfield!(v1, :ref, v2.ref)
       setfield!(v1, :size, v2.size)
       setfield!(v2, :ref, tmp_r)
       setfield!(v2, :size, tmp_sz)
       nothing
       end
testIt (generic function with 1 method)
julia> v1=[1,2,3]; v2=[1]; testIt(v1,v2); @show v1,v2;
(v1, v2) = ([1], [1, 2, 3])

Note that this is somewhat unsafe – you’re not really supposed to do that. Especially this probably elevates concurrency bugs from “best-effort constrained bug” to “lol heap corruption, pop a shell”.


Variables in julia are not chunks of memory where you can store things, they are names with a scope. The names v1 and v2 inside testIt are not the same names as outside, they merely refer to the same values/objects. You don’t have access to the outside names v1 and v2 inside testIt, unless you annotate with global.

But you can ditch the whole function, and just do v1, v2 = v2, v1.
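
A short sketch of the difference between rebinding a name inside a function and mutating the object it refers to. The function names are hypothetical:

```julia
function rebind(v)
    v = [0, 0, 0]   # binds the local name v to a new array; the caller is unaffected
    return nothing
end

function overwrite!(v)
    v .= 0          # writes into the array that the caller's name also refers to
    return nothing
end

outer = [1, 2, 3]
rebind(outer)
@show outer         # [1, 2, 3]
overwrite!(outer)
@show outer         # [0, 0, 0]
```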


And I have the feeling that you don’t understand std::move. That is entirely about the transfer of unique ownership for the sake of resource management / RAII! And since julia and C don’t have unique ownership / RAII, std::move makes no sense.

Your example works in pure C and has nothing to do with C++. The crucial thing you did here was to take the address of a local variable. The difference is that julia local variables have no addresses! So you must explicitly create a slot/container, and then swap contents; whereas in C, the slot/container is created automagically by the compiler if it sees that the address of a local variable is taken.