Swap array contents?


#1

I wanted to ask whether julia arrays support something like C++ vector swap.

That is, swap!(A,B) should swap the contents of both arrays A,B, and throw if they don’t have the same type (essentially swap sizes and data-pointers of the arrays).

I think that this can only be done either by horribly abusing pointers, or by some ccall into the runtime (I did not find it in array.c nor array.jl, but maybe I’m blind?).

Does such functionality exist in some corner of the runtime?

Is there a fundamental reason why this cannot be made to work? If I have pointer-arrays of different age, then I need to possibly root the young one; if shared, then maybe just give up and throw?

(while I am not opposed to horribly abusing pointers, this specific case probably wants to be written in C and either linked against the runtime or included in array.c)

(adding another layer of indirection by wrapping array does not cut the cake, nor does copying the contents)


#2

Maybe I’m being really naive, but is this what you want?

julia> macro swap!(a::Symbol,b::Symbol)
       blk = quote
         if typeof($(esc(a))) != typeof($(esc(b)))
           throw(ArgumentError("Arrays of different type"))
         else
             c = $(esc(a))
             $(esc(a)) = $(esc(b))
             $(esc(b)) = c
           end
         end
         return blk 
       end
@swap! (macro with 1 method)

julia> a = zeros(10);

julia> b = ones(10);

julia> @swap!(a,b);

julia> a
10-element Array{Float64,1}:
 1.0
 1.0
 1.0
 1.0
 1.0
 1.0
 1.0
 1.0
 1.0
 1.0

julia> b
10-element Array{Float64,1}:
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0

julia> c = zeros(Int,10);

julia> @swap!(a,c)
ERROR: ArgumentError: Arrays of different type

#3

@favba your method just swaps the names a and b, while I think that @foobar_lv2 is trying to swap their contents. The difference is noticeable if you have other bindings to the two arrays:

julia> a = zeros(2);

julia> b = ones(2);

julia> c = (a, b)
([0.0, 0.0], [1.0, 1.0])

julia> @swap!(a, b);

julia> c  # unchanged, because all the `@swap!` macro did was change the bindings "a" and "b"
([0.0, 0.0], [1.0, 1.0])

#4

(to follow up, I don’t actually know how to do what’s being requested, except by actually copying the data around).


#5

Something that makes me slightly nervous is that I know of no way at all of mutating the sizes or relocating the data of an array with more than one dimension.

I could not see this assumption used anywhere, but I must confess that I don’t understand the other relevant parts of the code (gc, inference, codegen) where this could be hidden.

reshape / jl_reshape_array changes sizes. If it works without bugs now, then no one should assume that multi-dim arrays have fixed sizes.

Meh, no idea. I should read more code.

#6

Just to make this absolutely clear—you mean neither of these two options?

# option 1, swap name
a, b = b, a 
# option 2, copy over contents
function swap!(a, b)
    # check eltype and size ...
    for i in eachindex(a, b)
        @inbounds a[i], b[i] = b[i], a[i]
    end
end

#7

BTW, is it conceivably possible to make this work?

a, b .= b, a

It’s an error now.


#8

Yes. It’s just a, b = b, a.


#9

Nope; I have references to my arrays scattered all over the place, and iterating over the entries… works, but I imagine being a poor CPU who has to move all this data instead of just flipping a pointer, and going on strike?

Something that works, but is also quite wasteful (extra indirection on access) is

mutable struct arraywrap{T,N}
contents::Array{T,N}
end

function swap!(a,b)
c = b.contents
b.contents=a.contents
a.contents=c
end

@yuyichao That swaps the binding only, not the contents. Basically I want to swap pointers and sizes.

a = collect(1:2);
b = collect(3:4);
c = (a,b)
([1, 2], [3, 4])
a,b = b,a ;
c
([1, 2], [3, 4])

#10

And that’s exactly what’s corresponds to swap in c++. C++ and julia has completely different object models. In general, please have a good understanding of the difference between c++ and julia variables before asking these questions…


#11

In how far? Array is not an immutable container (with mutable contents), but rather a mutable container (push!/pop!/resize!, not reshape (I misread the code, sorry; reshape builds a new container for the old contents)).

And in C++, this is exactly what happens: If I swap contents of two vectors (using a.swap(b)), then this content-swap is reflected in all references to the vector that are stored someplace, e.g. below in the call stack. And I pay one indirection for this convenience (vector is like elem_type **, not elem_type *, just like in julia)

Sometimes you want changes to the container like resize!/push!/pop! to propagate through all existing references. Same for content-swaps.

Also, sorry for pissing you off somehow?


#12

As I understand it, the closest equivalent of this in Julia is currently to declare a type like:

mutable struct WrapArray{T,N} <: AbstractArray{T,N}
    a::Array{T,N}
end

that wraps around an array a. You would then have to define size, getindex, etcetera methods on WrapArray (which simply pass through and call the same methods on the field a), so that it acts like an array.

Then you could define:

function swap!(x::WrapArray{T,N}, y::WrapArray{T,N}) where {T,N}
    x.a, y.a = y.a, x.a
    return x, y
end

Since this is a mutable type, the changes to the contents of x and y would be reflected in all references to those objects elsewhere. As you point out, above, though, this adds an extra indirection to array accesses.


#13

In principle, I suppose it would be possible to define such a copy-free swapstorage! function for the built-in Array type, since under the hood this is just a C struct that contains a pointer to the underlying data. I’m not sure if that would screw up any compiler or gc assumptions? Since resize! can already change the underlying data pointer, I don’t see why you couldn’t in principle have a swapstorage! function for two Array{T,N} objects too.

(You’d have to implement such a function in C, though, since it would need to munge internals of the built-in Array type, which is one of the few Julia types whose innards are implemented in C.)


#14

You just need to define the appropriate method:

function Base.broadcast!(::typeof(identity), ta::Tuple{A1,A2}, tb::Tuple{B1,B2}) where {A1,A2,B1,B2}
    ta1, ta2 = ta
    tb1, tb2 = tb
    @assert indices(ta1) == indices(ta2) == indices(tb1) == indices(tb2)
    for i in eachindex(ta1)
        @inbounds ta1[i], ta2[i] = tb1[i], tb2[i]
    end
    ta
end

I’m not saying we should do this though, for one thing this is a different behaviour of broadcasting over tuples than we usually use (which is why it errors).


#15

Not necessarily (although that would be a lot safer). You can get a pointer to the array object, and fiddle with underlying C structure, using unsafe_* calls in Julia.

Edit: if either of the arrays store their data in-line though, I see it might not be possible unless the allocated sizes are the same.


#16

Yes, I could do

function terrible_swap!(A::Array{T,N}, B::Array{T,N}) where {T,N}
    swapspace = zero(UInt64)
    aptr = reinterpret(Ptr{UInt64}, pointer_from_objref(A))
    bptr = reinterpret(Ptr{UInt64}, pointer_from_objref(B))
    for i = 1:8 
        swapspace = unsafe_load(aptr, i)
        unsafe_store!(aptr, unsafe_load(bptr, i) , i)
        unsafe_store!(bptr, swapspace, i)
    end
    nothing
end

A = collect(1:2);
B = collect(3:4);
c = (A,B)
@show c
terrible_swap!(A,B)
@show c

c = ([1, 2], [3, 4])
c = ([3, 4], [1, 2])

That has the chance of terribly crashing the garbage collector down the road, though (and probably fails if the array is very high-dimensional).

I think arrays never store their data inline?

A=collect(1:2)
 pointer(A)-pointer_from_objref(A)
0x0000000000000040

This would be really cool for very small arrays, though (save the allocation, save the cache-miss), but would probably need a lot of changes.


#17

You’ve just seen the evidence that they do.


#18

For the ones stored inline, it would be possible (if the sizes were compatible), to swap all of the contents,
as well as the flags, (but not the pointer!).
I suppose if you had incompatible sizes, it still could be handled, but it looks like the best approach if this functionality is really useful (I have’t thought of a use case myself), is to make a PR with a C function in Julia,
and a Julia wrapper to call it.


#19

You’ve just seen the evidence that they do.

I stand corrected. Thank you!

(I misunderstood new_array and thought this very nice placement was due to both being pool-allocs directly after each other; you are absolutely correct and I failed at reading comprehension)

Re inline: Sorry for using the wrong words. The thing I was thinking about is that an empty vector still has some bytes of capacity in the same cache-line as the struct, so Vector{T}() could start with capacity (64-sizeof(array))/elsize.
Nice if I have a Vector of Vectors, initially all empty, that I push! into, e.g. for n-ary trees with vector of children, first few pushes would not need to allocate / fragment the heap, most vectors are empty (leaf nodes) or small.

@ScottPJones I’ll see. If I write a proper swap in C, then I am sure make a PR; I just need to think a little about possible subtleties: Is something shared? What do I do if the arrays are not inline (jl_gc_alloc) but malloced (jl_gc_managed_malloc, jl_gc_track_malloced_array)? Are there other possible gotchas? What if one of the arrays is a wrap around externally allocated data? Do I really understand all the possible flags?

And then see whether it is worth the effort (alternatives are: copy the data if small or swap is used rarely, wrap the array, redesign until swap is not needed anymore).


#20

One technique I’ve had to use, because Julia’s Vectors are very expensive memory wise, is to have a constant empty_vector (of whatever type I need), and then use that instead of Vector{T}() to initialize.
Then, when I push, I check first if v === empty_vector, and if so, only at that point create a new vector with the element being added, i.e. [element].

One nice thing about this technique is that it’s very easy to initialize a Vector of Vectors with fill!.