Memory and composition

If I have a struct that has another struct as a field, is the latter always copied as a whole? E.g.:

struct Large
    a::Int
    b::Int
    c::Int
end

struct UsesLarge
    l::Large
    d::Int
end

l = Large(1,2,3)
sizeof(l)                       # 24

u1 = UsesLarge(l, 4)
sizeof(u1)                      # 32

I am asking because, for a problem I am working on, several small objects y only make sense in the context of a larger one, X. Currently the calling convention of my functions is dostuff(X, y), but it would be very natural to compose the X into the ys. However, I am worried about memory, because there are many ys and X is large.

I don’t know how to reason about size in this case. Can I keep my objects immutable (struct) and yet share memory for the Xs? Is there any way to ensure this? I found this in the devdocs but I don’t know enough to understand it.


Make your X a mutable struct; that way it is passed by reference and not by value.

Your ys will still be structs, that is:

  1. copied by value
  2. efficiently passed into a function
  3. free of several indirections when accessing fields, due to compiler optimizations.

type Large
    a::Int
    b::Int
    c::Int
end

immutable UsesLarge
    l::Large
    d::Int
end

L = Large(1,2,3)
y1 = UsesLarge(L,10)
y2 = UsesLarge(L,20)

y1.l.a = 100
y2.l.a == 100 #true

I am still on 0.5, waiting for some packages to update to 0.6, hence the type and immutable and not struct.

Thanks, but this is more of a workaround; I like to mean what I say when coding.

Also, it is unclear to me whether a struct of structs would always be unshared, or just sometimes, at the discretion of the compiler. I found
https://groups.google.com/forum/#!topic/julia-users/ZPSiiK4b1T0
and from way back

where, for aggregates, this is discussed as a feature. Maybe I am trying to accomplish something silly, but sometimes I want the opposite.

One way to achieve this is by telling UsesLarge that l should be a reference. You can do this by not typing it concretely. For example, just change the following but leave everything else the same:

struct Large{T}
    a::T
    b::T
    c::T
end

Now Large (as in l::Large) is not a concrete type anymore

julia> sizeof(u1) 
16

As I understand it, inlining depends on the field's type being a struct as well as being concretely typed. Both must be true.
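The effect of a concrete versus a non-concrete field type can be checked directly. The following is a minimal sketch, assuming Julia 1.x syntax (on 0.6 the closest equivalent of `isconcretetype` is `isleaftype`); the names `UsesAbstract` and `UsesConcrete` are made up for the illustration.

```julia
struct Large{T}
    a::T
    b::T
    c::T
end

struct UsesAbstract
    l::Large          # `Large` without a parameter is not concrete: stored as a pointer
    d::Int
end

struct UsesConcrete
    l::Large{Int}     # concrete, pointer-free immutable: stored inline
    d::Int
end

isconcretetype(Large)         # false
isconcretetype(Large{Int})    # true

# One pointer plus one Int
sizeof(UsesAbstract) == sizeof(Ptr{Cvoid}) + sizeof(Int)    # true
# Three Ints inline plus one more Int
sizeof(UsesConcrete) == 4 * sizeof(Int)                     # true
```

Note the trade-off: an abstractly typed field makes the struct smaller, but every access to `l` then goes through a pointer and the compiler can no longer infer its concrete type.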

EDIT: though I am not sure if this actually fixes the thing you are interested in (i.e. not copying all the time). I suspect all this does is avoid the symptom.

This is not a workaround; moreover, it is not something new or unique to Julia.

In C, a struct is passed by value, just as in Julia.
A pointer to a struct is like a mutable struct in Julia: the pointer itself is passed by value,
which means the struct it points to is “passed by reference”,
because two copies of the same pointer point to the same struct.

In C there is special notation for accessing fields, depending on whether it's a struct or a pointer to a struct. In Julia this is
hidden from you, along with the overhead of keeping track of references and cleaning up unused mutable structs
(structs that no pointer references and that are hence inaccessible).

In 0.6 lingo:
A struct of anything is passed by value, copies are made, it is “unshared”.
A mutable struct is passed by reference, a copy of only the pointer (address) is made, it is shared.

Not really. It is always semantically shared, just that it’s not mutable. It is not copied when passing around, only copied when assigned to external slots as a requirement of the ABI.
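One way to see why sharing versus copying is not observable for immutables is to compare identity with `===`. This is a small sketch in 0.6+/1.x syntax; `Point` and `Cell` are hypothetical names used only here.

```julia
struct Point              # immutable
    a::Int
    b::Int
end

mutable struct Cell       # mutable
    a::Int
end

p = Point(1, 2)
q = Point(1, 2)
p === q                   # true: immutables with identical contents are indistinguishable

c = Cell(1)
d = Cell(1)
c === d                   # false: every mutable object has its own identity

e = c                     # binds a second name to the same object, no copy
e.a = 5
c.a                       # 5
```

Because no program can tell two equal immutable values apart, the compiler is free to copy them or pass a pointer, whichever is cheaper.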

Why not use a Ref{T}?

struct UsesLarge
    l::Ref{Large}
    d::Int
end

l = Large(1,2,3)
u1 = UsesLarge(Ref{Large}(l), 4)
sizeof(u1)    #16

And add a constructor

function UsesLarge(l::Large, d::Int)
    UsesLarge(Ref{Large}(l), d)
end

Thanks. I have always thought of Ref as part of the FFI, but this looks like a workaround. However, I then have to extract the l from the RefValue.
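For reference, a sketch of how the Ref suggestion above might be used so that many small objects hold the same reference. It assumes the Ref is created once and passed to each constructor call; the accessor `large` is a hypothetical helper, not part of Base.

```julia
struct Large
    a::Int
    b::Int
    c::Int
end

struct UsesLarge
    l::Ref{Large}         # abstractly typed field: stored as a pointer
    d::Int
end

# Hypothetical accessor that hides the dereference
large(u::UsesLarge) = u.l[]

shared = Ref(Large(1, 2, 3))                  # create the reference once
ys = [UsesLarge(shared, d) for d in 1:3]      # every element holds the same Ref

ys[1].l === ys[2].l       # true: one Ref object, shared
large(ys[3]).c            # 3
```

A convenience constructor that wraps a bare `Large` in a fresh `Ref` on every call would allocate a separate box each time, so for sharing it seems safer to build the `Ref` explicitly as above.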

Is there a way to compose yet share structure somehow, for one (immutable) struct containing another in v0.6? Or should I use Ref as suggested by @jtravs?

I see your point, but that is just an optimization you can guarantee because you demand immutability.

If, as in C, you could mutate structs and moreover have more than one thread running, you would have to copy
whenever passing it around.

Anyway, objects containing large data, like vectors and arrays, are usually mutable structs.

EDIT:
@yuyichao
Why, in the following example using unsafe code, can I mutate an immutable in its scope, but not when
I try to get a pointer to it inside a function?

immutable Large
    a::Int64
    b::Int64
    c::Int64
end

L = Large(1,2,3)
ptr_x = Ptr{Int64}(pointer_from_objref(L))
unsafe_store!(ptr_x,100)
L.a   # prints 100: I mutated an immutable

f(x) = begin
    ptr_x = Ptr{Int64}(pointer_from_objref(x))
    unsafe_store!(ptr_x,-100)
end
f(L)
L.a   # still 100, although according to what is said here it should have been -100

You are violating the semantics of the language, so anything can happen.

That's a bit harsh :) I am trying to work with the language using the language.
Anyway, if we remove the meddling unsafe_store! we still see the pointers are different.

How can you prove your statement:

Besides the usual inlining optimizations LLVM does?

… It’s a bit hard to find a way that’s not… Maybe I’ll just go with the traditional saying: it can “make demons fly out of your nose”

That pointer is invalid on construction and is insignificant.

We don’t do any of them. We do pass it as pointer between functions when it’s too big though.


While demons are interesting, replies to the original question would still be appreciated.

Am I doing something un-idiomatic, composing a value into multiple instances? Is there anything better than Ref for this? Even a definitive statement that there isn’t would help.

Am I seeing something like that in action in this MWE?

using StaticArrays              # just to gobble memory

struct Large{N, T}
    X::StaticVector{N, T}
end

struct Small{N, T}
    y::Float64
    X::Large{N, T}
end

X = Large(@SVector rand(1000))  # abusing StaticArrays
sizeof(X)                       # 8
sizeof(X.X)                     # 8000
ys = [Small(randn(), X) for _ in 1:500]
sizeof(ys)                      # 500*8 = 4000, where is X?

Where can I learn more about this?
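As a side note on measuring this: `sizeof` on an array reports only the bytes of the array's own element storage, while `Base.summarysize` also counts objects reachable from it. The sketch below avoids StaticArrays and uses hypothetical types `Payload` and `Holder`; no exact byte counts are given for the array because they depend on the Julia version and platform.

```julia
struct Payload
    data::NTuple{100, Float64}    # 800 bytes of inline data
end

struct Holder
    y::Float64
    X::Any                        # deliberately untyped, so it is held by reference
end

X = Payload(ntuple(i -> Float64(i), 100))
ys = [Holder(randn(), X) for _ in 1:500]

sizeof(X)                 # 800
sizeof(ys)                # bytes of the array's element storage only
Base.summarysize(ys)      # also includes what the elements refer to
```

Comparing the two numbers gives a rough idea of how much memory sits behind the references, including whether the payload appears to be stored once or many times.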

No. It’s not something you can observe at this level. The object layout is well defined. Pointer-free leaf immutable types are inlined; anything else isn’t. You are just observing a non-leaf type.
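Whether a type is pointer-free can be queried. A minimal sketch, assuming Julia 1.x, where `isbitstype(T)` plays the role of 0.6's `isbits(T)`; `WithVector` is a made-up type for contrast.

```julia
struct Large
    a::Int
    b::Int
    c::Int
end

struct WithVector
    v::Vector{Int}        # a reference to a heap-allocated array
    d::Int
end

isbitstype(Large)                  # true: immutable, concrete, pointer-free
isbitstype(WithVector)             # false: contains a reference
isbitstype(Tuple{Large, Int})      # true: built only from pointer-free parts

# Pointer-free elements are stored inline in an array
xs = [Large(i, i, i) for i in 1:4]
sizeof(xs) == 4 * sizeof(Large)    # true
```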
