Field types

I know that the following:

struct test{T<:Real}
    a::T
    b::T
end

is better than:

struct test
    a::Real
    b::Real
end

but why can’t I fully specify the field types like this:

struct test{T::Float64}
    a::T
    b::T
end

Wouldn’t this allow the compiler to optimize better?

I think you’re confusing optimization with how much typing you have to do. With your last example, you’ve effectively written:

struct test
    a::Float64
    b::Float64
end

That’s more typing, but it’s identical to what I think you want your example to mean, so both would have identical efficiency.

Well now I’m confused about why my first example allows for better optimization over the second.

In the first example, a and b have to be of the same concrete type, whereas in the second example, a and b can be different concrete types (as long as they’re both subtypes of Real).
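A small sketch of that difference, using the hypothetical names PairSame and PairLoose (not from the thread) and relying only on the default constructors Julia generates:

```julia
# Parametric version: both fields must share one concrete type T
struct PairSame{T<:Real}
    a::T
    b::T
end

# Abstract-field version: each field may hold any subtype of Real
struct PairLoose
    a::Real
    b::Real
end

p = PairSame(1.0, 2.0)   # T is inferred as Float64
q = PairLoose(1, 2.0)    # allowed: Int and Float64 are both subtypes of Real

# With mixed argument types there is no matching default constructor for PairSame
mixed_fails = try
    PairSame(1, 2.0)
    false
catch err
    err isa MethodError
end

println(typeof(p))       # PairSame{Float64}
println(mixed_fails)     # true
```

If mixed arguments should be accepted, one option would be an extra outer constructor that promotes them first, but that is a design choice beyond what the thread discusses.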

Your first example creates an infinite family of types – one for each value of T. For example, when T === Float64, your first example creates the equivalent of:

struct test
    a::Float64
    b::Float64
end

Your second example instead is exactly:

struct test
    a::Real
    b::Real
end

So one type has concretely typed fields (of type Float64) and the other has non-concretely typed fields (of type Real).
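One way to check this yourself is to ask Julia for the declared field types. The struct names below (ConcreteFields, AbstractFields) are made up for illustration; fieldtype and isconcretetype are standard Base functions:

```julia
struct ConcreteFields{T<:Real}
    a::T
    b::T
end

struct AbstractFields
    a::Real
    b::Real
end

# Field type once T is bound to Float64
ft_param = fieldtype(ConcreteFields{Float64}, :a)
# Field type of the non-parametric struct
ft_abs = fieldtype(AbstractFields, :a)

println(ft_param, " concrete? ", isconcretetype(ft_param))  # Float64 concrete? true
println(ft_abs, " concrete? ", isconcretetype(ft_abs))      # Real concrete? false
```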

So the compiler doesn’t need to know what the exact concrete type is a priori, just which fields have the same concrete types?

No, the fact that there are two fields in this struct is a total distraction. The same problem is present in

struct test
    a::Real
end

which is different from

struct test{T <: Real}
    a::T
end 

Depends what “a priori” means since Julia isn’t statically typed. In the sequence of lines of code below, the exact types are known when an object is created:

julia> struct test{T <: Real}
           a::T
           b::T
       end

julia> test(1.0, 2.0)
test{Float64}(1.0, 2.0)

I strongly recommend reading the Types chapter of the Julia manual (Types · The Julia Language).

I have. I’ll re-read it though.

So should the following struct:

struct NS
    nParticles::Int64
    setSize::Int64
    l::Int64
    boxSize::Float64
    energies::Array{Float64,1}
    activeSet::Array{SVector{2,Float64},2}
end

be rewritten as follows?

struct NS{T <: Real, S <: Real}
    nParticles::T
    setSize::T
    l::T
    boxSize::S
    energies::Array{S,1}
    activeSet::Array{SVector{2,S},2}
end

There is no efficiency gain from that change; you’ve just increased the number of valid types that can be bound to T and S. The first example is also somewhat broken: you have T and S parameters, but they’re never used.

So basically what’s going on here is that when you have

struct test
    a::Real
end

then any time Julia wants to look inside a test object, it has no idea what it’s going to get out. All it knows is that the data will be some subtype of Real, but subtypes of Real could have any memory layout imaginable and any set of methods defined for them. So there are essentially no optimizations that can be performed until Julia actually unpacks the struct at run time and looks at the concrete type of the value.

On the other hand, when you write

struct Test{T <: Real}
    a::T
end

then a Test(1) has a different type from a Test(1.0) (that is, Test{Int} vs Test{Float64}). That information can be used for concrete optimizations, because the memory layout of each of those types is fixed forever and the methods on Int and Float64 are fixed in a given world age.
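Here is a rough sketch of what that looks like at the REPL level. The names Boxed and Typed are placeholders I picked so the two definitions can coexist in one session; isbitstype is a Base function that reports whether a type is stored as plain inline bits:

```julia
struct Boxed              # abstract field, like `a::Real` above
    a::Real
end

struct Typed{T<:Real}     # parametric field
    a::T
end

# The parametric struct records the field's type in its own type
println(typeof(Typed(1)))      # Typed{Int64} on a 64-bit machine
println(typeof(Typed(1.0)))    # Typed{Float64}

# The abstract-field struct has a single type whatever it holds
same_type = typeof(Boxed(1)) === typeof(Boxed(1.0))
println(same_type)             # true

# Only the concretely typed variant is plain inline data
println(isbitstype(Typed{Float64}))  # true
println(isbitstype(Boxed))           # false
```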

Writing

struct Test{T <: Real}
    a::T
end

can basically just be thought of as convenient syntax for writing

struct TestInt
    a::Int
end
struct TestFloat64
    a::Float64
end
struct TestReal
    a::Real
end
...

for every single possible subtype of Real. The parameters basically just let us easily define a whole group of implicitly defined types. Test{Int} and TestInt have all the same properties; Test{Int} is just easier to work with.
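To make the "same properties" claim a bit more tangible, here is a sketch comparing a hand-written type with the matching member of a parametric family. FamInt and Fam are hypothetical names, and the checks only cover layout-level properties (field type and size), not every property one could imagine:

```julia
# Hand-written, one type per element type
struct FamInt
    a::Int
end

# Parametric family covering all T <: Real
struct Fam{T<:Real}
    a::T
end

same_field = fieldtype(FamInt, :a) === fieldtype(Fam{Int}, :a)
same_size  = sizeof(FamInt) == sizeof(Fam{Int})

println(same_field)   # true
println(same_size)    # true

# Fam{Real} is also a valid member of the family; it plays the role of TestReal
println(fieldtype(Fam{Real}, :a))   # Real
```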

So what is the proper way to build this struct?

That depends entirely on what you want it to hold. Do you only want to store Float64 data? Then your first example is fine (once you remove the unnecessary type parameters). Do you want to be able to efficiently store any types T <: Real and S <: Real? Then use the second.

But in the second case the types of the fields can be inferred from the type of the wrapper object (per the performance tips in the manual).

So even

struct test{T}
    a::T
end

would be better than:

struct test
    a::Float64
end

right?

No. There are two orthogonal concepts here:

  1. Using a parametric type instead of manually creating multiple similar types.
  2. Minimizing the use of abstract types for efficiency.

Let’s consider three code options:

Option A: Manually write out multiple types

struct TestInt64
    a::Int64
end

struct TestFloat64
    a::Float64
end

Option B: Use a parametric type

struct Test{T <: Real}
    a::T
end

Option C: Use a single type with abstract fields

struct Test
    a::Real
end

There is no difference in efficiency between A and B – the question is whether you create a lot of redundant types or a single parametric type.

There is an efficiency improvement between B and C – for any given type under Real, B creates a new “customized” struct type that has a concrete field, whereas C reuses the same inefficient struct type every time.
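A quick way to see the A/B/C distinction is to ask what the compiler infers for a field access. This is only a sketch: BoxA, BoxB, BoxC and get_a are names invented here, and Base.return_types is an unexported introspection helper (interactively, @code_warntype shows the same information in more detail):

```julia
# Option A: hand-written concrete type
struct BoxA
    a::Float64
end

# Option B: parametric type
struct BoxB{T<:Real}
    a::T
end

# Option C: single type with an abstract field
struct BoxC
    a::Real
end

get_a(x) = x.a

# Inferred return type of get_a for each argument type
ra = Base.return_types(get_a, (BoxA,))[1]
rb = Base.return_types(get_a, (BoxB{Float64},))[1]
rc = Base.return_types(get_a, (BoxC,))[1]

println((ra, rb, rc))   # (Float64, Float64, Real)
```

For A and B the compiler knows it will get a Float64 and can specialize the code that follows. For C it only knows "some Real", so anything done with the result has to be decided at run time.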

It’s ‘better’ in that it can store any type. But if you only want to store Float64, then it’s exactly the same.

That is, if you have

struct Test1{T}
    a::T
end 

struct Test2
    a::Float64
end

then Test1(1.0) is basically the exact same thing as Test2(1.0).
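One related caveat, sketched with the hypothetical names Wrap1 and Wrap2: the equivalence holds for the fully parameterized type Wrap1{Float64}. The bare name Wrap1, without a parameter, is not a concrete type, so it matters which one you write when annotating containers or fields:

```julia
struct Wrap1{T}
    a::T
end

struct Wrap2
    a::Float64
end

println(isconcretetype(Wrap1{Float64}))  # true
println(isconcretetype(Wrap2))           # true
println(isconcretetype(Wrap1))           # false: Wrap1 alone stands for the whole family

# A vector built from values gets the concrete element type automatically
v = [Wrap1(1.0), Wrap1(2.0)]
println(eltype(v))                       # Wrap1{Float64}
```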
