Implementing own `size()` function for a `struct` and its performance

#1

I am trying to write a struct for one of my types that (among other fields in the non-MWE code) keeps an array as a value. I would like to have a size function that works on the encapsulated array. The MWE is as follows:

import Base: size
struct MyType
    value::Array{Float64,N} where N
    name::String
end
getValue(a::MyType) = a.value
size(a::MyType,k...) = size(getValue(a),k...)

function f(a::MyType,b::MyType)::MyType
  if size(a)==size(b)
    c = MyType( a.value .+ b.value, string(a.name,"-",b.name) )
  end
  return c
end

a = MyType(ones(3,3), "A")
b = MyType(zeros(3,3), "B")

f(a,b)
@code_warntype f(a,b)

So the function f creates a new MyType from two existing ones and needs size as a check. However, the return type of the size method that I implemented is inferred as Any. The complete code_warntype output is

Body::MyType
10 1 ─ %1  = invoke Main.size(_2::MyType)::Any                                                                                                        │
   │   %2  = invoke Main.size(_3::MyType)::Any                                                                                                        │
   │   %3  = (%1 == %2)::Any                                                                                                                          │
   └──       goto #3 if not %3                                                                                                                        │
11 2 ─ %5  = Base.Broadcast.materialize::Core.Compiler.Const(Base.Broadcast.materialize, false)                                                       │
   │   %6  = (Base.getfield)(a, :value)::Array{Float64,N} where N                                                                                     │╻ getproperty
   │   %7  = (Base.getfield)(b, :value)::Array{Float64,N} where N                                                                                     ││
   │   %8  = Main.:+::Core.Compiler.Const(+, false)                                                                                                   │
   │   %9  = (Base.Broadcast.combine_styles)(%6, %7)::Any                                                                                             │╻ broadcasted
   │   %10 = (Base.Broadcast.broadcasted)(%9, %8, %6, %7)::Any                                                                                        ││
   │   %11 = (%5)(%10)::Any                                                                                                                           │
   │   %12 = (Base.getfield)(a, :name)::String                                                                                                        │╻ getproperty
   │   %13 = (Base.getfield)(b, :name)::String                                                                                                        ││
   │   %14 = invoke Main.string(%12::String, "-"::String, %13::Vararg{String,N} where N)::String                                                      │
   └── %15 = (Main.MyType)(%11, %14)::MyType                                                                                                          │
13 3 ─ %16 = φ (#2 => true, #1 => false)::Bool                                                                                                        │
   │   %17 = φ (#2 => %15, #1 => #undef)::Core.Compiler.MaybeUndef(MyType)                                                                            │
   │         $(Expr(:throw_undef_if_not, :c, :(%16)))                                                                                                 │
   └──       return %17

(and I am still trying to understand all of this output), but the first two lines (and, following from them, the third) worry me a little: how can I get my own size function to have the same return type as the original size?

And a short side question: is it (performance-wise) the best way to create a new MyType for the result c?

Any further ideas to optimize a code like this are of course also welcome.


#2

Some ideas:

  1. Change Array{Float64,N} where N to a type parameter as such:
struct MyType{Tvalue <: Array{Float64}}
    value::Tvalue
    name::String
end
  2. There is no need for ::MyType for the function output. Julia should infer this automatically.
  3. If the condition size(a)==size(b) is false, c will be undefined. You need to define a default c, or explicitly throw an error if the condition is false.
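
Putting the three points together, a minimal sketch could look like the following. The choice of DimensionMismatch and the wording of its message are assumptions for illustration; any explicit error would do.

```julia
import Base: size

# The concrete array type (including its dimensionality) is a type parameter.
struct MyType{Tvalue <: Array{Float64}}
    value::Tvalue
    name::String
end

getValue(a::MyType) = a.value
size(a::MyType, k...) = size(getValue(a), k...)

# No return type annotation; fail explicitly instead of leaving c undefined.
function f(a::MyType, b::MyType)
    size(a) == size(b) ||
        throw(DimensionMismatch("sizes $(size(a)) and $(size(b)) differ"))
    return MyType(a.value .+ b.value, string(a.name, "-", b.name))
end

a = MyType(ones(3, 3), "A")
b = MyType(zeros(3, 3), "B")
c = f(a, b)
```

Running @code_warntype f(a,b) on this version should no longer show Any for the size calls, since typeof(a) now carries the full array type.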

#3

Neat, that already fixes all my code warnings this issue was about. I still have to learn when typed structs are the preferred way to go.

Concerning 2.: yes, I hoped so, and yes, Julia does infer it; I only added the annotation while trying to track down the problem myself.
Concerning 3.: yes, of course, I meant to throw an error (actually the non-MWE code does); I just missed that point when quickly making up my MWE :slight_smile:

Thanks!


#4

Use a struct if you want to dispatch on the struct types to do something special for each type. Performance-wise, it shouldn’t make a difference; built-in Julia types are more than enough for simple programming tasks. Structs can also closely map conceptual abstractions, e.g. Cube, Point, Person, etc., if you are trying to build a complete framework of some sort.


#5

It should matter for performance in this case as the original field type wasn’t concrete. Julia can’t know the dimensionality of the contained array from the type alone. This is different for the parametric type as the type parameter specifies the concrete array type, including the dimensionality.
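
One way to check this directly is to ask whether the field type is concrete. The sketch below uses two hypothetical names, LooseType and ParamType, standing in for the original and the parametric definition.

```julia
# Original style: the dimensionality N is left open in the field type.
struct LooseType
    value::Array{Float64,N} where N
    name::String
end

# Parametric style: the full array type is part of the struct's type.
struct ParamType{T <: Array{Float64}}
    value::T
    name::String
end

loose = LooseType(ones(3, 3), "A")
param = ParamType(ones(3, 3), "A")

# Is the declared type of the value field concrete?
loose_is_concrete = isconcretetype(fieldtype(typeof(loose), :value))
param_is_concrete = isconcretetype(fieldtype(typeof(param), :value))
println(loose_is_concrete)  # false: Array{Float64,N} where N
println(param_is_concrete)  # true: Array{Float64,2}
```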


#6

FWIW, I was commenting on when to use structs for the output of functions as opposed to simple tuples for example. That’s how I understood the question anyways.


#7

I see, makes sense. I obviously took the question as “when should I use type parameters”.


#8

Thanks for the implementation, I will adopt that for my type(s).


#9

My question was mainly when to use typed types (typed structs) instead of just types (structs), but the idea that Julia cannot see the dimensionality of internal variables just from the type helps a lot. I am indeed working on a certain framework (hopefully the first parts might be finished after getting through all code warnings), and I am indeed mapping conceptual abstractions using structs.
Thanks for the explanation.


#10

FWIW, “typed structs” typically go by the name “parametric types/structs”. See for example the corresponding section on the Julia docs.


#12

While transferring that to my larger project, the warning (from @code_warntype) still remains; it’s related to getfield, but I actually can’t narrow it down further.

Some lines are just like

  6 ┄─ %37  = (Core.getfield)(%1, :contents)::Any

where none of my own types even has a contents field. Any idea where that comes from? For me it’s really hard to read the code_warntype output. Since Traceur.jl seems to fail (it cannot find some function), are there other tools that help with avoiding too loosely typed lines?

I am a little lost with these optimizations, since I can’t narrow down the code lines where they appear.
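
For context, a getfield on :contents with type Any is how a Core.Box shows up in this output. One common source, assumed here purely for illustration and not confirmed as the cause in my project, is a local variable that is assigned more than once and also captured by a closure. The function names boxed and unboxed are made up for this sketch.

```julia
# x is assigned twice and captured by the closure g, so it is typically
# stored in a Core.Box and read back via getfield(box, :contents).
function boxed()
    x = 1
    x = x + 1.0
    g = () -> x
    return g()
end

# Same computation, but the captured variable y is assigned exactly once.
function unboxed()
    x = 1
    y = x + 1.0
    g = () -> y
    return g()
end

# Compare what inference reports for the two variants.
boxed_type = Base.return_types(boxed, ())[1]
unboxed_type = Base.return_types(unboxed, ())[1]
println(boxed_type)
println(unboxed_type)
```

Both functions return the same value; only the inferred return type differs, which is the kind of difference code_warntype highlights.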


#13

A self-contained MWE of the most recent iteration would be very helpful.


#14

Yes, I am aware of that and I am currently not able to extract that. Maybe even getting to the MWE would resolve my problem. So in short: All my MWEs I extract work, the large project I am working on does not and gives me Any code_warntypes. Sorry. As soon as I get to an MWE that reproduces my problem, I’ll come back.

I thought that maybe the generic result posted could already indicate something.

Edit: Compared to the introductory example, my project code should do the same, but it introduces a Core.Box for the variable a (and also b). The type is a little more complicated, as shown in the following (where I also try to encapsulate my example in a runExample function).

import Base: size,+
abstract type MyAbstractType end
getValue(x::T) where T <: MyAbstractType = x.value
struct MySimpleType <: MyAbstractType
  value::Float64
end
+(a::T,b::T) where {T <: MyAbstractType} = T(getValue(a)+getValue(b))
struct MyType{TA <: AbstractArray{T,N} where {T <: MyAbstractType, N}} <: MyAbstractType
    value::TA
    name::String
end
size(x::MyType, k...) = size(getValue(x),k...)
function +(a::MyType,b::MyType)::MyType
  if size(a)==size(b)
    d = length(size(a))
    c = MyType( getValue(a) + getValue(b), string(a.name,"-($(d))-",b.name) )
  else
    throw( ErrorException("Both Variables have to be of same length") )
  end
  return c
end

function runExample()
  a1 = MySimpleType.(ones(3,3))
  a2 = MySimpleType.(ones(3,3))
  a = MyType( [a1; a2] , "A")
  b = MyType( MySimpleType.(zeros(6,3)), "B")
  return a+b
end
runExample()
code_warntype(runExample, (); verbose_linetable=true )

However, the problem of a Core.Box does not occur for this example either, and I can’t get my MWE any closer to the original code. The first Any I then get is when computing d within +, which is the first point at which the original code tries to access the contents of a.

Again: this MWE does not reproduce the problem, but I cannot find any difference between this MWE and my original code, which is not so easy to extract from the package but also involves the recursive type this example has (and yes, sorry, I am just running out of ideas what to test).


#15

My final remarks, since I am really running out of ideas: the only difference between the last code and my complete code is that the MySimpleType does not hold a single value but a vector (Vector{Float64}). That does not change the results in the above example; it still has stable types.

However, also only in my larger example (though not in the MWE), adding a constructor to (hence within) MyType like

    MyType{TA,TB}(v,d) where {TA <: AbstractArray{T,N} where {T <: MyAbstractType, N}, TB <: Tuple{Vararg{Int}} }= new(v,d)

yields that code_warntype does not produce any Core.Boxes anymore and no Anys; however, the code itself of course does not work anymore, since the constructor now requires the types to be specified on every call (which is tedious to write). On the other hand, adding the easier (outer) constructor (so outside of MyType) like

MyType(v::TA,d::TB) where {TA <: AbstractArray{T,N} where {T <: MyAbstractType, N}, TB <: Tuple{Vararg{Int}} }= MyType{TA,TB}(v,d)

solves this problem and the constructors as in the MWE work again. In the MWE this also compiles fine. However in my larger code (with vector instead of float, though that also does not break the MWE)

  • introducing the inner constructor solves all code warnings but the code does not run
  • introducing the outer constructor additionally makes the code run but reintroduces the code warnings.
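
To make the two constructors concrete, here is a self-contained sketch of the combination described above. The field name dims and the shorthand bound AbstractArray{<:MyAbstractType} are assumptions for this sketch; my real type has more fields.

```julia
abstract type MyAbstractType end

struct MySimpleType <: MyAbstractType
    value::Float64
end

struct MyType{TA <: AbstractArray{<:MyAbstractType}, TB <: Tuple{Vararg{Int}}} <: MyAbstractType
    value::TA
    dims::TB
    # Inner constructor: both type parameters must be given explicitly.
    function MyType{TA,TB}(v, d) where {TA <: AbstractArray{<:MyAbstractType}, TB <: Tuple{Vararg{Int}}}
        return new{TA,TB}(v, d)
    end
end

# Outer constructor: takes the parameters from the argument types.
function MyType(v::TA, d::TB) where {TA <: AbstractArray{<:MyAbstractType}, TB <: Tuple{Vararg{Int}}}
    return MyType{TA,TB}(v, d)
end

x = MyType(MySimpleType.(ones(2, 2)), (2, 2))
println(typeof(x))
```

In this small example the outer constructor yields a fully concrete type, which matches my observation that the MWE itself stays type-stable.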

The definition of a seems to be the culprit; it just appears for the size (variable d) for the first time and is stored in a Core.Box, again only in my larger example, though the example above is the same with different names copied together into one example. The output is even stranger. Adapted (as said, the MWE does not reproduce it, and I really don’t know why anymore):

    %23  = %new(MyType{Array{MySimpleType,1}}, %22)::MyType{Array{MySimpleType,1}}
    │           (Core.setfield!)(%1, :contents, %23)
│           (Core.setfield!)(%2, :contents, 10.0)

So it seems to infer the type correctly in its compiled statement %23 but then still decides to put it in a Core.Box. Even stranger, the following box that’s filled is for a line that just reads α = 10. (that’s why I added that, too). I don’t understand any of those lines. Of course this causes many other problems when reusing that line. %22 is the result of the vcat from within runExample and returns the proper array of MySimpleTypes.


#16

Okay, I found the culprit: besides size, I had another function that was not type-stable, and somehow that induced so many boxes and errors that it was hard to find. I am left with another problem, but that is rather for another thread, because it’s not related to the problem discussed here. Thanks for all your help.