Optimization question

nilshg · October 1, 2024, 6:30pm

T is not the same for the vectors and matrices. I thought you wanted distributions in the matrices?

lancejnelson · October 1, 2024, 6:31pm

I do. I’m just trying to understand why some fields are parameterized and others are not. I understand that the big reason for parameterizing is to create a family of types, but parameterizing is also done for performance.

nilshg · October 1, 2024, 6:39pm

My point is that

struct Something{T}
    x::T
    y::T
end

means that x and y have to be the same type, so you’d need two type parameters here if you wanted, say, x to be a float and y to be a distribution.
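
A minimal sketch of the two-parameter version. To keep it runnable without any packages, it uses a made-up stand-in `ToyGamma` instead of a real distribution type; both `ToyGamma` and `Something2` are hypothetical names chosen for illustration only.

```julia
# Hypothetical stand-in for a distribution type (not from Distributions.jl).
struct ToyGamma{T<:Real}
    shape::T
    scale::T
end

# Two type parameters, so x and y are allowed to have different types.
struct Something2{X,Y}
    x::X
    y::Y
end

s = Something2(1.5, ToyGamma(10.0, 0.1))
typeof(s)  # Something2{Float64, ToyGamma{Float64}}
```

Each parameter is filled in from the corresponding argument, so the resulting type is still fully concrete.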

lancejnelson · October 1, 2024, 8:20pm

Sure. But one point that is made about parameterization is that it prevents boxing and thus reduces allocations and speeds things up. So is it best to parameterize every field in a type or just some of them?


struct myStruct{T}
    a::Float64
    b::Vector{T}
end

vs.

struct myStruct{F,T}
    a::F
    b::Vector{T}
end

nilshg · October 1, 2024, 8:25pm

It’s not parametrizing though that allows the compiler to optimise, but rather the specification of a concrete type itself.

In your example, both structs will have the same performance in general, it’s just that the second is more flexible.
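
One way to check this is to look at the field types of an instance. The sketch below uses hypothetical names `FixedA` and `FlexibleA` (so both definitions can live in one session) and a plain `Vector{Float64}` instead of a vector of distributions.

```julia
struct FixedA{T}
    a::Float64      # concrete type written directly in the definition
    b::Vector{T}
end

struct FlexibleA{F,T}
    a::F            # concrete once F is filled in
    b::Vector{T}
end

p1 = FixedA(5.5, [1.0, 2.0])
p2 = FlexibleA(5.5, [1.0, 2.0])

fieldtypes(typeof(p1))  # (Float64, Vector{Float64})
fieldtypes(typeof(p2))  # (Float64, Vector{Float64})

# Every field of both instances has a concrete type.
all(isconcretetype, fieldtypes(typeof(p1)))  # true
all(isconcretetype, fieldtypes(typeof(p2)))  # true

# The extra flexibility: here the field a is an Int instead of a Float64.
p3 = FlexibleA(1, [1.0, 2.0])
```

For these inputs the two instances carry the same field types, which is why no performance difference is expected.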

lancejnelson · October 1, 2024, 8:31pm

thinking…

So pt below will be concrete even though its field b is not concrete?


pt = myStruct(5.5,fill(Gamma(10,0.1),5))

lancejnelson · October 1, 2024, 8:50pm

As a simple example:

struct myType
    x::Float64
end
pt = myType(3.2)
typeof(pt) # output: myType

struct myType{T}
    x::T
end
pt = myType(3.2)
typeof(pt) # output: myType{Float64}

Isn’t the second type more efficient "because the second version specifies the type of x from the type of the wrapper object" (from the manual)?

I guess my question boils down to: How do I know which fields in a composite type I should parameterize, strictly from a performance point of view?

bertschi · October 1, 2024, 9:16pm

Basically, there are concrete types, i.e., there exist actual values of that type, and abstract types of which no values exist. As an example, consider numbers, e.g., Int64 and Float64 are concrete types as can be witnessed by the values 1 and 1.0 with typeof(1) == Int64 and typeof(1.0) == Float64. There is no value of type Real though, i.e., there is no x such that typeof(x) == Real.

Containers, such as Vectors or structs are a bit more complicated, as they can hold concrete as well as abstract types as elements. First, some examples with concrete types:

julia> using Distributions

julia> v = [Gamma(1.0, 1.1), Gamma(1.2, 1.3)];

julia> v |> typeof
Vector{Gamma{Float64}} (alias for Array{Gamma{Float64}, 1})

julia> v |> typeof |> isconcretetype
true  # as witnessed by the value v

julia> v |> eltype
Gamma{Float64}

julia> v |> eltype |> isconcretetype
true

# Gamma is a parameterized struct, i.e., it has an eltype as well
julia> v |> eltype |> eltype
Float64

julia> v |> eltype |> eltype |> isconcretetype
true

Thus, the types of all containers – the vector and the Gamma struct – as well as of their elements are concrete. In this case, the compiler can optimize as the type of v[i].α can be inferred independently of the index i.

Now, consider a vector holding several different distributions:

julia> w = [Gamma(1.0, 2.0), Normal(1.0, 2.0), TDist(3.2)];

julia> w |> typeof
Vector{Distribution{Univariate, Continuous}}

julia> w |> typeof |> isconcretetype
true  # Sure the value w is our witness

julia> w |> eltype
Distribution{Univariate, Continuous}

julia> w |> eltype |> isconcretetype
false

# Yet, for any specific element
julia> w[2] |> typeof
Normal{Float64}

julia> w[2] |> typeof |> isconcretetype
true  # The element value does indeed exist

How can this be? Each element of the vector is a value with a concrete type, yet the element type of the vector is abstract.

This means the compiler has less information. In particular, the type of w[2] cannot be inferred at compile time, so using it requires a runtime dispatch.
The vector cannot store its elements inline, as they might have different sizes and memory layouts. Instead, it holds pointers to values whose types are compatible with the eltype, i.e., the actual type of a value is not known until runtime, when the value is retrieved.

In Rust, v would be something like a Vec<Gamma<f64>> whereas w would be a Vec<Box<dyn Distribution>>, explicitly stating the different memory layout as well as the required runtime dispatch.
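
The same distinction can be reproduced with Base alone. In this sketch the abstract type `Real` stands in for `Distribution{Univariate, Continuous}`; that substitution is an assumption made only so that the example runs without Distributions.jl.

```julia
v = [1.0, 2.0, 3.0]       # Vector{Float64}: concrete element type
w = Real[1.0, 2, 3//1]    # Vector{Real}: abstract element type

isconcretetype(eltype(v))  # true
isconcretetype(eltype(w))  # false

# Float64 is a plain-bits type, so its values can be stored inline.
isbitstype(eltype(v))      # true
isbitstype(eltype(w))      # false

# Each element of w still has its own concrete type,
# but that type is only known once the element is read.
map(typeof, w)
```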

bertschi · October 1, 2024, 9:22pm

For this example there is no difference, as Float64 is a concrete type; accordingly, the first definition (myType) and the second (myType{Float64}) are effectively the same.
The type of a field should be parameterized whenever it would be abstract, e.g.,

struct myTypeA{T<:Real}
    x::T
end

instead of

struct myTypeB
    x::Real
end

Both allow you to create a struct holding any real, but the first allows the compiler to distinguish the (concrete) types myTypeA{Int64}, myTypeA{Float16}, …
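
A small sketch of how the difference shows up when the field types are queried. The names `ParamReal` and `LooseReal` are hypothetical and simply mirror the two definitions above.

```julia
struct ParamReal{T<:Real}
    x::T        # field type is fixed by the type parameter
end

struct LooseReal
    x::Real     # field type is abstract
end

typeof(ParamReal(1))            # ParamReal{Int64} on a 64-bit machine
typeof(ParamReal(Float16(1)))   # ParamReal{Float16}

fieldtype(typeof(ParamReal(Float16(1))), :x)   # Float16
fieldtype(LooseReal, :x)                       # Real

isconcretetype(fieldtype(LooseReal, :x))       # false
```

With `LooseReal`, the type of the wrapper says nothing about what `x` actually holds, so that information is only available at runtime.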

lancejnelson · October 1, 2024, 9:33pm

Great explanation. Thank you. So using a type with a field holding your second example would be less efficient than the case where all of the distributions are of the same type. And there’s no way around this inefficiency unless you just redesign the whole code structure?

lancejnelson · October 1, 2024, 9:34pm

The manual gives the distinct impression that if typeof() produces just the type’s name, with no parameter information (myType vs. myType{Float64}), it is a less efficient data structure because it can’t infer the types of its fields.

DNF · October 1, 2024, 10:14pm

No, the compiler can indeed know the type of the field, because that type is right there in the definition, which the compiler can see (the compiler reads your code, you know).

The parameter in myType{Float64} is necessary to tell what T actually is, but that isn’t needed in the other type definition, since it’s hard-coded.
