Rules of thumb for choosing between storing information as struct fields or static parameters


#1

TL;DR: When is it interesting to “store” field values as static parameters instead of struct fields?
as in

struct test{fieldA #= This is a Tuple{Int,Int}=#}
    fieldB::Vector{Int}
end

vs

struct test
    fieldA::Tuple{Int,Int}
    fieldB::Vector{Int}
end

The background for this question: I’ve coded (am coding) a pseudospectral 3D fluid flow simulator where I’m running direct numerical simulations of turbulent flow that can take hours/days, so I really don’t care about precompilation time. In my code I’ve gone crazy with static parameters. I’m using them to store viscosity, wavenumber vectors (using tuples or static arrays), target CFL number, domain size… and almost everything else that doesn’t change during the simulation. My idea is to move as much information as possible to compile time.
But I’ve kind of done this blindly, exploring the language, not really knowing when something like that would be useful.
I guess knowing for-loop ranges at compile time can lead to improvements, so that’s one rule of thumb: store for-loop ranges as static parameters. When else is it truly useful?
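
To make the loop-range idea concrete, here is a minimal sketch of the two layouts. The names `GridStatic`, `GridField` and `total` are hypothetical and not from my simulator; whether the static version is actually faster is exactly the open question.

```julia
# Hypothetical types, for illustration only.
struct GridStatic{N}           # the loop bound N lives in the type
    data::Vector{Float64}
end

struct GridField               # the loop bound n is an ordinary runtime field
    n::Int
    data::Vector{Float64}
end

# N is a compile-time constant inside the method specialized for GridStatic{N}.
function total(g::GridStatic{N}) where {N}
    s = 0.0
    for i in 1:N
        s += g.data[i]
    end
    return s
end

# Here the bound is read from the struct at runtime.
function total(g::GridField)
    s = 0.0
    for i in 1:g.n
        s += g.data[i]
    end
    return s
end

data = collect(1.0:8.0)
a = GridStatic{8}(data)        # a different N gives a different concrete type
b = GridField(8, data)
println(total(a) == total(b))  # both versions compute the same sum
```

Note that `GridStatic{8}` and `GridStatic{16}` are distinct types, so every new grid size triggers a fresh compilation of `total`.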


#2

Usually never. The goal here is to get as much information as possible at compile time. However, when you do this, every single test is a different type (since the concrete types are the parameterized forms). This means that any array of test will have an abstract element type and code working on it will not be type-stable, which actually reduces the amount of information available at compile time. So in general this kind of over-typing will backfire unless you have a good reason to do it.
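
A small sketch of the array problem, using hypothetical names `TestParam` and `TestField` that mirror the two definitions in the question:

```julia
# Hypothetical versions of the two structs from the question.
struct TestParam{A}            # fieldA is stored as a type parameter
    fieldB::Vector{Int}
end

struct TestField               # fieldA is stored as an ordinary field
    fieldA::Tuple{Int,Int}
    fieldB::Vector{Int}
end

# Two values with different parameters have two different concrete types.
xs = [TestParam{(1, 2)}([1]), TestParam{(3, 4)}([2])]
ys = [TestField((1, 2), [1]), TestField((3, 4), [2])]

println(eltype(xs))                   # abstract: the parameter is not fixed
println(isconcretetype(eltype(xs)))   # false
println(isconcretetype(eltype(ys)))   # true
```

With the parameterized form the compiler cannot know the element type of `xs`, so anything that iterates over it falls back to dynamic dispatch.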

Besides, you haven’t told us that indexing fieldA actually matters for performance. Does it? Don’t just assume it does, benchmark and profile it. Do solid development, then listen to profiling and react based on what you see.
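
As a rough sketch of what such a measurement could look like, using only `@elapsed` from Base (the `SimField`, `SimParam`, `viscosity` and `kernel` names are made up for illustration; a dedicated package such as BenchmarkTools would give more reliable numbers):

```julia
# Hypothetical simulation containers: viscosity as a field vs. as a type parameter.
struct SimField
    nu::Float64
end
struct SimParam{NU} end

viscosity(s::SimField) = s.nu
viscosity(::SimParam{NU}) where {NU} = NU

# Toy kernel standing in for the real hot loop.
function kernel(sim, xs)
    nu = viscosity(sim)
    s = 0.0
    for x in xs
        s += nu * x
    end
    return s
end

xs = rand(10^6)
f = SimField(0.01)
p = SimParam{0.01}()

kernel(f, xs); kernel(p, xs)         # warm up so compilation is not timed
t_field = @elapsed kernel(f, xs)     # seconds, field version
t_param = @elapsed kernel(p, xs)     # seconds, type-parameter version
println("field: ", t_field, " s, parameter: ", t_param, " s")
```

A single `@elapsed` call is noisy, so treat the numbers as a first hint and repeat the measurement before drawing conclusions.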


#3

I should’ve made it clear. In my case I have a simulation struct. I’ll never have an array of simulations. I build one simulation struct and call run_simulation(my_simulation) and that will be it (running for hours). My example was indeed totally different from my use case. I don’t have any type stability problems.
The information I’m talking about is the simulation parameters, which truly don’t change during the simulation: grid points, fluid viscosity, wavenumbers…

I am trying to push the boundaries with the dynamic features of Julia. I’m trying to make this code as fast as (could it be faster??) my advisor’s highly optimized C++ code. I am considered a heretic here in the lab for not using C++ :smile:


#4

But you are right: just benchmark the damn thing. :smile:


#5

That’s wonderful! From my experience, it can be at least as fast as C/C++, which isn’t saying a little! :grinning:


#6

Julia v0.7 will do constant propagation on literals passed into functions and will specialize compilation based on those constants, so at least after the update there will be zero need for even thinking about this. If these are truly constant, you can also just use a global const and the compiler will make use of it.
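
A minimal sketch of the global const route, assuming a hypothetical viscosity `NU` and a toy function `diffuse`:

```julia
# Hypothetical simulation constant; the name and value are illustrative.
const NU = 0.01

# Because NU is const, the compiler knows its type (and value) when compiling diffuse.
diffuse(u) = NU * u

println(diffuse(2.0))
println(isconst(@__MODULE__, :NU))   # true: the binding cannot be silently retyped
```

The trade-off is that a global const is fixed for the whole session, whereas a struct field or type parameter can differ between simulation objects.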


#7

My hope is that by moving all that information to compile time, which cannot be done in C++, I’d give Julia a head start to beat C++.
It is almost like using Julia as a code generator, generating very specific code for every problem that I feed it. If I’m not mistaken, that is the trick of FFTW: computing VERY specific DFTs for the inputs.


#8

@ChrisRackauckas is spot on. I would summarize some things you want to consider when deciding if a value should possibly be a type parameter:

  1. Does dispatching on this value make sense? Would it be better/easier to just look at the value at runtime with conditionals or indexing? Don’t get too fancy with dispatch.

  2. Will it be typical to have collections of values that are homogeneous with respect to this value? For example, while it’s possible with Unitful to have arrays like [1m, 2m^2, 3m^-3] it is probably more common to have vectors where the units match.
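
To illustrate the first point, here is a sketch comparing dispatch on a value (via `Val`) with a plain runtime conditional. The boundary-condition functions are hypothetical and only meant to show the two styles side by side:

```julia
# Dispatch version: the boundary kind is encoded in the type Val{:kind}.
apply_bc(::Val{:periodic}, i, n) = mod1(i, n)      # wrap around the domain
apply_bc(::Val{:clamped}, i, n) = clamp(i, 1, n)   # stick to the edge

# Runtime version: the boundary kind is an ordinary value checked with conditionals.
function apply_bc_runtime(kind::Symbol, i, n)
    if kind === :periodic
        return mod1(i, n)
    elseif kind === :clamped
        return clamp(i, 1, n)
    else
        throw(ArgumentError("unknown boundary condition: $kind"))
    end
end

println(apply_bc(Val(:periodic), 9, 8))          # index 9 wraps on a grid of 8
println(apply_bc_runtime(:clamped, 9, 8))        # index 9 is clamped on a grid of 8
```

Both give the same answers; the runtime version is often easier to read and extend, which is the point of "don’t get too fancy with dispatch".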


#9

See also