New Julia package: Workspace.jl

Ronis_BR · November 25, 2016, 2:22am

Hi guys!

I deal with a lot of simulations using Julia. In fact, I am building an entire satellite simulator in Julia for validating mission operational concepts in Pre-Phase A.

I always have trouble dealing with many variables at distinct instants. For example, to integrate using Runge-Kutta (4th order) I need to compute each variable at four instants. Hence, I ended up with something like this:

q_k_1, q_k_2, q_k_3, q_k_4.

The biggest problem is that each variable at each of those instants is a function of many (MANY) other variables at those instants. So, I had something like this:

q_k_1 = a_k_1*b_k_1 + c_k_1*exp(d_k_1)
q_k_2 = a_k_2*b_k_2 + c_k_2*exp(d_k_2)
q_k_3 = a_k_3*b_k_3 + c_k_3*exp(d_k_3)
q_k_4 = a_k_4*b_k_4 + c_k_4*exp(d_k_4)

Hence, I decided to create a very simple package, called Workspace.jl, to make my life easier. This package defines a workspace, which is an array of variables, that can be loaded and saved from the global workspace. So, I can do things like:

ws_k_1 = @create_ws
ws_k_2 = @create_ws
ws_k_3 = @create_ws
ws_k_4 = @create_ws

ws_k_1[:a] = # Compute the value of variable a_k_1
ws_k_2[:a] = # Compute the value of variable a_k_2
# etc.
ws_k_1[:q] = # Initialize the variable q_k_1
# etc.

for w in (ws_k_1,ws_k_2,ws_k_3,ws_k_4)
    @load_ws w
    q = a*b + c*exp(d)
    @save_ws! w
end

This was a simple example, but it really transformed my code into something much more readable. I know I lose performance because @load_ws and @save_ws! involve too much writing. But, in the end, the gain in maintainability is bigger.

The project is just beginning. The implementation is ugly (I’m not a good Julia programmer), but I think I could use some help / advice from the community. The URL is: https://github.com/ronisbr/Workspace.jl

Best regards,
Ronan Arraes

akis · November 25, 2016, 3:40am

Welcome! I had a quick look at your code. Given that your workspace is represented by a searchable Array of (Symbol, Any) variable pairs, have you considered using a Dict{Symbol, Any} instead?
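A minimal sketch of what a Dict-based workspace could look like. The variable names and values are placeholders for illustration, and this is not meant to mirror how Workspace.jl is actually implemented:

```julia
# Hypothetical workspace: variable names mapped to their current values.
ws = Dict{Symbol, Any}(:a => 2.0, :b => 3.0, :c => 1.0, :d => 0.0)

# Each access is a hash lookup by name instead of a search through an array of pairs.
ws[:q] = ws[:a] * ws[:b] + ws[:c] * exp(ws[:d])

# Checking whether a variable exists in this workspace.
has_q = haskey(ws, :q)
```

Since the values are still stored as Any, this mainly simplifies the lookup code rather than making the arithmetic itself faster.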

ChrisRackauckas · November 25, 2016, 4:21am

What about using Parameters.jl? Its packing/unpacking, when done on strongly typed fields, gives pretty much optimal performance, and the result looks pretty similar to this. Building the last part (the repetition handling) off of that would be useful.

Also, relatively simple algorithms like RK4 are rarely ever the right choice. Scientific programs like this are both more maintainable and more performant by calling out to appropriate solvers for linear systems, optimization, differential equations, etc. rather than re-implementing each detail as needed.

DNF · November 25, 2016, 8:45am

Looks to me like you are overcomplicating things. If you have lots of variables named q_k_1, q_k_2, etc. that’s a strong sign you should be using vectors.

If you define length-4 vectors q_k, a_k, b_k, c_k, d_k then your example code reduces to this:

q_k .= a_k .* b_k .+ c_k .* exp.(d_k)

and then you also have far fewer variables to pass around.
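As a self-contained sketch of the same idea, with made-up values standing in for the four instants:

```julia
# Each vector holds one variable at all four instants (placeholder values).
a_k = [1.0, 2.0, 3.0, 4.0]
b_k = [0.5, 0.5, 0.5, 0.5]
c_k = [1.0, 1.0, 1.0, 1.0]
d_k = [0.0, 0.0, 0.0, 0.0]

# Preallocate the output so the broadcast below writes into it in place.
q_k = similar(a_k)

# Element-wise: q_k[i] = a_k[i]*b_k[i] + c_k[i]*exp(d_k[i]) for i = 1:4.
q_k .= a_k .* b_k .+ c_k .* exp.(d_k)
```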

andyferris · November 25, 2016, 10:39am

+1 to vectors. But you could consider StaticArrays.jl and use an SVector{4} to keep everything on the stack and as fast as explicit code.

Ronis_BR · November 25, 2016, 3:58pm

Hi! Thanks for the answer.

However the problem is much more complicated than that of the example I wrote.

Actually, the number of lines (counting some Fortran functions) needed to compute each one of the steps adds up to more than 1,000. So this kind of vectorization, which is more or less what I was trying to use, leads to code that is very difficult to maintain.

Ronis_BR · November 25, 2016, 3:59pm

Thanks! I will look at this package. The only problem is that the number of variables in each @unpack would be almost 60…

Ronis_BR · November 25, 2016, 4:00pm

Thanks! Would the performance be improved by using Dict instead of Arrays?

ChrisRackauckas · November 25, 2016, 4:27pm

If you define a type with @with_kw, then there’s an @pack and @unpack that will automatically do all of the arguments.

akis · November 25, 2016, 4:27pm

That really depends on the number of variables, how often they are used, and how sparsely they appear across the various workspaces. My advice to use a Dict was based on your current implementation and what you said:

So, although a Dict will improve performance in some scenarios, it will primarily improve the maintainability of your code. But if performance is an issue in your case, you should rather go with what other users suggested above, mainly to avoid variables of type Any, which prevent the compiler from making type-specific optimizations. Those suggestions don’t have the simplicity of a Dict, but they take performance very seriously.
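To illustrate the point about Any, here is a rough sketch comparing the same formula on a Dict{Symbol, Any} and on a struct with concrete field types. The Stage type and compute_q function are hypothetical names invented for this example, and Base.return_types is an unexported introspection helper used here only to peek at what the compiler infers:

```julia
# Hypothetical container with concretely typed fields.
struct Stage
    a::Float64
    b::Float64
    c::Float64
    d::Float64
end

# Same formula, once on typed fields and once on Any-valued dictionary entries.
compute_q(s::Stage) = s.a * s.b + s.c * exp(s.d)
compute_q(ws::Dict{Symbol, Any}) = ws[:a] * ws[:b] + ws[:c] * exp(ws[:d])

s  = Stage(2.0, 3.0, 1.0, 0.0)
ws = Dict{Symbol, Any}(:a => 2.0, :b => 3.0, :c => 1.0, :d => 0.0)

q_typed   = compute_q(s)
q_untyped = compute_q(ws)

# Ask the compiler which return type it infers for each method.
typed_inference   = Base.return_types(compute_q, (Stage,))
untyped_inference = Base.return_types(compute_q, (Dict{Symbol, Any},))
```

Both calls return the same number. The difference is that for the struct method the compiler can infer a concrete Float64 result, while for the Dict method it cannot know the value types ahead of time and has to fall back to dynamic dispatch at each operation.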

DNF · November 25, 2016, 4:43pm

I guess I don’t understand what you mean, since you can easily have thousands or millions of elements in a vector, and it would be super clean. So you are clearly talking about something else.

Do you have a slightly more realistic example, that shows why a vectorized implementation is impractical?

ChrisRackauckas · November 25, 2016, 6:04pm

Either way, the performance is not going to be very good. I think if you’re going this route, just use what’s easiest.

Ronis_BR · November 25, 2016, 7:35pm

OK! In my preliminary analysis, the integrator code became much more readable. This is what I am trying to achieve, since many people who only know MATLAB will need to verify that everything is OK.