Is Julia a suitable language for operational weather analysis?

Hello community,

I have been programming in Matlab since 1997 and am now thinking about switching to Julia (skipping Python). Working at a private weather company in Austria, I have written many Matlab programs that analyze the weather operationally, scheduled via crontab at intervals from 10 minutes to one hour.

I have heard that Julia is fast in the REPL after a first run, but it seems that Julia code is very slow when executed from the command line (julia …jl), because every new call is equivalent to the first call in the REPL. When Julia code is launched by a cron job, the fast REPL behaviour is not available.

Is there a general way either to run Julia code in a REPL every 10 minutes or every hour, or to call Julia programs from the command line (via cron job) with the speed Julia is known for?

If so, switching from Matlab to Julia would be a good move; but if Julia is only fast in the REPL, it unfortunately would not be.

Best regards,
Dieter

1 Like

You may want to look at
https://github.com/dmolina/DaemonMode.jl

The “slow” first call is a bit overemphasized. It is just that the code is compiled at the first call, and this compilation time is sometimes annoyingly long and of course adds up if you restart often, e.g. during development (for that there is https://github.com/timholy/Revise.jl).
But compilation is not a problem in every case. E.g. if the process itself runs long enough, the compilation at the beginning just doesn’t matter.

So, I would encourage you just to try it out and if you run into such issues there are already some solutions or workarounds.

By the way, the issue has a name, it’s TTFP (time to first plot). You may search for it to find more discussions.

7 Likes

Edit: I misunderstood, but the message is still valid, depending on how long the task itself takes.

The JIT just means that the first time a function is called in a session, it takes a bit longer because it has to be compiled. That makes running julia blah.jl every few seconds not worth it[1].
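To make the first-call cost concrete, here is a minimal sketch that times the same call twice within one session. The function `sumsq` is a made-up stand-in for real analysis code, and the actual timings depend on the machine and the Julia version.

```julia
# Hypothetical stand-in for real analysis code.
function sumsq(n)
    total = 0.0
    for i in 1:n
        total += i^2
    end
    return total
end

first_call  = @elapsed sumsq(10^6)   # includes JIT compilation of sumsq
second_call = @elapsed sumsq(10^6)   # reuses the already compiled code

println("first call:  ", first_call, " s")
println("second call: ", second_call, " s")
```

Every fresh `julia script.jl` process pays the first-call cost again, which is why it only matters when it is large relative to the run time of the job.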

Just like your MATLAB probably always runs in the background (you don’t cold start MATLAB every time you run the analysis, right?), you can use DaemonMode mentioned above.

[1]: it also means this is not a good way to develop a script; check out Revise.jl (https://github.com/timholy/Revise.jl)

2 Likes

I think he meant he’s running the script every 10 minutes or every hour, not that it takes that long to run.

For “production” usage, I think PackageCompiler is pretty good at bringing down startup latency, possibly bundled up in a Docker container if you are using Docker already.

5 Likes

How long does your program run for? e.g. 1ms, 1s, 1min, 1hr?

1 Like

There are ways to reduce compilation time or to avoid compiling every time you run your program; DaemonMode.jl is one of them. Still, if your aim is just to run your script every 10 minutes and the total compile+run time is less than that, then just start it from a cron job and that’s it.

You are going to spend some time learning Julia, and getting around the TTFP issue probably shouldn’t be your first priority.

3 Likes

Just adding that, for example, a script I have that loads large packages such as DifferentialEquations, Plots and Catalyst, runs the simulations and produces plots takes 50s overall. And this is the worst case I have experienced up to now. It may well be that starting up and running your script doesn’t take more than a few seconds. Thus, this should only be a primary concern if you needed to run the script repeatedly on a sub-second time scale.

1 Like

In Matlab the compiled standalone script takes ~10 minutes. 30% of the time is spent creating large sparse matrices and solving the corresponding system of linear equations.
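For reference, the sparse part maps onto Julia's standard libraries. Below is a minimal sketch, assuming a small tridiagonal system that is only a placeholder for the real matrices (the size `n` and the matrix entries are invented for illustration):

```julia
using SparseArrays, LinearAlgebra

n = 1_000
# Tridiagonal test matrix (2 on the diagonal, -1 beside it); a placeholder
# for whatever sparse system the real analysis assembles.
A = spdiagm(-1 => fill(-1.0, n - 1), 0 => fill(2.0, n), 1 => fill(-1.0, n - 1))
b = ones(n)

x = A \ b                    # sparse direct solve, same syntax as in Matlab
residual = norm(A * x - b)   # should be close to zero
println("residual = ", residual)
```

Timing this step on the real matrices in both languages would show how much of the ~10 minutes could be saved.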

Is it okay for your application if the Julia package loading adds an extra 5-90 seconds?

1 Like

Thanks for your reply and hints.
After the Matlab programs for operational use are written, they are compiled to standalone executables and called by cron jobs (without needing Matlab in the background).
If running “julia blah.jl” is only a few percent slower than in REPL mode (second call), there will not be a problem.

The other option is, instead of running cron jobs, to have Julia run continuously and start a new run every 10 minutes (not talking about DaemonMode, just write your script so that it looks like this):

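interval = 600                  # seconds between runs (10 minutes)
nexttime = time() + interval    # time() returns seconds since the epoch
while true
   # ... do some stuff
   sleeptime = nexttime - time()
   sleep(max(sleeptime, 0))     # sleep throws on negative durations
   nexttime += interval
end

3 Likes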

Good idea, as long as there is no temporal overlap (e.g. if a run takes 12 minutes but the function should be called every 10 minutes).

If the time-saving potential compared to Matlab is much larger than these 5-90 seconds, I suppose yes.
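One way to deal with an overrunning job is to skip a slot while the previous run is still busy. The following is only a sketch of that idea: `run_without_overlap`, its arguments and the toy durations are all hypothetical, and a real scheduler would loop forever rather than for a fixed `nruns`.

```julia
# Start `work` every `interval` seconds, `nruns` times in total, but skip a
# slot when the previous run has not finished yet. All names are hypothetical.
function run_without_overlap(work, interval; nruns)
    current = nothing          # task of the most recent run, if any
    started = 0
    skipped = 0
    for _ in 1:nruns
        if current === nothing || istaskdone(current)
            current = Threads.@spawn work()
            started += 1
        else
            skipped += 1       # previous run still busy: skip this slot
        end
        sleep(interval)
    end
    current === nothing || wait(current)   # let the last run finish
    return (started = started, skipped = skipped)
end

# Toy usage: each "run" takes 0.3 s, but a slot comes up every 0.2 s.
counts = run_without_overlap(() -> sleep(0.3), 0.2; nruns = 4)
println(counts)
```

Whether skipping, queueing or running in parallel is right depends on whether a late analysis is still useful.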

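while true
   Threads.@spawn domodel()           # start the model run on an available thread
   nexttime = calcnexttime()          # time (in seconds, as from time()) of the next run
   sleep(max(nexttime - time(), 0))   # sleep throws on negative durations
end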

On Julia 1.7+, spawning a task makes sure your model starts right away and runs on an available thread, provided Julia was started with more than one thread (e.g. julia -t 4). If you have many cores, you can have overlapping runs in flight as long as they don’t fall too far behind (for example, light calculations every 10 minutes plus a detailed calculation every hour).
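`calcnexttime` is left undefined above. One possible version, as a sketch: round the current time up to the next multiple of the interval, so that runs land on fixed 10-minute marks (in UTC) instead of drifting. The name `next_run_time` and the 600 s interval are assumptions for illustration.

```julia
# Next multiple of `interval` seconds strictly after `now`
# (both measured in seconds since the epoch, as returned by time()).
function next_run_time(now::Real, interval::Real)
    return (floor(now / interval) + 1) * interval
end

next_run_time(1205.0, 600)   # 1800.0

# How long to sleep from now until the next 10-minute mark:
seconds_to_wait = next_run_time(time(), 600) - time()
println("sleeping for ", seconds_to_wait, " s")
```

Because the target is computed from the clock rather than from the end of the previous run, a slow run does not shift all later runs.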

5 Likes

Just thought I’d add that I run a live trading program from Julia doing exactly this and have never had a problem, and that program has to hit certain time points exactly for specific bits of code, like opening and closing auctions, etc. So it is definitely a feasible solution.

4 Likes

I really want to emphasize how superb PackageCompiler is now. I, too, am new to Julia, and I thought the ~20s startup time was going to be a dealbreaker, but creating a custom sysimage with all the packages I use has reduced it to ~1s without pre-warming the REPL.

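using PackageCompiler
# replace_default is a PackageCompiler 1.x option; 2.x takes sysimage_path instead
create_sysimage([:Plots, :ModelingToolkit, :OrdinaryDiffEq];
        precompile_execution_file="my_fancy_cronjob.jl", replace_default=true)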

This case, where specific code is being run repeatedly, would seem to be an ideal use case for a PackageCompiler sysimage.

The cost is that making modifications to the code may be slower, since the sysimage has to be rebuilt, but if you’re already compiling Matlab code then there shouldn’t be a significant loss here.

6 Likes

And see https://github.com/SciML/DifferentialEquations.jl/issues/786 . We’re just starting to take compile times very seriously. The OrdinaryDiffEq ones have dropped dramatically. ModelingToolkit needs a pass. And also, we could probably use another pass at Plots.jl.

5 Likes