Resources for running julia routines in the background 24/7

question

#1

Hi all,

I’d like to write a routine in Julia that will run 24/7, possibly for weeks or months at a time. It will run on one computer which won’t be used for much else. The routine needs to be able to call other functions at specific times of the day, possibly thousands of times a day, but can be idle for the rest of the time. Sometimes one of these other functions might not achieve what it’s supposed to, since they’ll be interacting with the internet, so I would also need a way to tell the main routine to wait and try again in a second or two if that happens.

I’ve never written this kind of code before, and so was wondering if the community could point me in the right direction for what functions and/or packages I should be looking at. I’m just asking for pointers in the right direction and will happily go off and do lots of reading once I know the main resources. I just figured it was best to ask here first so I don’t waste too much time reading up on red herrings.

Cheers and thanks,

Colin


#2

Julia and most of its packages are geared towards interactive use: when things don’t work out, you get an error that unwinds the stack (unless you catch it). This is because of the implicit assumption that the user would like to know about these errors which require intervention to fix anyway, and the cost of failing and restarting is low.

This is a good model for scientific computing, but not 24/7 services. You did not specify what you are doing and how the calls are initiated (triggered by exogenous events? or times?), but you may be better off just building a user image with the packages you need (for faster startup times), and using another framework as a governor.


#3

Just an idea… How about using something like a home automation system such as Home Assistant, which would give you a lot of things for free (scheduling, notifications, monitoring, etc.) Then save and load data from Julia scripts as desired. (I have such a system running, and it does run (very) simple Julia scripts from it, so it’s feasible, at least.)


#4

Thanks for responding.

To start with, the events will just be triggered by times, so it sounds like you’re suggesting that I’d be better off just using cron (I’m on Ubuntu) to call specific scripts to run at specific times, and make sure that all the packages I’m going to use are pre-compiled so my start-time for each script is minimal. Have I understood your suggestion correctly?

Cheers,

Colin


#5

Interesting suggestion, thanks for responding. It might be overkill for what I’m trying to do at the moment, i.e. if I can’t use Julia, then I could probably get away with just using cron, but I’ll keep it in mind for the future.

Cheers,

Colin


#6

Just out of interest, how bad an idea do you think it would be to use sleep(1) inside an infinite loop, check the time after each one-second sleep, and, if the time lies in one of the intervals of interest, call the specific function that I want to run at that time?

It sounds like a massive hack that could run into problems after running for several weeks… although having said that I could easily reset the routine every couple of weeks if there was some sort of really slowly accumulating memory issue.
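For concreteness, here is a minimal sketch of that polling idea. The names `is_due` and `poll_loop`, the named-tuple task layout, and the `max_ticks` and `now_fn` arguments are all assumptions made for illustration (the last two exist only so the loop can be stopped and tested); this is not taken from any package.

```julia
using Dates

# A task is a named tuple (start, stop, action): a time-of-day window plus a
# zero-argument function to call while the clock is inside that window.
is_due(task, t::Time) = task.start <= t < task.stop

# Poll every `interval` seconds and run each task whose window contains the
# current time. `max_ticks` only exists so the sketch can terminate; a real
# service would loop indefinitely.
function poll_loop(tasks; interval = 1, max_ticks = typemax(Int),
                   now_fn = () -> Time(now()))
    for _ in 1:max_ticks
        t = now_fn()
        for task in tasks
            # Note: a task fires on every tick inside its window. Track a
            # "last run" marker if it should fire only once per window.
            is_due(task, t) && task.action()
        end
        sleep(interval)
    end
end
```

A call such as `poll_loop([(start = Time(9, 0), stop = Time(9, 5), action = my_job)])` would then invoke the hypothetical `my_job` once per second between 09:00 and 09:05.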

Just speculating out loud… :slight_smile:


#7

If you just want to run your routines at specific times during the day, cron is perfectly well suited for this.

If, on the other hand, you really need to run some Julia code 24/7 (e.g. a web server), it’s worth turning it into a system service using something like systemd. It may sound complicated, but in fact it boils down to one small file and a couple of Linux commands.

Sometimes one of these other functions might not achieve what they’re supposed to, since they’ll be interacting with the internet, so I would also need to be able to have the facility to tell the main routine to wait and try again in a couple of seconds or two if that happens.

Just run it in a try-catch block; it works well in practice. E.g.:

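# Keep calling `f` until it succeeds, pausing `delay` seconds after each failure.
# Wrapping the loop in a function avoids top-level scoping problems with a `done` flag.
function call_until_success(f; delay = 5)
    while true
        try
            return f()  # invoke function that may fail
        catch e
            @warn "call failed, retrying" exception = e
            sleep(delay)  # sleep for `delay` seconds before retrying
        end
    end
end

result = call_until_success(my_function)  # `my_function` is a placeholder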
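
The loop above retries forever. If a bounded number of attempts is preferable, Base also provides `retry` together with `ExponentialBackOff`. The sketch below shows how that might look; `flaky_fetch` is a made-up stand-in for the real network call, and the delay settings are arbitrary.

```julia
# Hypothetical flaky call: fails the first two times, then succeeds.
attempts = Ref(0)
function flaky_fetch()
    attempts[] += 1
    attempts[] < 3 && error("simulated network failure")
    return "payload"
end

# Wrap the call so that a failure triggers up to 5 retries, with the wait
# doubling each time starting from 0.1 s. If every retry fails, the last
# error is rethrown.
fetch_with_retry = retry(flaky_fetch;
    delays = ExponentialBackOff(n = 5, first_delay = 0.1, factor = 2.0, jitter = 0.0))

result = fetch_with_retry()
println(result, " after ", attempts[], " attempts")
```

Because `retry` gives up after the delays are exhausted, a cron-driven script can simply exit with an error and let the next scheduled run try again.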

#8

Thanks for responding. Sounds like cron is probably a good starting place, and I’ll look at systemd if I start hitting boundaries with cron.


#9

@colintbowers I agree with everyone on this thread that cron is the most appropriate method here.
You mention ‘thousands of times a day’. I just worked it out: there are 86400 seconds in a day, so one or two per second is indeed thousands per day.

You could look at a batch system like PBSPro or Slurm:
http://www.pbsworks.com/PBSProduct.aspx?n=PBS-Professional&c=Overview-and-Capabilities
PBSPro is open source these days, and can handle thousands of jobs per day. You can have some quite complicated job dependencies. You would have to change the scheduler cycle to be a couple of seconds,
but that should not be a problem. I believe the next version of PBSPro is able to handle an even higher throughput of jobs.

ps. If you do launch thousands of processes per day, watch out for zombie processes, or processes which just hang on network sockets. Be prepared to do some regular reaping of non-responsive processes.
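
If the child processes are launched from Julia itself, one way to guard against hangs is to give each one a deadline. The following is only a sketch under that assumption: `run_with_timeout` is a hypothetical helper name, and it relies on the Base functions `run(...; wait = false)`, `timedwait`, `process_exited`, and `kill`.

```julia
# Run `cmd`, but terminate it if it has not exited within `timeout` seconds.
# Returns true if the process finished on its own, false if it had to be killed.
function run_with_timeout(cmd::Cmd, timeout::Real)
    # `ignorestatus` keeps a non-zero exit code from raising an error here.
    proc = run(ignorestatus(cmd); wait = false)
    status = timedwait(() -> process_exited(proc), float(timeout))
    status === :ok && return true
    kill(proc)   # sends SIGTERM by default
    wait(proc)   # block until the child has actually exited
    return false
end
```

For example, `run_with_timeout(`curl -s https://example.com`, 10)` would give a hypothetical download ten seconds before terminating it. A process that ignores SIGTERM would need a harsher signal, which this sketch does not handle.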


#10

Why not just try it out… :slight_smile:

Underneath Julia’s I/O is libuv, and this library does have a timer. When googling I found a somewhat related Julia Timer question on Stack Overflow. The Timer object has moved; it’s now in base/events.jl. I suppose that code could be helpful too.
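
As a rough illustration of the timer route, Base exposes `Timer(callback, delay; interval)`, which calls the callback repeatedly without blocking the main task. The tick counter and the three-tick limit below are assumptions added so the sketch terminates on its own.

```julia
# Fire a callback every 0.2 seconds, starting immediately.
ticks = Ref(0)
timer = Timer(0.0; interval = 0.2) do t
    ticks[] += 1
    println("tick ", ticks[])
    ticks[] >= 3 && close(t)  # stop after three ticks so the sketch ends
end

# The main task stays free to do other work; here it just waits (for at most
# 5 seconds) until the timer has been closed.
timedwait(() -> !isopen(timer), 5.0)
```

In a long-running service the callback would check the clock and dispatch the scheduled function instead of counting ticks.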

Sorry, I cannot help further; my know-how of libuv, @async, etc. is a bit limited at the moment. (But cron is certainly a good, pragmatic way.) Potentially, in the not-so-near future, I’m also interested in running Julia for longer periods to monitor some sensor values.


#11

@colintbowers I was going to start rattling on about RabbitMQ for your purpose. Which of course does not do what you want. However further reading leads to the Celery project
http://docs.celeryproject.org/en/latest/

It might be worth having a look there. I have no idea if calling Julia from Celery is easy.
Googling for Julia Celery does turn up some delicious images though!


#12

A simple ‘while true’ loop with a ‘sleep(1)’ will definitely work. I have had one running for more than 2 years as a modelling backend for a web service. The code is at https://github.com/ETC-UA/LAIscript. Cheers


#13

May be possible, but frankly, I would just go with cron or similar.

There are several advantages to this:

  1. if there is a state, and you save it in some transparent format (eg JLD2), you can inspect it from the filesystem,
  2. you can upgrade the code seamlessly if necessary,
  3. unit testing is easier (single run, not longer ones).
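
To illustrate the first point, a cron-driven script can load its state at startup and write it back before exiting. The sketch below swaps JLD2 for a plain `key=value` text file so that it needs only the standard library; the function names and the `runs` and `last_run` keys are invented for the example.

```julia
using Dates

# Load a Dict{String,String} from a plain `key=value` text file, or return an
# empty state on the first run.
function load_state(path)
    state = Dict{String,String}()
    isfile(path) || return state
    for line in eachline(path)
        key, value = split(line, '='; limit = 2)
        state[key] = value
    end
    return state
end

# Write the state back, one `key=value` pair per line.
function save_state(path, state)
    open(path, "w") do io
        for (key, value) in state
            println(io, key, '=', value)
        end
    end
end

# One scheduled run: read the previous state, do the work, write the new state.
function run_once(path)
    state = load_state(path)
    runs = parse(Int, get(state, "runs", "0")) + 1
    state["runs"] = string(runs)
    state["last_run"] = string(now())
    save_state(path, state)
    return runs
end
```

Between runs the file can be inspected with `cat` or any text editor, and each run is short enough to exercise directly in a unit test.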

#14

Someone else in this thread mentioned they’ve been using sleep(1) in a loop for about 2 years now, so I guess it is feasible :slight_smile:

Thinking about it overnight though, I think cron is the right option, basically for the reasons suggested by @Tamas_Papp above.

Cheers,

Colin


#15

I think I’ll start with cron, but you’ve given lots of interesting things to try out if I (or someone else) starts running into limitations with cron. This thread is turning into quite a good resource page!

Cheers,

Colin


#16

That is very interesting to know! I think for now I’ll use cron, for the reasons that Tamas suggests above.

Cheers,

Colin


#17

Good points, I’m convinced. I’ll start out using cron.

Cheers,

Colin