[ANN] DaemonMode.jl: a package to run scripts faster in Julia

I was using Ubuntu with Julia 1.3 (I saw that 1.3 is not tested on Travis, but gave it a shot anyway).

Update: After re-booting, I tried again and it worked just like you said. Might have been an issue on my side.

Of course it could be modified for distributed computing.

In its current version it would not work, because it does not actually transfer the file through the socket, only the directory, filename and current parameters. But it is a logical addition in functionality. Sorry for the long delay, I am very busy these days. When I have more free time, I expect to release a new and better version that addresses the suggestions.
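To make the "only the directory, filename and parameters" point concrete, here is a minimal sketch using the Sockets standard library. The three-line text format and the names serve_once and request are my own assumptions for illustration, not DaemonMode's actual protocol. The server never receives the file contents, only where to find the file, which is why server and client need to see the same filesystem.

```julia
using Sockets

# Hypothetical wire format (NOT DaemonMode's real protocol): three text lines
# holding the working directory, the script name, and the arguments.
function serve_once(server)
    sock = accept(server)
    dir, fname, args = readline(sock), readline(sock), readline(sock)
    # A real server would run the file here; this sketch only reports what arrived.
    println(sock, "would run ", joinpath(dir, fname), " with args: ", args)
    close(sock)
end

function request(port, dir, fname, args)
    sock = connect(ip"127.0.0.1", port)
    println(sock, dir)
    println(sock, fname)
    println(sock, join(args, " "))
    reply = readline(sock)      # wait for the server's answer
    close(sock)
    return reply
end

# listenany picks a free port at or above the hint.
port, server = listenany(ip"127.0.0.1", 3000)
server_task = @async serve_once(server)
reply = request(port, "/tmp/work", "program.jl", ["a", "b"])
wait(server_task)
close(server)
println(reply)
```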

This is wonderful, but could you add a ServerId so that different Julia programs can run with their own individual servers?

POSTSCRIPT: Maybe, I don’t know, make the ServerID the socket port number? Or make the socket port number the ServerId

  • The server is responsible for running all Julia scripts.
julia -e 'using DaemonMode; serve(id=1234)'
  • A client sends the server the file to run and returns the output obtained.
julia -e 'using DaemonMode; runargs(id=1234)' program.jl <arguments>

You can use an alias:

alias juliaclient1234='julia -e "using DaemonMode; runargs(id=1234)"'

Then, instead of julia program.jl you can run juliaclient1234 program.jl. The output should be the same, but it takes a lot less time.

Following your suggestion, runargs() and serve() now accept the port directly (it is optional, 3000 by default). This gives us more flexibility in choosing the port, and lets us use the port to identify the daemon.
Example (I use 3500, but any port is possible, preferably a high one for security reasons):

julia -e 'using DaemonMode; serve(3500)'

and in the client:

julia -e 'using DaemonMode; runargs(3500)' program.jl <arguments>

The idea is to share the server between different clients, but I guess sometimes you would like to have different servers (in different environments or using different CPUs).

Thanks for the feedback.

It would be nice to have the choice between reusing the same module (maybe just the Main ?) or running a file in its own module.

I’m planning to use this module with Irace to do some parameter tuning of my algorithm; reusing the same module allows me to parse my instances only once instead of in every script I run :)

@Mason I am happy to announce that I have updated the package, and now each file runs in its own module, avoiding name conflicts.

After looking at many options, I changed the include(fname) to

m = Module()
content = join(readlines(dname), "\n")
include_string(m, content)

and it is currently working!
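As a small self-contained illustration of why this avoids name clashes (the helper name run_isolated and the two toy "scripts" are made up for this sketch, not part of the package): both snippets assign x, but each is evaluated in its own anonymous module, so neither sees the other's value.

```julia
# Evaluate a piece of code in a fresh anonymous module and return the value
# of its last expression. `run_isolated` is a hypothetical helper name.
function run_isolated(code::AbstractString)
    m = Module()                    # new module, with Base available
    return include_string(m, code)  # evaluates the code inside `m`
end

# Both "scripts" define a global `x`; they do not interfere with each other.
a = run_isolated("x = 1\nx + 1")
b = run_isolated("x = 100\nx + 1")
println((a, b))
```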

Also, I want to thank the users, especially @gsoleilhac and @Palli, for the suggestions and changes. Now:

New changes

  • The port can be specified in server and client.
  • Each file is run in its own Module to avoid name conflicts.
  • The tests have been improved so that they can run in parallel (with a different port for each testset).
  • You can now send specific code to be run in the server.
using DaemonMode
runexpr("using CSV, DataFrames")
fname = "tsp_50.csv";

runexpr("""begin
      df = CSV.File("$fname") |> DataFrame
      println(last(df, 3))
  end""")
3×2 DataFrames.DataFrame
│ Row │ x        │ y          │
│     │ Float64  │ Float64    │
├─────┼──────────┼────────────┤
│ 1   │ 0.420169 │ 0.628786   │
│ 2   │ 0.892219 │ 0.673288   │
│ 3   │ 0.530688 │ 0.00151249 │

I am going to submit the package to the official repository, but I would like to know if you agree with the name ‘DaemonMode’ or prefer another name.

Sorry for the delay, but I have been very busy (after some crazy weeks I have finally got a tenured position at my university!).

I have worked a lot with Irace and parameter tuning, and I actually know its authors. If you need any help, do not hesitate to contact me.

As for reusing the same module, I do not know yet how to make it optional. Anyway, the default should be to run each file in its own module, because it is safer.

Congrats @dmolina! I know very well that tenure in Spain is a major achievement…

I am going to submit the package to the official repository, but I would like to know if you agree with the name ‘DaemonMode’ or prefer another name.

If you are going to do it, I say do it with STYLE

DaemonMaster

First of all, thank you very much, I really needed this!

It only works under Ubuntu for me, not under Windows… Did you only test under Linux? Is there a reason this should not work under Windows?

I’m getting (under Windows)

julia --startup-file=no -e 'using DaemonMode; serve()' <filename>
ERROR in line 1: 'could not spawn `'C:\Users\Christian\AppData\Local\Programs\Julia\Julia-1.4.0\bin\julia.exe' -Cnative '-JC:\Users\Christian\AppData\Local\Programs\Julia\Julia-1.4.0\lib\julia\sys.dll' -g1 -O0 --output-ji 'C:\Users\Christian\.julia\compiled\v1.4\DataFrames\AR9oZ_tt90P.ji' --output-incremental=yes --startup-file=no --history-file=no --warn-overwrite=yes --color=no --eval "while !eof(stdin)
    code = readuntil(stdin, '\0')
    eval(Meta.parse(code))
end
"`: operation not supported on socket (ENOTSUP)'

julia --startup-file=no -e 'using DaemonMode; serve()' itself works

Thank you for your interest. Sorry for the delay.
It is supposed to work on Windows; I have tested it through Travis, but not directly (I mainly work on Linux). It is strange, because it only uses the standard library, which I assumed works on any OS. I have to check it.
One question: does it fail only when you are running an expression, or also when running a complete file? There are more tests for running files, so I guess that should work better.

I reread my question. It should have been

julia --startup-file=no -e 'using DaemonMode; runargs()' <filename>

for running files instead; I only got that wrong when trying it out on Windows.

Is it possible to shut down the server after N minutes of idle time and start a new one when a new runargs is called?

This would give a trade-off between compile time and running tasks. Tasks would automatically be handled as a batch if they are submitted within a given time frame.

I am writing to announce a new version, v0.1.1, of my package DaemonMode. The code has been rewritten with contributions from XeCycle, Páll Haraldsson and Gautier Soleilhac. Thank you all!

The main changes are:

  • The function “include” can now be used in the script run by the client to include code from external files.
  • In case of error, the complete stack is shown. This behavior can be changed with the new server parameter complete_stack.
  • By default, variables are not shared when running files, but they are shared when running code directly. This behavior can be changed with the new server parameter shared.
  • Also, the server behaves better if the client closes its connection.
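Regarding the first item, here is a rough sketch of one way to give a dynamically created module its own include. It assumes a module built with Module() has no include of its own, and the helper name run_with_include is invented for this example; the package's real code may differ.

```julia
# Run `code` in a fresh module that has a module-local `include`, resolving
# relative paths against `dir` (the client's directory in this sketch).
function run_with_include(code::AbstractString, dir::AbstractString)
    m = Module()
    # Assumption: `m` lacks `include`, so define one that loads files into `m`.
    Core.eval(m, :(include(path) = Base.include($m, joinpath($dir, path))))
    return include_string(m, code)
end

answer = mktempdir() do dir
    # An external file that the "script" wants to include.
    write(joinpath(dir, "helper.jl"), "double(x) = 2x\n")
    run_with_include("include(\"helper.jl\")\ndouble(21)", dir)
end
println(answer)
```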

I hope you try the new version and that it improves your experience.

Sorry, I did not read that message. Yes, it could be possible; I will use that idea when I implement the remote version. Sorry again for not replying to you.

I am writing to let you know that I have released a new version, v0.1.2, that greatly improves the exception stack trace. For instance, with the following file bad2.jl:

function fun2(a)
    println(a+b)
end

function fun1()
    fun2(4)
end

fun1()

Directly with julia:

$ julia bad2.jl
ERROR: LoadError: UndefVarError: b not defined
Stacktrace:
 [1] fun2(::Int64) at /mnt/home/daniel/working/DaemonMode/test/bad2.jl:2
 [2] fun1() at /mnt/home/daniel/working/DaemonMode/test/bad2.jl:6
 [3] top-level scope at /mnt/home/daniel/working/DaemonMode/test/bad2.jl:9
 [4] include(::Function, ::Module, ::String) at ./Base.jl:380
 [5] include(::Module, ::String) at ./Base.jl:368
 [6] exec_options(::Base.JLOptions) at ./client.jl:296
 [7] _start() at ./client.jl:506
in expression starting at /mnt/home/daniel/working/DaemonMode/test/bad2.jl:9

with DaemonMode v0.1.1 it gave:

$ julia -e 'using DaemonMode; runargs()' bad2.jl
LoadError: UndefVarError: b not defined
Stacktrace:
 [1] fun2(::Int64) at ./string:2
 [2] fun1() at ./string:6
 [3] top-level scope at string:9
 [4] include_string(::Function, ::Module, ::String, ::String) at ./loading.jl:1088
 [5] include_string at ./loading.jl:1096 [inlined] (repeats 2 times)
 [6] #7 at /mnt/home/daniel/.julia/packages/DaemonMode/lrn5P/src/DaemonMode.jl:141 [inlined]
 [7] (::DaemonMode.var"#3#5"{DaemonMode.var"#7#9"{String},Sockets.TCPSocket,Bool,Bool})() at /mnt/home/daniel/.julia/packages/DaemonMode/lrn5P/src/DaemonMode.jl:98
 [8] redirect_stderr(::DaemonMode.var"#3#5"{DaemonMode.var"#7#9"{String},Sockets.TCPSocket,Bool,Bool}, ::Sockets.TCPSocket) at ./stream.jl:1150
 [9] #2 at /mnt/home/daniel/.julia/packages/DaemonMode/lrn5P/src/DaemonMode.jl:89 [inlined]
 [10] redirect_stdout(::DaemonMode.var"#2#4"{DaemonMode.var"#7#9"{String},Sockets.TCPSocket,Bool,Bool}, ::Sockets.TCPSocket) at ./stream.jl:1150
...

However, with DaemonMode v0.1.2, it gives (using colors like in Julia):

$ julia -e 'using DaemonMode; runargs()' bad2.jl
ERROR: LoadError: UndefVarError: b not defined
Stacktrace:
 [1] fun2 at /mnt/home/daniel/working/DaemonMode/test/bad2.jl:2
 [2] fun1 at /mnt/home/daniel/working/DaemonMode/test/bad2.jl:6
 [3] top-level scope at /mnt/home/daniel/working/DaemonMode/test/bad2.jl:9

It is easier to read, in my opinion :-).

New version v0.1.3 and a binary implementation of Julia Client

I am writing to announce version v0.1.3, with the following changes:

Logging output in console working nicely.

The script can use Logging. There are two situations:

  • The messages are written to an external file.

  • The messages are written to console.

Both situations are working nicely. For instance, for the file test_log1.jl:

using  Logging, LoggingExtras

function msg()
    @warn "warning 1\nanother line\nlast one"
    @error "error 1"
    @info "info 1"
    @debug "debug 1"
end

msg()

running directly with julia:

$ julia test_log1.jl
┌ Warning: warning 1
│ another line
│ last one
└ @ Main ~/working/DaemonMode/test/test_log1.jl:4
┌ Error: error 1
└ @ Main ~/working/DaemonMode/test/test_log1.jl:5
[ Info: info 1

running with client:

$ juliaclient test_log1.jl
┌ Warning: warning 1
│ another line
│ last one
└ @ Main /mnt/home/daniel/working/DaemonMode/test/test_log1.jl: 4
┌ Error: error 1
└ @ Main /mnt/home/daniel/working/DaemonMode/test/test_log1.jl: 5
┌ Info: info 1
└ @ Main /mnt/home/daniel/working/DaemonMode/test/test_log1.jl: 6
┌ Debug: debug 1
└ @ Main /mnt/home/daniel/working/DaemonMode/test/test_log1.jl: 7

Return error code (useful for scripts)

runargs() returns

  • 0 if the script runs without any problem.
  • 1 if there is any unexpected problem.

For example:

$ jclient hello.jl 
Hello, World!
$ echo $?
0
$ jclient bad.jl 
ERROR: LoadError: UndefVarError: b not defined
Stacktrace:
 [1] fun2 at /mnt/home/daniel/working/DaemonMode/test/bad.jl:2
 [2] fun1 at /mnt/home/daniel/working/DaemonMode/test/bad.jl:6
 [3] top-level scope at /mnt/home/daniel/working/DaemonMode/test/bad.jl:9

$ echo $?
1
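The same check can be done from a Julia driver script instead of the shell. In this sketch a child Julia process that calls exit(0) or exit(1) stands in for the jclient calls above (that substitution and the helper name exit_code_of are assumptions for illustration); ignorestatus keeps run from throwing on a non-zero exit code.

```julia
# Run a command and return its exit code without throwing on failure.
function exit_code_of(cmd::Cmd)
    p = run(ignorestatus(cmd))
    return p.exitcode
end

julia = Base.julia_cmd()   # the Julia executable currently running

# Stand-ins for a script that succeeds and one that fails.
ok     = exit_code_of(`$julia --startup-file=no -e 'exit(0)'`)
failed = exit_code_of(`$julia --startup-file=no -e 'exit(1)'`)
println((ok, failed))
```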

Binary version

The Julia client only sends the information through sockets, so it could be implemented in any compiled language. I have used Nim for that (I first tried Rust, but it was not easy; in Nim it was surprisingly simple), and it is available at (GitHub - dmolina/juliaclient_nim: Julia client binary using DaemonMode.jl package).

You can compile it, or download the binary version directly (only for 64-bit Linux) at Release v0.1 · dmolina/juliaclient_nim · GitHub.

For instance, with a script that loads the packages CSV and DataFrames:

$ time jclient_julia test.jl 7days.csv 
...

real	0m1.389s
user	0m0.489s
sys   0m0.333s

With jclient_julia and juliaserver already loaded, it takes only:

$ time jclient_julia test.jl 7days.csv 
...
real	0m0.443s
user	0m0.511s
sys   0m0.292s

That half second is due to starting the Julia interpreter.

Using the binary version the time is greatly reduced:

$ time jclient test.jl 7days.csv
...
real	0m0.019s
user	0m0.004s
sys   0m0.007s

To summarise, the binary client is faster because there is no penalty for loading the Julia interpreter.

New version v0.1.5 with parallelism (multi-tasking)

After a busy weekend, I am writing to announce a new version, v0.1.5, which runs clients as concurrent tasks.

In previous versions, if you ran a program that took some time and tried to run another one before the first had finished, the second one did not start until the first one finished.

In version v0.1.5, all programs are run as tasks in parallel, so the second one can be started before the first one is finished.
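A toy sketch of the difference, using plain Julia tasks rather than the package itself (handle_client and the sleep times are made up for illustration): the slow client is submitted first, yet the fast one finishes first because neither blocks the other.

```julia
# Stand-in for serving one client; `sleep` plays the role of the script's run time.
function handle_client(name, seconds, finished)
    sleep(seconds)
    push!(finished, name)   # record the order in which clients complete
end

finished = String[]
@sync begin
    @async handle_client("slow", 0.5, finished)   # submitted first
    @async handle_client("fast", 0.1, finished)   # submitted second
end
println(finished)
```

Running the two calls one after the other without @async would instead make the fast client wait for the slow one, which is the previous behaviour.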

Because I consider the new behaviour a lot better than the previous one, I have made async mode active by default. However, you can call the serve function with the parameter async=false to get the previous behaviour.

# Async mode
$  julia -e 'using DaemonMode; serve(async=true)'
# Sync mode (previous behaviour)
$  julia -e 'using DaemonMode; serve(async=false)'

This has several advantages:

  • You can run any new client without waiting for the previous ones.

  • If one process asks to close the daemon, it will wait until all clients have finished.

  • Normal output (and standard error) is always shown to the corresponding program.

Disadvantages:

  • If several clients are running at the same time, @info output is shown by the last one. Because @info is usually only used for debugging, I think it is not a big problem.
  • Log messages can likewise be sent to the last client, so it is better to redirect log messages to files.

The main problem in the development was that redirect_stdout and redirect_stderr did not work correctly with tasks: they change a global variable, which is not supported in a multi-tasking environment. Thus, the output (normal and error) was always sent to the last program to be run, and not to the program responsible for the output. To fix that, I have defined print and stderr to redirect that output manually to the socket (using async code so that the output can be redirected in real time). That approach cannot be replicated with macros, which is the reason for the disadvantages listed above.
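Here is a simplified sketch of that idea, not the package's actual code: each client gets a fresh module in which print and println are redefined to write to that client's own stream. An IOBuffer stands in for the socket, and run_with_local_output is a name invented for this example.

```julia
# Run `code` in a fresh module whose print/println write to `io` rather than
# to the global stdout, so concurrent tasks cannot steal each other's output.
function run_with_local_output(code::AbstractString, io::IO)
    m = Module()
    # Module-local definitions that shadow Base's print and println.
    Core.eval(m, :(print(args...) = Base.print($io, args...)))
    Core.eval(m, :(println(args...) = Base.println($io, args...)))
    include_string(m, code)
    return nothing
end

out_a, out_b = IOBuffer(), IOBuffer()   # stand-ins for two client sockets
@sync begin
    @async run_with_local_output("println(\"A starts\")\nsleep(0.2)\nprintln(\"A ends\")", out_a)
    @async run_with_local_output("println(\"B runs\")", out_b)
end
text_a = String(take!(out_a))
text_b = String(take!(out_b))
print(text_a, text_b)
```

Only direct calls to print and println inside the script are captured in this sketch. Anything that writes to the global stdout by another route, such as the logging macros, bypasses these definitions, which matches the limitation described above.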

I hope you find it useful.

This is great, thanks for the wonderful work!

I’m curious about how the server knows which environment to remember for which client. Does the server essentially run all scripts from different submissions in one environment?

Thank you for your interest @jling.

Yes, all scripts are run in one environment to avoid loading the libraries more than once, but each script is run in a different module (created dynamically) to avoid name conflicts.

To print the output without mixing it up between clients, I have also created small functions in each module that define a local stdout and stderr.
