ANN: ArgMacros v0.2.0 - Flexible, easy, fast command line argument parsing in Julia

ArgMacros is now on version 0.2.0.

ArgMacros continues to be a performant module for parsing command line arguments; on my machine, loading the module and running a test benchmark is ~12% faster than with 0.1.3.

EDIT: Version 0.2.2 will be available momentarily; thanks to improved precompile directives, it is ~26% faster than version 0.1.3 (for a simple benchmark script, including Julia start time). Shoutout to SnoopCompile.jl!

In addition, version 0.2.0 lets you use a consistent interface to declare arguments and receive them:

  • Directly as statically typed local variables
  • In an automatically generated new struct type
  • As a NamedTuple
  • As a Dict

This offers much better performance than ArgParse, and the typed output formats give the compiler real type information instead of forcing your variables into a Dict with Any-typed values (although you can still use a Dict if you want). ArgMacros also provides an efficient, simple interface for declaring your arguments, and allows some custom argument validation to be done inline.
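As a rough sketch of the interface (written from memory of the v0.2.0 docs, so check the ArgMacros documentation for exact macro names and signatures), the inline form binds each argument directly as a statically typed local variable, and the same declaration block works with the struct/tuple/dict entry macros:

using ArgMacros

function main()
    # @inlinearguments binds each argument as a statically typed local;
    # the struct/tuple/dict variants change how the results are delivered,
    # not how the arguments are declared
    @inlinearguments begin
        @argumentrequired Int foo "-f" "--foo"
        @argumentdefault Float64 0.5 threshold "-t" "--threshold"
        @argumentflag verbose "-v" "--verbose"
        @positionalrequired String input_file
    end
    verbose && println("Processing $input_file with foo=$foo")
end

main()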

10 Likes

I’m happy to see that you have developed this. I’ve been very confused as to why ArgParse is so slow (though, see below, it may be due in part to Julia itself). I had actually started working on an argument parser that did NOT use macros, but abandoned it and found that a config file was the most suitable way to handle the majority of the scripts I write.

I was curious as to why ArgMacros and ArgParse both use macros for argument processing. What’s the reason for this vs. a non-macro approach?

Also, how much of a performance improvement are you seeing over ArgParse? I have a program where ArgParse was taking so long relative to the run time of my script that I actually had to remove it and use the config file approach I mentioned.

However, lately I’ve noticed that ArgParse is not taking nearly as long on some programs. Maybe it’s the number and type of arguments that drives the time?

I’m also wondering if the 1.4/1.5 versions of Julia have somehow improved ArgParse’s performance, because AFAICT, ArgParse hasn’t changed.

So interested to know what you have seen in your testing.

1 Like

1.4 and 1.5 generally have significantly improved package load times, and 1.6 also has a ton of goodies here. A 2x speedup just from a newer Julia version is completely reasonable (especially if the main problem was invalidation).

Invalidation?

1 Like

From the web page

Julia 1.5 feels snappier than any version in memory, and benchmarks support that impression
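
(Briefly: invalidation is when defining a new method forces Julia to discard and recompile previously compiled code that depended on the old method table. A minimal illustration, in the spirit of the examples on that page:)

f(x::Int) = 1
applyf(container) = f(container[1])
applyf(Any[1, 2])    # compiles applyf; the f call resolves to f(::Int)

f(x::Bool) = 2       # the new method invalidates the compiled applyf
applyf(Any[true])    # recompiles before running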

Absolutely, I noticed it immediately.

P.S. All of the code examples on that page are displaying incorrectly for me under Firefox. This is what I see:

f(x::Int) = 1
applyf(container) = f(container[1])

I think the new versions of Julia have helped ArgParse somewhat, but it’s still slower than I’d like, and I prefer the interface of ArgMacros. Also, although I added an option for it in the most recent version of ArgMacros, I don’t think dumping everything into an Any-valued Dict is the best way to handle arguments: it can interfere with function specialization when you pass an args object around, or when your code lives inside a main function that gets compiled and could otherwise be specialized.

I can’t speak to why ArgParse chose a macro-based approach, but for ArgMacros, it was so I could pick that name :stuck_out_tongue:
Actually, the macros let ArgMacros generate the parsing code based on the specific arguments being declared, and sprinkle in static type information along the way.
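
For example (illustrative only, not ArgMacros’ actual macro expansion): a parser without macros only learns argument types at runtime, so results land in an Any-valued container, while a macro sees the declared types during compilation and can emit typed locals:

# Without macros: types are only known at runtime, so results go in a Dict
function dict_style()
    args = Dict{String, Any}()
    args["count"] = parse(Int, "3")
    return args["count"]             # inferred as Any
end

# A macro that saw a declared Int can emit a typed local instead
function macro_style()
    count::Int = parse(Int, "3")     # compiler knows count is an Int
    return count                     # inferred as Int
end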

Here are some performance measurements I got, and a gist with the code I benchmarked. Feel free to run it on your own system and change some of the setup if you like:

Comparison of argument parsers in julia

All inputs test (all computations in main function with args retrieval)
time julia --project benchmark_foo.jl  -- "TEST STRING F" -deeee 30 3.14 -b=6.28 --cc ArgMacros -a 2
Expected output 133.76 for all scripts

ArgMacros (inline)
133.76

real    0m0.912s
user    0m1.094s
sys     0m0.688s

ArgMacros (struct)
133.76

real    0m1.003s
user    0m1.313s
sys     0m0.656s

ArgMacros (tuple)
133.76

real    0m1.067s
user    0m1.188s
sys     0m0.766s

ArgMacros (dict)
133.76

real    0m1.040s
user    0m1.234s
sys     0m0.625s

ArgParse
133.76

real    0m4.743s
user    0m4.656s
sys     0m0.922s

All inputs test (args passed to separate function as single object)
time julia --project benchmark_foo.jl  -- "TEST STRING F" -deeee 30 3.14 -b=6.28 --cc ArgMacros -a 2
Expected output 133.76 for all scripts

ArgMacros (inline)
N/A

ArgMacros (struct)
133.76

real    0m1.052s
user    0m1.156s
sys     0m0.750s

ArgMacros (tuple)
133.76

real    0m1.025s
user    0m1.344s
sys     0m0.625s

ArgMacros (dict)
133.76

real    0m1.037s
user    0m1.172s
sys     0m0.813s

ArgParse
133.76

real    0m4.341s
user    0m4.438s
sys     0m0.813s

Minimal inputs test (all computations in main function with args retrieval)
time julia --project benchmark_foo.jl  -- "OTHER TEST STRING F" --aa=5
Expected output 170.0 for all scripts

ArgMacros (inline)
170.0

real    0m0.918s
user    0m1.141s
sys     0m0.609s

ArgMacros (struct)
170.0

real    0m1.008s
user    0m1.328s
sys     0m0.656s

ArgMacros (tuple)
170.0

real    0m1.030s
user    0m1.266s
sys     0m0.734s

ArgMacros (dict)
170.0

real    0m1.000s
user    0m1.313s
sys     0m0.656s

ArgParse
170.0

real    0m4.220s
user    0m4.375s
sys     0m0.781s

Minimal inputs test (args passed to separate function as single object)
time julia --project benchmark_foo.jl  -- "OTHER TEST STRING F" --aa=5
Expected output 170.0 for all scripts

ArgMacros (inline)
N/A

ArgMacros (struct)
170.0

real    0m0.992s
user    0m1.391s
sys     0m0.516s

ArgMacros (tuple)
170.0

real    0m0.961s
user    0m1.094s
sys     0m0.656s

ArgMacros (dict)
170.0

real    0m1.037s
user    0m1.250s
sys     0m0.563s

ArgParse
170.0

real    0m4.252s
user    0m4.313s
sys     0m0.828s

Since the arguments are used in only a very simple way, you can’t fully see the effects of adding type information and allowing specialization, or of passing the args object into separate functions, but you can see that ArgMacros is generally much faster than equivalent code using ArgParse.

On master you can try the hardcore latency reducer:

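# Run this module with minimal compilation, -O1, and no type inference to cut latency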
if isdefined(Base, :Experimental) && isdefined(Base.Experimental, Symbol("@compiler_options"))
    @eval Base.Experimental.@compiler_options compile=min optimize=1 infer=false
end
7 Likes

OK, I’ll test as soon as I get a chance. Meanwhile, that 4s you are getting with ArgParse is about what I remember from when I was using it, but again, it’s been a while, and certainly before 1.5.

Of course, my config file approach beats both of them :wink:

I have to admit the ArgParse method of creating the arguments looks clean compared to the @macro interface of ArgMacros. But I do very much like the fact that the types are explicitly assigned, and I really like the struct creation capability!

I will certainly be trying ArgMacros soon :slight_smile:

Well hopefully you like it :slight_smile:

If I were working on something larger, I’d probably look at Comonicon.jl now that it’s out, because I’ve seen good benchmarks for it. But for a smaller project where I just want to get some command line input working, I’d use ArgMacros.

I’m not totally sure of the benefits of using a struct vs. a NamedTuple here (a struct will be nicer for writing function signatures), but those are definitely my favorite options, other than maybe inline if you’re doing something really simple.
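
To illustrate the signature point (generic Julia with made-up names, not ArgMacros’ generated code): a struct gives you a short nominal type to dispatch on, while a NamedTuple type has to be spelled out structurally:

# A struct gives a short, nominal type for function signatures
struct Args
    count::Int
    verbose::Bool
end

process(args::Args) = args.verbose ? args.count * 2 : args.count

# The NamedTuple equivalent works, but the signature is much wordier
process_nt(args::NamedTuple{(:count, :verbose), Tuple{Int, Bool}}) =
    args.verbose ? args.count * 2 : args.count

process(Args(3, true))                   # 6
process_nt((count = 3, verbose = true))  # 6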

On version 0.2.2 (about to be released), the benchmarks now look like this:

Comparison of argument parsers in julia

All inputs test (all computations in main function with args retrieval)
time julia --project benchmark_foo.jl  -- "TEST STRING F" -deeee 30 3.14 -b=6.28 --cc ArgMacros -a 2
Expected output 133.76 for all scripts

ArgMacros (inline)
133.76

real    0m0.748s
user    0m1.031s
sys     0m0.734s

ArgMacros (struct)
133.76

real    0m0.830s
user    0m1.156s
sys     0m0.641s

ArgMacros (tuple)
133.76

real    0m0.807s
user    0m1.109s
sys     0m0.672s

ArgMacros (dict)
133.76

real    0m0.820s
user    0m1.016s
sys     0m0.609s

ArgParse
133.76

real    0m4.292s
user    0m4.297s
sys     0m0.969s

All inputs test (args passed to separate function as single object)
time julia --project benchmark_foo.jl  -- "TEST STRING F" -deeee 30 3.14 -b=6.28 --cc ArgMacros -a 2
Expected output 133.76 for all scripts

ArgMacros (inline)
N/A

ArgMacros (struct)
133.76

real    0m0.836s
user    0m0.953s
sys     0m0.609s

ArgMacros (tuple)
133.76

real    0m0.811s
user    0m0.969s
sys     0m0.703s

ArgMacros (dict)
133.76

real    0m0.865s
user    0m1.172s
sys     0m0.578s

ArgParse
133.76

real    0m4.369s
user    0m4.453s
sys     0m0.875s

Minimal inputs test (all computations in main function with args retrieval)
time julia --project benchmark_foo.jl  -- "OTHER TEST STRING F" --aa=5
Expected output 170.0 for all scripts

ArgMacros (inline)
170.0

real    0m0.737s
user    0m0.922s
sys     0m0.672s

ArgMacros (struct)
170.0

real    0m0.839s
user    0m0.984s
sys     0m0.703s

ArgMacros (tuple)
170.0

real    0m0.816s
user    0m1.063s
sys     0m0.609s

ArgMacros (dict)
170.0

real    0m0.842s
user    0m1.156s
sys     0m0.656s

ArgParse
170.0

real    0m4.229s
user    0m4.375s
sys     0m0.828s

Minimal inputs test (args passed to separate function as single object)
time julia --project benchmark_foo.jl  -- "OTHER TEST STRING F" --aa=5
Expected output 170.0 for all scripts

ArgMacros (inline)
N/A

ArgMacros (struct)
170.0

real    0m0.832s
user    0m1.141s
sys     0m0.656s

ArgMacros (tuple)
170.0

real    0m0.803s
user    0m1.031s
sys     0m0.719s

ArgMacros (dict)
170.0

real    0m0.871s
user    0m0.969s
sys     0m0.766s

ArgParse
170.0

real    0m4.226s
user    0m4.344s
sys     0m0.734s

Also, going by the benchmarks provided by @Roger-luo in his Comonicon post, ArgMacros v0.2.2 (inline) is (just barely) faster than the latest releases of all the other options demonstrated:

[zachary@ZACHARY-PC argstest]$ hyperfine "julia example/argparse.jl 2"
Benchmark #1: julia example/argparse.jl 2
  Time (mean ± σ):      3.819 s ±  0.018 s    [User: 3.859 s, System: 0.828 s]
  Range (min … max):    3.789 s …  3.848 s    10 runs

[zachary@ZACHARY-PC argstest]$ hyperfine "julia example/comonicon.jl 2"
Benchmark #1: julia example/comonicon.jl 2
  Time (mean ± σ):      1.181 s ±  0.033 s    [User: 1.243 s, System: 0.672 s]
  Range (min … max):    1.134 s …  1.256 s    10 runs

[zachary@ZACHARY-PC argstest]$ hyperfine "julia example/fire.jl 2"
Benchmark #1: julia example/fire.jl 2
  Time (mean ± σ):     836.2 ms ±   5.4 ms    [User: 1.069 s, System: 0.633 s]
  Range (min … max):   829.2 ms … 847.1 ms    10 runs

[zachary@ZACHARY-PC argstest]$ hyperfine "julia example/comonicon_zero.jl 2"
Benchmark #1: julia example/comonicon_zero.jl 2
  Time (mean ± σ):     623.4 ms ±  12.8 ms    [User: 834.2 ms, System: 617.9 ms]
  Range (min … max):   614.4 ms … 656.3 ms    10 runs

[zachary@ZACHARY-PC argstest]$ hyperfine "julia example/argmacros.jl 2"
Benchmark #1: julia example/argmacros.jl 2
  Time (mean ± σ):     580.0 ms ±   4.6 ms    [User: 844.8 ms, System: 647.0 ms]
  Range (min … max):   575.7 ms … 591.4 ms    10 runs

Nice! I’m wondering which parts you found were slow and improved? Maybe I could use some of the ideas too.

The biggest things were using the optimization level 1 setting for my module (suggested in a PR by @Palli) and adding precompilation directives with heavy guidance from SnoopCompile, which I think Comonicon already does. Some things that might (with no testing or evidence) improve the speed of the generated Comonicon Zero code: storing the arguments in local variables as they are processed instead of in an array, generating precompile directives for the command_main function and maybe the user’s main function, and, mainly, avoiding regex matches in favor of direct string checks.
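
As a rough sketch of those first two latency tricks (MyParser and parse_count are made-up illustrations; real directives would be generated with SnoopCompile rather than written by hand):

module MyParser

# Parsing code rarely benefits from full optimization, so compile this
# module at -O1 to cut latency (Base.Experimental.@optlevel, Julia 1.5+)
if isdefined(Base, :Experimental) && isdefined(Base.Experimental, Symbol("@optlevel"))
    @eval Base.Experimental.@optlevel 1
end

parse_count(args::Vector{String}) = parse(Int, first(args))

# Precompile the hot entry point so the work happens once at package
# precompile time instead of at every script startup
precompile(parse_count, (Vector{String},))

end # module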

1 Like