[ANN] OptParse.jl – a composable, type-stable CLI parser

Hi all,

I’ve been working on OptParse.jl, a command-line argument parser for Julia built around three main ideas:

  • type stability: the main goal was to make a CLI library that works well with trimming.
  • composability: an ‘everything is a parser’ approach, where larger CLIs are built from small reusable pieces.
  • parse, don’t validate: validation is embedded in the parser itself, so a successful parse already gives you a valid result.

repo: OptParse.jl
docs: OptParse docs

The design is inspired by libraries like optparse-applicative (Haskell) and Optique (TypeScript), but adapted to Julia’s type system and compilation model.

Quick example

using OptParse

parser = object((
    name = option("-n", "--name", str("NAME")),
    port = option("-p", "--port", integer("PORT"; min = 1000)),
    verbose = flag("-v", "--verbose"),
))

result = argparse(parser, ["--name", "myserver", "-p", "8080", "-v"])

@assert result.name == "myserver"
@assert result.port == 8080
@assert result.verbose == true

The API is organized around a few kinds of “parser building blocks”:

Primitive parsers
Match basic CLI structure such as options, flags, positional arguments, and commands.

  • input = arg(str("INPUT"))
  • output = option("-o", "--output", str("OUTPUT"))
  • verbose = flag("-v", "--verbose")

Value parsers
Convert raw strings into typed, validated values.

These are the validation layer of the system: each one turns a matched string into a value that is already known to be valid.

  • integer("PORT"; min = 1024, max = 65535)
  • choice("MODE", ["debug", "release"])
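As a rough illustration of the "parse, don't validate" idea behind value parsers, here is a minimal sketch in plain Julia. The names IntRange and parsevalue are hypothetical and are not part of OptParse's API; the point is only that the range check happens during parsing, so a caller never holds an out-of-range integer.

```julia
# Hypothetical stand-in for a value parser such as integer("PORT"; min, max).
# None of these names come from OptParse; this only illustrates the idea.
struct IntRange
    metavar::String
    min::Int
    max::Int
end

# Returns an Int on success, or an error message (String) on failure.
# The small Union return type keeps the caller's branches easy to infer.
function parsevalue(p::IntRange, raw::AbstractString)
    n = tryparse(Int, raw)
    n === nothing && return "$(p.metavar): expected an integer, got \"$raw\""
    p.min <= n <= p.max || return "$(p.metavar): $n is outside $(p.min):$(p.max)"
    return n
end

port = IntRange("PORT", 1024, 65535)

parsevalue(port, "8080")   # an Int that is already known to be in range
parsevalue(port, "80")     # an error message, never an invalid Int
parsevalue(port, "http")   # an error message, not an exception
```

Downstream code that receives the Int does not need to re-check the bounds, which is the property the value-parser layer is meant to provide.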

Constructors
Combine smaller parsers into larger ones.

  • object(...) for named collections of parsers
  • or(...) for alternatives

This is what makes subcommands and larger application parsers ergonomic to express.
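To make the "everything is a parser" idea concrete, here is a toy model in plain Julia. It assumes a parser is just a function from a token vector to either (value, remaining_tokens) or nothing; command_p, word_p and or_p are hypothetical names, and OptParse's real internals are certainly more involved.

```julia
# Toy model, not OptParse's internals: a parser is a function
# tokens -> (value, remaining_tokens), or nothing on failure.

# Consumes one token as a String.
word_p() = toks -> isempty(toks) ? nothing : (toks[1], toks[2:end])

# Matches a literal command name, then runs an inner parser on the rest
# and tags the result with the command that matched.
function command_p(name, inner)
    return function (toks)
        (isempty(toks) || toks[1] != name) && return nothing
        r = inner(toks[2:end])
        r === nothing && return nothing
        value, rest = r
        return ((cmd = Symbol(name), value = value), rest)
    end
end

# Tries the alternatives in order and returns the first success.
function or_p(ps...)
    return function (toks)
        for p in ps
            r = p(toks)
            r === nothing || return r
        end
        return nothing
    end
end

cli = or_p(command_p("hello", word_p()), command_p("goodbye", word_p()))

cli(["goodbye", "OptParse"])   # ((cmd = :goodbye, value = "OptParse"), String[])
cli(["wave", "OptParse"])      # nothing
```

Because every piece has the same shape, the result of or_p could itself be passed to command_p, which is what makes nested subcommands fall out of the same few building blocks.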

Modifiers
Adjust parser behavior, for example by making something optional or repeatable.

  • default(p, value)
  • optional(p)
  • multiple(p)

Bigger Demo

module HelloWorld

using OptParse

const hello = command("hello", object((;
    cmd = @constant(:hello),
    name = option("-n", "--name", str("NAME")),
)))

const goodbye = command("goodbye", object((;
    cmd = @constant(:goodbye),
    name = option("-n", "--name", str("NAME")),
)))

const parser = or(hello, goodbye)

const Hello = resulttype(hello)
const Goodbye = resulttype(goodbye)

runaction(x::Hello) = println(Core.stdout, "Hello, $(x.name)!")
runaction(x::Goodbye) = println(Core.stdout, "Goodbye, $(x.name)!")

function @main(args::Vector{String})::Cint
    obj = argparse(parser, args)
    isnothing(obj) && return 1

    runaction(obj)
    return 0
end

end # module HelloWorld

and then, after compiling with juliac:

$ helloworld hello --name OptParse
Hello, OptParse!

$ helloworld goodbye --name OptParse
Goodbye, OptParse!

Extensibility

The package is extensible in design, but today new parser families and value parsers still need package-level integration to preserve type stability and trimming behavior.

Current status

This is still experimental and under active development, so expect a fair amount of API churn.

Next steps:

  • automatic usage/help generation (ongoing)
  • some API polish and changes
  • broader real-world validation
  • extra parser types that are still missing
  • extra value parsers

Feedback welcome

The user-facing layer is still intentionally a bit minimal; I’d rather add convenience APIs based on actual usage than guess wrong too early.

I’d especially like feedback on:

  • API ergonomics
  • dispatching mechanism of the parse result
  • readability of parser definitions for medium/large CLIs
  • expected help/usage behavior and style
  • missing parser/value-parser combinators

Moreover, the combinator surface is large enough that real-world stress testing would be especially valuable.

Acknowledgements

This library has very few dependencies but those few have been essential:

  • ErrorTypes.jl
  • WrappedUnions.jl
  • Accessors.jl

Thanks for the amazing work on these!

That’s it, hope you’ll like it!
Cheers

This looks quite nice!

Two little nitpicks (while it’s still experimental):

  • I think object is too general of a name; in the context of a bigger program it won’t be immediately obvious that it has to do with the parsing portion. In your example, naming the variable parser helps; instead could that be the name of the function? p = parser(...)?
  • OptParse vs argparse seems like a mismatch; why not OptParse and optparse?

Nice library, I’ve been trying hard to prevent myself from writing one with a similar scope for the past few weeks, I’m happy that someone did it!

Why not optparse? Because I just got so used to argparse that I never even thought about using optparse instead, which makes much more sense. Thanks for the fresh set of eyes.

Regarding object, I agree that it is a bit too generic, I think that in the context of the library in a vacuum it does make sense, but you raise a good point that in more complex codebases it could be a bit confusing. I kind of like parser from a user perspective but from an internal point of view I’m not sure I’d like to have such ambiguity, but probably I’m just biased having spent too much time with this. Thanks for the feedback, much appreciated!

Congratz on your package! But I do have to ask: Why a new package, and not an improvement PR to ArgParse.jl? Docs

Because this package has a fundamentally different design. ArgParse is macro-based, while this has a compositional/functional approach. There are just too many differences for this to be a simple improvement PR to another package.

I like Haskell and optparse-applicative a lot, so am very pleased to hear about this!

Are NamedTuples meant to be the ‘blessed’ interface? One nice thing about optparse-applicative is that you can define e.g. some config struct

data Config = Config {
    port::Int -- ...
}

and then parse directly into that. Is there an easy way to do that if I define something like this in Julia?

struct Config
   port::Int # ...
end

I know that ultimately NamedTuples and structs are really the same thing, but structs are easier to dispatch on and to reason about since they can be documented & their fields are easy to see, etc.

I’m wondering, in particular, if there’s an easy way to derive parsers for structs based on some introspection. I wouldn’t be surprised if Accessors (or ConstructionBase?) has some code for that already.

Glad you like this!

Very good question. One of the benefits that having such a composable design buys you is that this is basically a DSL, that you can sort of easily lower other syntaxes into. There is definitely a lot of unexplored design space on higher level wrappers for this that I haven’t had the time to properly explore yet.

Your struct idea is something that I’ve been playing with a bit as well, especially for objects. But one problem I’ve been facing is that it’s easy to get the return type of a parser, but very hard to get the parser from the return type. The return type is only a small aspect of a parser, and you’d have to find a way to cram all that behaviour information into and around the return type itself, which Haskell probably has a way of doing that Julia does not (that I know of), other than some custom syntax via macros.

You can definitely define a macro that can be used to go from something like this:

@magicmacro struct MyOptions
    "NAME: this is the help text"
    name::String -- option("-n", "--name") 
end

back to the actual parser, but I am not sure that buys you much more than just going:

const _MyOption = object((
    name = option("-n", "--name", str("NAME");
        help = "this is the help text"),
))
const MyOption = resulttype(_MyOption)

with which you can then dispatch on MyOption just as easily. (Almost: anonymous named tuple types mean that you can’t distinguish two structures if they have the same fields. This is the sole reason why the @constant parser is a thing; it’s basically a way to tag the anonymous struct.)
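The tagging point can be seen with plain NamedTuples. The sketch below assumes the tag lives in the type domain via Val, which is one way to get distinct types; it illustrates the idea only and makes no claim about how @constant is actually implemented.

```julia
# Two NamedTuples with the same field names and field types share one type,
# so dispatch cannot tell them apart.
a = (name = "x",)
b = (name = "y",)
typeof(a) === typeof(b)   # true

# Assumption: putting a tag in the type domain (here with Val) yields
# distinct types. This is not necessarily what @constant does internally.
hello   = (cmd = Val(:hello),   name = "OptParse")
goodbye = (cmd = Val(:goodbye), name = "OptParse")

const Hello   = typeof(hello)
const Goodbye = typeof(goodbye)

# With distinct types, ordinary multiple dispatch picks the right method.
greet(x::Hello)   = "Hello, $(x.name)!"
greet(x::Goodbye) = "Goodbye, $(x.name)!"

greet(hello)     # "Hello, OptParse!"
greet(goodbye)   # "Goodbye, OptParse!"
```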

There’s definitely a lot left to explore though!

Has anyone tried my package ComposableCommands.jl yet? It’s not a parser per se. It’s more like a helper to construct complex commands in a structured way and interpret them into Julia’s Cmd type.

Thanks, @cshen! I guess I’m mostly thinking about how to replicate the applicative functor part of it, which is imo what makes those libraries so elegant. Fundamentally we’d need a function A -> B -> Object which can then be lifted to get Parser A -> Parser B -> Parser Object, and that’s very natural in Haskell because constructors of record types (i.e. structs) have exactly that signature, but it might be a bit more awkward in Julia because of the lack of currying (I think it should still be possible though). Right now in the examples you specify what’s essentially a NamedTuple that holds Parser A and Parser B and then OptParse joins those to get a Parser (NamedTuple((:a, :b))), but I wonder what’s the missing bit of info that we need to instead join them into a Parser Object. That has to be somehow related to the constructor of Object, but I haven’t thought deeper than that. (My type notation here is a bad mishmash of Haskell and Julia, sorry!)
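A rough sketch of that lifting in plain Julia, without currying: a variadic lift takes the constructor plus the field parsers and applies the constructor to the collected values. Parser, token and lift here are hypothetical toy definitions, not OptParse's API, and the Any[] accumulator is not type-stable; a trimming-friendly version would need to recurse over the tuple of parsers instead.

```julia
# Toy model: a parser wraps a function tokens -> (value, remaining_tokens),
# or nothing on failure.
struct Parser{F}
    run::F
end

# Consume one token as a String.
token(::Type{String}) =
    Parser(toks -> isempty(toks) ? nothing : (toks[1], toks[2:end]))

# Consume one token and parse it as an integer type T.
function token(::Type{T}) where {T<:Integer}
    return Parser() do toks
        isempty(toks) && return nothing
        v = tryparse(T, toks[1])
        return v === nothing ? nothing : (v, toks[2:end])
    end
end

# Lift an n-ary function over n parsers: run them in sequence, then apply f
# to the collected values. With f = a struct constructor this gives a
# "Parser Object" rather than a parser of NamedTuples.
function lift(f, ps::Parser...)
    return Parser() do toks
        vals = Any[]
        for p in ps
            r = p.run(toks)
            r === nothing && return nothing
            v, toks = r
            push!(vals, v)
        end
        return (f(vals...), toks)
    end
end

struct Config
    host::String
    port::Int
end

config_p = lift(Config, token(String), token(Int))
config_p.run(["localhost", "8080"])   # (Config("localhost", 8080), String[])
```

In this toy version the "missing bit of info" is just the constructor passed as the first argument, with fields matched by position.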

As it happens, I was thinking about this the other day as I just started an OCaml project (unsurprisingly the main CLI parser cmdliner also uses applicative-style parsing) and wondering whether there was something similar in Julia so this is very timely, excited to see where you go with it :slight_smile:

You might want to have a look at JuliaServices/StructUtils.jl to get a macro that takes care of the magic without having to put your own hands into macro stuff!

That’s certainly interesting and promising. I’ll keep this in mind, thanks!

Looks great. I made something similar a while ago: RomeoV/TrimmableCLIParser.jl

Hi, it’s been a while. Let’s dive in:

OptParse v0.3.0

This is the first breaking change! Here’s a quick rundown of what changed.

1. Public API rename and cleanup :bicycle: :hut:

This is where most of the breaking comes from. Hopefully this new set of names makes the code read better and more intentional.

  • argparse/tryargparse → optparse/tryoptparse
  • object → record
  • multiple → many (zero or more) / many1 (one or more) / repeated (custom)
  • gate → switch
  • resulttype → valuetype
  • various public keyword args were all normalized to snake_case :snake:

Ok, this was the boring part. The first big new thing is

2. Help Metadata and Automatic Help Generation :sos_button:

Help metadata

After… checks notes… 7 attempts I finally managed to get the help system into decent enough shape to be put to the test.

The help information is not embedded inside the individual parsers but instead lives on its own and can be attached to whatever parser you want. This allows composability and reusability of smaller parsers without having to juggle multiple descriptions.

const reuse = option("--name", str("NAME"))

const p1 = record((
    name = reuse |> help("description relative to p1"),
))

const p2 = record((
    name = reuse |> help("description relative to p2"),
))

Another benefit of keeping the help metadata separate is that for parsers that require big descriptions, you can move the prose away from the parsing logic, which helps to keep things terse.

You can stack help messages: they are merged together, with the latest application overwriting previous ones.
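The "latest application wins" merge can be pictured with plain NamedTuples. This is only an analogy: the field names below are invented for illustration and say nothing about how OptParse stores help metadata internally.

```julia
# Hypothetical help metadata modelled as NamedTuples; the field names are
# made up and are not OptParse's internal representation.
base_help = (description = "generic description", metavar = "NAME")
p1_help   = (description = "description relative to p1",)

# merge keeps every field of the first argument and lets the second one win
# on conflicts, mirroring "the latest application overwrites previous ones".
stacked = merge(base_help, p1_help)

stacked.description   # "description relative to p1"
stacked.metavar       # "NAME"
```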

The help metadata is gathered and then used to construct a help message when requested, like so:

Gitlike

A non-trivial git-inspired CLI used to stress `OptParse` parsing, help
generation, and trimming.

The example intentionally mixes global options, nested subcommands, repeated
arguments, and mutually exclusive groups.


Usage: <COMMAND> [ARGS...] [OPTIONS]

Commands:
   status      Show working tree status
   add         Add file contents to the index
   commit      Record changes to the repository
   clone       Clone a repository into a new directory
   push        Update remote refs
   remote      Manage configured remotes

Options:
   -C <PATH>           Working directory
   -c <KEY=VALUE>      Config override
   [--paginate]        Paginate
   [--no-pager]        No pager
   [--version]         Version

Examples:
  gitlike status --short
  gitlike commit -m "initial import"
  gitlike remote add origin https://example/repo.git

Which brings me to

Automatic Help Generation

Automatic help generation comes mainly in the form of an application level entry point runparse(parser, argv; ...). This new entry point wraps the basic optparse entry point that takes care of the actual parsing semantics and adds a small layer of CLI policy on top of it:

  • lexical help flags such as --help: prog cmd --help
  • an optional top level positional help command: prog help cmd subcmd
  • customizable behaviour for bare invocation: “what does prog do”?

If more control is required, there is the lower level API available that allows you to

  • generate the raw string: generate_help(parser, argv; progname=...)
  • print the help as needed: print_help(io, parser, argv; progname=...)

If you want positional help explicitly somewhere inside the parser tree, there is a helpcommand() helper that parses invocations such as help cmd subcmd into a HelpRequest, which can be used to render a desired focused help as needed.

How do you parse into a HelpRequest? Great question, thank you for asking:

3. Typed Parsers and Construction :hammer_and_wrench:

Anonymous named tuples are still central to OptParse’s composable model, but there is now a much better story on how to construct and dispatch on named application types.

Dynamic construction: construct

struct ServerConfig
    host::String
    port::Int
end

parser = construct(ServerConfig, record((
    host = option("--host", str("HOST")),
    port = option("--port", integer("PORT")),
)))

Under the hood, construct simply delegates to StructUtils.make. In a normal Julia runtime, that means it can use the full lifting machinery and behaviours from StructUtils to construct custom types.

A small bonus that came for free from the composability is the ability to construct parametric types directly, with the correct type parameters inferred from the result type of the parser.

struct Point{T}
    x::T
    y::T
end

parser = construct(Point, sequence(arg(integer()), arg(integer())))
# parses into a Point{Int64}

Exact construction: construct_exact

Unfortunately, all the introspection going on in StructUtils makes it not very amenable to trimming.
But you can opt in (pun intended) to stricter semantics with construct_exact(T, parser). This path requires exact shape agreement between the parser output and the target type (checked during construction (pun not intended) of the parser).

  • for record children, field names and order must match the struct exactly
  • for sequence children, positional arity and types must match exactly
  • the target type must be concrete

This provides a narrower surface for the verifier to check and therefore be happy about.
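The shape agreement listed above can be approximated with Base reflection alone. The function shape_matches below is a hypothetical sketch for the record case, not OptParse's implementation: it checks that the target is concrete and that field names, order and types agree with a NamedTuple type.

```julia
# Hypothetical check, not OptParse's implementation: does a NamedTuple type
# line up exactly with a concrete struct type?
function shape_matches(::Type{T}, ::Type{NT}) where {T, NT<:NamedTuple}
    isconcretetype(T) || return false                # target must be concrete
    fieldnames(T) == fieldnames(NT) || return false  # same names, same order
    return fieldtypes(T) == fieldtypes(NT)           # same field types
end

struct ServerConfig
    host::String
    port::Int
end

shape_matches(ServerConfig, @NamedTuple{host::String, port::Int})     # true
shape_matches(ServerConfig, @NamedTuple{port::Int, host::String})     # false: order differs
shape_matches(ServerConfig, @NamedTuple{host::String, port::String})  # false: type differs
```

Once such a check has passed, construction can be a direct positional call to the struct's constructor, with no runtime introspection involved.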

This helps the dispatching story a lot once parsing is done. Before, you had to do

const parser = ...
const MyType = resulttype(parser)

doaction(r::MyType) = ...

where the actual shape of MyType was mixed within the parser behaviour and hard to reason about. Now you’d do

struct MyType
    ...
end
const parser = construct(MyType, ...)

doaction(r::MyType) = ...

which is much more declarative and makes it easier to read and understand what’s going on.

Ok, but that looks like a lot of typing. And what if I assume the wrong return type for one of my parser fields?

4. @parser macro :screwdriver:

This is a very lightweight macro that helps with setting up a parser that constructs a user-defined type automatically, taking care of the actual types of the fields.

some_complicated_subparser = or(...)

mytype_p = @parser MyType begin
    "Some description"
    anoption = flag("-f", "--flag")
 
    "Another option"
    branch = some_complicated_subparser

    "Some args for good measure"
    positional = many(arg(str()))
end

This lowers to roughly something like

struct MyType
    anoption::valuetype(...)
    branch::valuetype(some_complicated_subparser)
    positional::valuetype(...)
end

mytype_p = construct_exact(MyType, record((
    anoption = flag("-f", "--flag"),
    branch = some_complicated_subparser,
    positional = many(arg(str())),
)))

This throws away a bit of flexibility for ease of construction and terseness. It’s intended for simpler parsers that wouldn’t really make use of all the composability features, so that the happy path is nice and smooth, while still keeping the ability to construct more complicated and custom parsers as needed.

That’s it, apologies for the long read, and as always: feedback and complaints are welcome!

That looks so great! I’ll have a good occasion to try it soon. :slight_smile: