Parsing a string with quotations

I’m writing a small package that involves some user interaction in the style of Unix interactive command-line utilities. The package SimpleArgParse.jl, with some adaptations, would do the parsing; however, it needs to be given the arguments as a vector of strings (like ARGS). Command-line arguments are presumably split by the OS or the shell, but for interactive use I must do it on my own. For simple cases like
-a AA -b BB
the split(s) function would do the job perfectly. Now, I can have quoted text like
-f "C:\user folder\"
to be made into
["-f", raw"C:\user folder\"]
and all kinds of special cases that can arise.
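
To make the difference concrete, here is a minimal sketch of what plain split does with both inputs. The variable names are only for illustration.

```julia
simple = "-a AA -b BB"
quoted = "-f \"C:\\user folder\\\""   # the text  -f "C:\user folder\"

# Whitespace splitting is enough when no argument contains a space.
println(split(simple))

# With a quoted argument, the path is cut at the space inside the quotes
# and the quote characters stay attached to the pieces.
println(split(quoted))
```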

I hope (and am almost sure) that a ready-to-use solution for this exists.

For interactive use, why not just have the user pass an array of strings via Julia code? Julia is already a perfectly good language for interactivity, so why would you emulate a more primitive language like sh on top of this?


In an act of escape gymnastics worthy of Houdini, you can try the following:

julia> s = "-a AA -b BB"
"-a AA -b BB"

julia> readlines(`sh -c "for arg in $s; do printf \"%s\\n\" \"\$arg\" ; done"  0`)
4-element Vector{String}:
 "-a"
 "AA"
 "-b"
 "BB"

julia> s = "-f \"C:\\\\user folder\\\\\""
"-f \"C:\\\\user folder\\\\\""

julia> v = readlines(`sh -c "for arg in $s; do printf \"%s\\n\" \"\$arg\" ; done"  0`)
2-element Vector{String}:
 "-f"
 "C:\\user folder\\"

julia> println(v[2])
C:\user folder\

julia> println(s)
-f "C:\\user folder\\"

The goal is to use the actual command shell to parse the arguments, and it works. The DOS-style paths, with the backslash as directory separator, posed an additional challenge, but that challenge may be present even in ideal cases.

Also, historical note, Houdini actually died from his escape artistry.

There is the undocumented internal function Base.shell_split that does more-or-less what you want:

julia> Base.shell_split("-f \"C:\\user folder\"")
2-element Vector{String}:
 "-f"
 "C:\\user folder"

But shelling out sucks because of all the tricksy escaping, and it’s still not clear to me why you want to emulate this behavior in Julia rather than just using Julia as your interface.
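
A small sketch of how this might be used on a line with mixed arguments. Keep in mind that Base.shell_split is internal and undocumented, so its behavior could change between Julia versions. The variable names here are made up for the example.

```julia
# Base.shell_split is not public API; treat this as a sketch, not a guarantee.
line = "-a arg1 -b arg2 -f \"C:\\user folder1\" -c arg3"
args = Base.shell_split(line)
println(args)

# A backslash directly before the closing quote escapes that quote under
# shell rules, so the quoted argument is never terminated. Catch the error
# rather than letting it end an interactive session.
bad = "-f \"C:\\user folder\\\""      # the text  -f "C:\user folder\"
result = try
    Base.shell_split(bad)
catch err
    err
end
println(result)
```

So the original example with the trailing backslash is a case the user would have to write differently, for instance by doubling the final backslash.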

I would like to make sure I understand the context.
You have as input to a function like readline() a string like this:

julia> rl=readline()
-a arg1 -b arg2
"-a arg1 -b arg2"

and you want to get a vector of the separate options and arguments; for this, the split function is sufficient.

For an input of this type, instead,

julia> rl=readline()
-f "C:\user folder\"
"-f \"C:\\user folder\\\""

you would like to get a vector like this

["-f", raw"C:\user folder\"]

It seems that this syntax is incorrect: even in a raw string literal, \" is read as an escaped quote, so the string is never closed.

julia> ["-f", raw"C:\user folder\"]


ERROR: syntax: incomplete: invalid string syntax
Stacktrace:
 [1] top-level scope
   @ none:1

If so, what should be the correct vector?

maybe this?

["-f", "\"C:\\user folder\\\""]
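
If the quotes are meant to be dropped, the following sketch shows two ways to write a string that ends in a single backslash. The names path_raw and path_esc are only illustrative.

```julia
# In a raw string literal, a backslash directly before the closing quote
# must be doubled; the resulting value still contains just one backslash.
path_raw = raw"C:\user folder\\"

# The same value written as an ordinary string literal.
path_esc = "C:\\user folder\\"

println(path_raw == path_esc)
println(path_raw)

v = ["-f", path_raw]
```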

If the input had options of only one “type”, the subdivision could be done with something like this:

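function parsearg(rl)
    # Split at every space that is immediately followed by an opening quote.
    srl = split(rl, " \"")
    # split consumed the opening quote of each later piece, so put it back.
    for i in 2:length(srl)
        srl[i] = "\"" * srl[i]
    end
    foreach(println, srl)
    return srl
end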

So I guess the problem posed concerns situations where there are “mixed type” options.
Like these?


julia> rl=readline()
-a arg1 -b arg2 -f "C:\user folder1" -c arg3 -g "C:\user folder2"
"-a arg1 -b arg2 -f \"C:\\user folder1\" -c arg3 -g \"C:\\user folder2\""

julia> rl
"-a arg1 -b arg2 -f \"C:\\user folder1\" -c arg3 -g \"C:\\user folder2\""
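
For mixed input like this, one possible approach is a small regex-based tokenizer. This is only a sketch: split_quoted is a hypothetical helper, not part of Base or of any package, and it assumes that backslashes are always literal (which suits Windows paths) and that quotes are balanced.

```julia
# Hypothetical helper: split a line into arguments, keeping double-quoted
# runs together and dropping the surrounding quotes.
function split_quoted(line::AbstractString)
    tokens = String[]
    # First alternative: a double-quoted run, captured without the quotes.
    # Second alternative: a run of non-whitespace characters.
    for m in eachmatch(r"\"([^\"]*)\"|(\S+)", line)
        # Exactly one of the two capture groups is set for each match.
        push!(tokens, something(m.captures[1], m.captures[2]))
    end
    return tokens
end

rl = "-a arg1 -b arg2 -f \"C:\\user folder1\" -c arg3 -g \"C:\\user folder2\""
args = split_quoted(rl)
foreach(println, args)
```

Because no escape processing is done, a path ending in a backslash is accepted as typed, but a quote character can never appear inside an argument. An unbalanced quote, or a quote that starts in the middle of a token, is simply split at whitespace.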

@stevengj, thank you! Base.shell_split should do the job.

Actually, I just want to provide a simple way (simple to implement and simple to use) for the user to supply some minimal information. A file path is not actually the intended input: NativeFileDialog.jl would be used for that instead. I’ve just taken a Windows file path as an illustration of a tricky case, in fact a too tricky one. Yes, in principle I’m aware of all these “escapism” problems.

@rocco_sprmnt21 - thank you, too - see above.

@Dan - a nice workaround 🙂 I just wonder how it would work under Windows (actually, I know).