How to parse a string with ranges?

I would like to avoid using eval to parse some user input from the REPL that expects lists of integers and may contain some ranges, as in this MWE:

# INPUT:
str = "1 3:3:12 18 20"

# OUTPUT: 
v = eval(Meta.parse("[" * join(split(str),';') * "]"))

The above works for me, but because eval runs arbitrary code, a malicious user could erase the whole disk with this kind of input, as illustrated here.

So my question is, is there an easy way to parse a string with ranges?
Say, a string like: "3:3:12"

Thank you.

Some ideas for inspiration:

julia> a = "3:3:12"
"3:3:12"

julia> Base._colon(parse.(Int, split(a, ':'))...)
3:3:12

Thanks @jling, I was not aware of this internal tool.

Using eval is definitely not a suggested path.

But stopping short at parsing like so:

julia> vv = Meta.parse("[" * join(split(str),';') * "]")
:([1; 3:3:12; 18; 20])

julia> dump(vv)
Expr
  head: Symbol vcat
  args: Array{Any}((4,))
    1: Int64 1
    2: Expr
      head: Symbol call
      args: Array{Any}((4,))
        1: Symbol :
        2: Int64 3
        3: Int64 3
        4: Int64 12
    3: Int64 18
    4: Int64 20

and then filtering the resulting expression (maybe macro authoring tools can help) and evaluating safely looks promising.
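
A minimal sketch of that filtering idea, assuming the only things we want to accept are integer literals and `a:b` / `a:b:c` calls whose arguments are integer literals. The names `safe_item` and `safe_parse_list` are made up for illustration; nothing is ever passed to eval, the expression tree is only inspected:

```julia
# Hypothetical helper: turn one parsed item into an Int or a range,
# rejecting anything that is not an integer literal or a colon call on literals.
function safe_item(x)
    x isa Int && return x
    if x isa Expr && x.head === :call && length(x.args) in (3, 4) &&
       x.args[1] === Symbol(":") && all(a -> a isa Int, x.args[2:end])
        return (:)(x.args[2:end]...)
    end
    throw(ArgumentError("unsupported input: $x"))
end

# Hypothetical wrapper: parse the whole string, then whitelist every item.
function safe_parse_list(str::AbstractString)
    ex = Meta.parse("[" * join(split(str), ';') * "]")
    # "[a; b]" parses to :vcat, a single item "[a]" parses to :vect
    ex isa Expr && ex.head in (:vcat, :vect) ||
        throw(ArgumentError("unsupported input: $str"))
    return reduce(vcat, map(safe_item, ex.args); init=Int[])
end

v = safe_parse_list("1 3:3:12 18 20")
```

Anything else in the input (function calls, symbols, floats) ends up in the `throw` branch instead of being executed.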


FWIW, a three-liner around @jling’s idea, to replace eval in the OP example (edited with Henrique’s solution, and using unique):

# INPUT:
str = "1 3:3:12 18 20"

# CHECK INPUT: 
!all(isnumeric, filter(x -> x ∉ (':',' '), str)) && throw(DomainError(str, "Only integers>0 and ranges, pls!"))

# OUTPUT
v = reduce(unique ∘ vcat, [(':' ∈ s ? (:)(parse.(Int, split(s, ':'))...) : parse(Int,s)) for s in split(str)])

Maybe checking the length of ranges to prevent allocation of too much memory is also a good idea.
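
A rough sketch of such a check, assuming a cap chosen by the application. `MAX_ITEMS` and `parse_bounded_range` are hypothetical names, and the input is assumed to contain at least one ':'. Ranges are lazy, so `length` can be checked before anything is collected into a vector:

```julia
const MAX_ITEMS = 10_000  # hypothetical limit, pick whatever suits the application

# Hypothetical helper: parse "a:b" or "a:b:c" and refuse ranges that are too long.
function parse_bounded_range(s::AbstractString; maxlen::Int = MAX_ITEMS)
    r = (:)(parse.(Int, split(s, ':'))...)
    # the range itself is cheap; only collecting or vcat-ing it allocates
    length(r) <= maxlen ||
        throw(ArgumentError("range $s has more than $maxlen elements"))
    return r
end

parse_bounded_range("3:3:12")
```

With this in place a token like "1:1000000000" is rejected before it reaches vcat.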

julia> Base._colon(parse.(Int, split("3:3:12", ':'))...)
3:3:12

This should not be the accepted answer. _colon is undocumented and underscore-prefixed, so it should not be used in production code. A better solution (assuming it is okay to throw an exception if the format is incorrect) is:

julia> (:)(parse.(Int, split("3:3:12", ':'))...)
3:3:12
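
If throwing is not acceptable, a non-throwing variant could look roughly like this. It is only a sketch: `tryparse_range` is a made-up name, and returning `nothing` on bad input is my assumption about what the caller would want:

```julia
# Hypothetical helper: return a range, or nothing if the text is not a valid range.
function tryparse_range(s::AbstractString)
    parts = [tryparse(Int, p) for p in split(s, ':')]
    any(isnothing, parts) && return nothing              # non-integer or empty piece
    2 <= length(parts) <= 3 || return nothing            # only a:b or a:b:c
    length(parts) == 3 && parts[2] == 0 && return nothing  # a zero step would throw
    return (:)(parts...)
end

tryparse_range("3:3:12")
```

The caller can then test the result with `isnothing` and ask the user to retype the input.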

I don’t know if it is among the possible cases, but here is a way to handle input with spaces before and/or after ‘:’:

istr=" 2 5 3 :  7 6 2 : 3:  9 12 2  :4  :11 23 1:2:11"

rs=findall(r"\d+ *: *\d+( *: *\d+)*",istr)
rngs=getindex.([istr],rs)
ints=replace(istr,(rngs.=>"")...)
parse.(Int,split(ints))

function parserange(rstr)
    rng=parse.(Int,split(rstr,":"))
    if length(rng)==2  insert!(rng,2,1) end
    range(;zip([:start,:step,:stop],rng)...)
end

parserange.(rngs)

PS

I’d be curious to know alternative regular expressions to the one I found to locate the ranges and, if possible, some expressions to find the integers that are not arguments of a range (i.e. not adjacent to a ‘:’, ignoring spaces).
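
One possible alternative, offered as a sketch rather than a definitive answer: instead of locating the ranges directly, first remove the whitespace around every ':' so that each range becomes a single token, then split on whitespace. The tokens containing ':' are the ranges and the rest are the plain integers:

```julia
istr = " 2 5 3 :  7 6 2 : 3:  9 12 2  :4  :11 23 1:2:11"

# collapse "3 :  7" into "3:7", then split on whitespace
tokens = split(replace(istr, r"\s*:\s*" => ":"))

# integers that are not arguments of a range
ints = [parse(Int, t) for t in tokens if !occursin(':', t)]

# ranges, built with the documented (:) function
rngs = [(:)(parse.(Int, split(t, ':'))...) for t in tokens if occursin(':', t)]
```

This assumes every ':' in the input has an integer on both sides; malformed tokens such as "3:" make `parse` throw.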