Q | Using Regex and eachmatch() to extract substrings

Hello,

I’m using a package [Form 4.3.0] for symbolic code optimization that outputs results in C and have written a script to convert the code to Julia.

It has worked to date; however, I’ve recently run into a new problem, which is summarized in the following MWCE:

# test_string_match.jl

begin
    # @show c_rhs_str = "x*pow(w[2],4)"
    @show c_rhs_str = "pow(x,2)*pow(w[2],4)"

    rgx = r"pow\((.*),(.*[^.*])\)"

    regex_match = collect(eachmatch(rgx, c_rhs_str, overlap = false))

    for ith_match in regex_match
        @show pow_substr = ith_match.match
        @show rgx_vrbl, rgx_expnt = match(rgx, pow_substr)
    end
end

With the commented-out test string, the output

c_rhs_str = "x*pow(w[2],4)" = "x*pow(w[2],4)"
pow_substr = ith_match.match = "pow(w[2],4)"
(rgx_vrbl, rgx_expnt) = match(rgx, pow_substr) = RegexMatch("pow(w[2],4)", 1="w[2]", 2="4")

gives the expected result. However, the output

c_rhs_str = "pow(x,2)*pow(w[2],4)" = "pow(x,2)*pow(w[2],4)"
pow_substr = ith_match.match = "pow(x,2)*pow(w[2],4)"
(rgx_vrbl, rgx_expnt) = match(rgx, pow_substr) = RegexMatch("pow(x,2)*pow(w[2],4)", 1="x,2)*pow(w[2]", 2="4")

is not the result I expected, which was

pow_substr = ith_match.match = "pow(x,2)"
(rgx_vrbl, rgx_expnt) = match(rgx, pow_substr) = RegexMatch("pow(x,2)", 1="x", 2="2")
pow_substr = ith_match.match = "pow(w[2],4)"
(rgx_vrbl, rgx_expnt) = match(rgx, pow_substr) = RegexMatch("pow(w[2],4)", 1="w[2]", 2="4")

with the two pow() calls parsed into separate substrings.

So I’m wondering if this is a problem with my use of Regex or eachmatch() or both?
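
A likely explanation, offered as a sketch rather than a definitive fix: `.*` is greedy, so the first capture group runs up to the last comma in the string, and eachmatch() then finds only one match spanning both pow() calls. Note also that `[^.*]` inside a character class means "any character except a literal `.` or `*`", which is probably not what was intended. The pattern below replaces `.*` with negated character classes. It assumes the pow() arguments never contain commas or parentheses, so nested calls such as pow(pow(x,2),3) are not handled. The names `pow_args` and `julia_str` are introduced only for this example.

```julia
# Sketch: assumes pow() arguments contain no commas or parentheses
c_rhs_str = "pow(x,2)*pow(w[2],4)"

# [^,()]* stops at the first comma; [^()]* stops at the first parenthesis
rgx = r"pow\(([^,()]*),([^()]*)\)"

# Collect (base, exponent) pairs, one per pow() call
pow_args = [(m.captures[1], m.captures[2]) for m in eachmatch(rgx, c_rhs_str)]

for (base, expnt) in pow_args
    println("base = ", base, ", exponent = ", expnt)
end

# Optional: rewrite each pow(a,b) as (a)^(b) in a single pass
julia_str = replace(c_rhs_str, rgx => s"(\1)^(\2)")
println(julia_str)
```

With this pattern each pow() call is matched on its own, so the second match() inside the loop is no longer needed: the captures are already available on each RegexMatch.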

This doesn’t answer your regex question, but it looks like Form output is largely valid Julia syntax. Perhaps you need to do a few minor transformations like replace(s, "**"=>"^"), but assuming you have, you could use Meta.parse(s) to convert it to (quoted) Julia code.
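
A minimal sketch of that idea follows. The input string, the anonymous-function wrapper, and the test values are made up for illustration, and it assumes the string is valid Julia once `**` has been replaced.

```julia
# Sketch only: assumes the string is valid Julia after replacing `**` with `^`
s = "x**2*w[2]**4"                    # hypothetical sample input
julia_src = replace(s, "**" => "^")   # "x^2*w[2]^4"

# Parse the text into a quoted Julia expression (an Expr)
ex = Meta.parse(julia_src)

# Wrap the expression in an anonymous function of x and w, then call it.
# invokelatest avoids world-age errors if this runs inside a function.
f = eval(:((x, w) -> $ex))
val = Base.invokelatest(f, 3.0, [1.0, 2.0])
println(val)
```

Working on the parsed expression rather than on the raw text means nested calls are handled by the Julia parser instead of by a regex.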