Can I input an invalid string into a variable?

Hi!

Suppose I have a string such as:

Part_\d{4}

Putting it into a variable will give:

ERROR: syntax: invalid escape sequence

But if I instead do:

Variable = r"Part_\d{4}"

That works, which is great. But what if I want it to be a function input? Something like:

function GoodIdea(strVar)
     return r strVar
end

Can I do this? Thanks.

Kind regards

julia> a = "Part_\\d{4}";

julia> println(a)
Part_\d{4}

See escape sequences … this is not specific to Julia. Every programming language with strings has some kind of escaping, usually very similar.


I see, it is just nicer for a user of my program to not have to have knowledge about escape sequences and just input it as usual. Regex can get ugly pretty quickly if we have to escape everything.

This is why I hoped there was a function that does what r"Part_\d{4}" does. But since I cannot hold the unescaped text in a string variable first, such a function seems pointless, unless someone can solve it with macro tricks.

Kind regards

julia> var"Part_\d{4}" = -3
-3

julia> abs(var"Part_\d{4}")
3
help?> @var_str
search: @var_str

  var

  The syntax var"#example#" refers to a variable named Symbol("#example#"), even though #example# is not a valid Julia
  identifier name.

  This can be useful for interoperability with programming languages which have different rules for the construction
  of valid identifiers. For example, to refer to the R variable draw.segments, you can use var"draw.segments" in your
  Julia code.

  It is also used to show julia source code which has gone through macro hygiene or otherwise contains variable names
  which can't be parsed normally.

  Note that this syntax requires parser support so it is expanded directly by the parser rather than being implemented
  as a normal string macro @var_str.

  │ Julia 1.3
  │
  │  This syntax requires at least Julia 1.3.

Edit: sorry, I misunderstood the question; this won’t address your issue.


String macros help with this. I’m not entirely clear on why you can’t just use r"..." if you want regexes, but if you want an ordinary string containing a regex (or a portion thereof) you can just use raw strings, e.g. raw"Part_\d{4}".
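A minimal sketch of the two prefixes mentioned above, assuming the pattern from the question. The variable names are made up for illustration only.

```julia
# raw"..." gives an ordinary String in which the backslash is kept as typed.
pattern_text = raw"Part_\d{4}"

# r"..." builds a Regex directly from the literal;
# Regex(...) builds one from any String value, such as the raw string above.
rx_literal = r"Part_\d{4}"
rx_from_string = Regex(pattern_text)

# Both regexes hold the same pattern text and match the same names.
@assert rx_literal.pattern == rx_from_string.pattern
@assert occursin(rx_literal, "Part_0042.bi4")
@assert occursin(rx_from_string, "Part_0042.bi4")
```

So a caller who writes either r"Part_\d{4}" or Regex(raw"Part_\d{4}") never has to double a backslash.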


But if I want a user to input “Part_\d{4}” into a function, how would I allow them to do that?

Say I want to do something like this:

function BadString(varStr)
     return raw(varStr)
end

So that the user does not have to deal with \ etc. for all kinds of symbols.

Kind regards

Tell the user to pass raw"Part_\d{4}"?

If the users are calling your function from Julia, then presumably they will have to learn at least some Julia syntax.

It would be helpful to have a little more context to understand what you are trying to do.


I think you actually just answered what I wanted to do :slight_smile: You are right to just inform the user about putting raw in front of the string. I thought that I could do that step for them, which turns out not to be easily possible.

Kind regards

If they’re inputting Julia expressions you could just have them give you a regex object which has the syntax you want. If they’re inputting raw text from a prompt or a file then you can implement whatever escaping rules you want.


I think it is the second part of your statement I don’t know how to do. If a user provided me with:

Part_\d{4}

I cannot see how I would save this input into a variable etc. Do you mean by reading the data in from a text file? The reason I struggle is that I know it is not possible to write:

var = "Part_\d{4}"

But I know it is possible to do:

var = raw"Part_\d{4}"

Maybe my understanding is flawed / I struggle to explain it, I guess I just want to be able to take a string I know has faulty syntax, save it in a variable and then perform “raw” on it.

Kind regards

You say you know about raw, so is this the kind of use case you expect?

julia> s=readline(stdin);
Part_\d{4}

julia> @show s
s = "Part_\\d{4}"
"Part_\\d{4}"

There is no way to “perform” raw, nor does the string Part_\d{4} have invalid “syntax”.

The string literal "Part_\d{4}" has invalid syntax, but this does not mean that Part_\d{4} is an invalid value for a string to have. It only means that if you need to input this string value in code with a “normal” string literal, you need to type "Part_\\d{4}" instead. This string literal represents the string Part_\d{4}, which is entirely valid.

raw"string" is not something that is “applied”. It is just a way for you to say to the compiler: “oh, and this string literal here, use this other set of rules for parsing it into a string value, a set of rules in which \ is almost always interpreted as just \, and nothing more”. It does change the string value that you get from the literal, but only because you are now using a raw string literal, which follows different rules, rather than a “normal” string literal.
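To make that concrete, here is a small sketch (names are illustrative, and `describe` is a hypothetical helper, not a Base function) showing that the two kinds of literal yield the same value, and that a function only ever receives the value:

```julia
# Both literals produce the same String value; only the spelling in source differs.
normal = "Part_\\d{4}"     # ordinary literal: the backslash must be escaped
rawlit = raw"Part_\d{4}"   # raw literal: the backslash is taken as typed

@assert normal == rawlit
@assert length(normal) == 10             # ten characters in total
@assert count(==('\\'), normal) == 1     # exactly one backslash, not two

# A function sees only the value, so it cannot know (or change)
# which kind of literal the caller typed.
describe(s::AbstractString) = "$(length(s)) characters: $s"

println(describe(rawlit))
```

This is why something like raw(varStr) cannot exist as a function: by the time the function runs, the literal has already been parsed, or has already failed to parse.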

As @Jeff_Emanuel pointed out, if you do not need to type it in code, then this basically does not matter: whoever enters the text can write anything without caring about those rules, and in the end you will get exactly what was put in. The only fault of Jeff’s example is that show uses string literal notation to display the string, so it looks as if it has two backslashes, but this is not the case:

julia> s=readline(stdin);
Part_\d{4}

julia> @show s
s = "Part_\\d{4}"
"Part_\\d{4}"

julia> println(s)
Part_\d{4}
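The same idea as a script-friendly sketch. Here an IOBuffer stands in for stdin or a file (an assumption made so the snippet runs without interaction), and the variable names are illustrative:

```julia
# Simulate one line typed by a user; the IOBuffer plays the role of stdin or a file.
io = IOBuffer(raw"Part_\d{4}" * "\n")
typed = readline(io)   # readline drops the trailing newline by default

# The text arrives exactly as typed, so it can be handed straight to Regex.
rx = Regex(typed)

@assert typed == raw"Part_\d{4}"
@assert occursin(rx, "Part_0001.bi4")
@assert !occursin(rx, "Part_01.bi4")
```

Note that Regex(typed) throws an error if the text is not a valid pattern, so input from real users would probably want a try/catch around it.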

What I’m trying to get at is: are they “inputting” this as Julia code, or as some text that you read from somewhere (prompt, text file, etc.)?


Oh, thanks for clarifying!

I would ideally like both. I have made a function which reads files based on a specific Regex:

## Lists files in the current directory and returns only the applicable ones, e.g. "Part_XXXX.bi4"
function _dirFiles(rgxPat::Regex=Regex("Part_\\d{4}\\.bi4"))
    files = readdir()
    # Filter the list in place, keeping only the names that match the pattern
    filter!(x -> occursin(rgxPat, x), files)
    return files
end

(The reason no path argument is passed to readdir is that I “cd” into the specific folder first.)

I would like a user to be able to use it both from the terminal and from a script file, without having to think about syntax, i.e. by writing:

cd("some/path")
Files = _dirFiles("Part_\d{4}")

Without having to think about escaping, and being able to input a Regex string “directly”.

I now understand that I can just ask the user to “remember to put raw” in front, and that is probably what I will do in the end.

I was just curious if there was a way to go around this and do the raw for the user in the function.

Kind regards

I don’t think it’s a good idea to try to help your users like this. If you don’t require the user to specify whether the input is interpreted as a regular string, a “raw” string or a regex, how will you know what they intend? Maybe they didn’t want regex search, but actually want to find the sequence of characters ‘Part_\d{4}’.

If the users are writing Julia code, and are supposed to supply regexes, it is better to tell them to actually create a regex (it is really super easy and useful) rather than secretly turning their input into a regex behind their backs. By trying to make it easy for them, you might instead make things inscrutable and confusing, and even make some things impossible to do.
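One way to keep the intent explicit is to let the argument type decide. The following is only a sketch under my own assumptions: `select_files` is a hypothetical name, and it filters a given vector of names rather than calling readdir(), so that it runs anywhere.

```julia
# Hypothetical helper: keep the names that match.
# A Regex argument is treated as a pattern...
select_files(list, pat::Regex) = filter(n -> occursin(pat, n), list)

# ...while a plain string is treated as literal text, never silently as a regex.
select_files(list, text::AbstractString) = filter(n -> occursin(text, n), list)

file_names = ["Part_0001.bi4", "Part_0002.bi4", raw"Part_\d{4}.txt", "notes.md"]

# Regex search: the caller asks for a pattern explicitly with r"...".
by_pattern = select_files(file_names, r"Part_\d{4}\.bi4")

# Literal search: finds only the name containing the characters Part_\d{4}.
by_text = select_files(file_names, raw"Part_\d{4}")

@assert by_pattern == ["Part_0001.bi4", "Part_0002.bi4"]
@assert by_text == [raw"Part_\d{4}.txt"]
```

With this split the user still types one short prefix, but nothing is converted behind their back, and both kinds of search remain possible.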
