Regular expression: How do I find a line that contains this rule but also something else?

How do I find a line that contains this rule but also something else?

rx=r"^[0-9]{2}-[0-9]{3}$"
occursin.(rx, string.(dane[:, 6]))

Thx, Paul

Could you maybe give an example line? "Something else" is not very meaningful in this context. At the moment the line is completely described by the rule, since you have the start and end anchors.

I think he means that the line should strictly contain this regex and something other than the pattern?
maybe:

rx = r".+\b[0-9]{2}-[0-9]{3}\b|\b[0-9]{2}-[0-9]{3}\b.+"

But this changes the original rule. Maybe .+ in the middle?
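A quick sketch of what `\b` changes here, using a few sample strings picked only for illustration (they are an assumption, not the poster's full data). A word boundary stops the pattern from matching inside a longer run of digits, but it also rejects a code that is immediately followed by letters:

```julia
# Illustrative sample strings (assumed, not the original column).
samples = ["201-215", "20-215", "Ul wczasowa 8 82-103", "12-234abc"]

plain   = r"[0-9]{2}-[0-9]{3}"       # pattern anywhere in the string
bounded = r"\b[0-9]{2}-[0-9]{3}\b"   # same pattern between word boundaries

plain_hits   = occursin.(plain, samples)    # all four match; "201-215" matches via "01-215"
bounded_hits = occursin.(bounded, samples)  # only "20-215" and "Ul wczasowa 8 82-103" match
```

Whether "201-215" and "12-234abc" should count as hits depends on what the data means, so the choice between the two patterns is a judgment call.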

@programista you’ve been around since 2016, you know you have to provide more info here.

What is ‘something else’, what is dane?


Like this

695-element Array{Any,1}:
 "201-215"
 "20-215"
 "Ul. Sobieskiego nr. jeszcze nie ma punkt"
 "Kamieńskiego 201-215 Brt"
 "Ul wczasowa 8 82-103"
 "Ul.Klifowa 23 72-350"
 "Kamieńskiego 201-215 Bud"
...

Can you please write a bit more? Which parts of which of those lines should be matched?

Try to explain what you want. Show a string that should be matched and explain why. And show a similar line that should not be matched, and explain why.

Please put some more effort into the question. We shouldn’t have to spend a lot of time figuring out what your question means.


This makes me think you might just want to remove ^ and $. With occursin, the regex then matches any line that has the described pattern somewhere in it.
Or see @evad's answer.
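To make the effect of the anchors concrete, here is a minimal sketch using one of the lines posted above. The variable names are only for illustration:

```julia
anchored   = r"^[0-9]{2}-[0-9]{3}$"  # the whole line must be the code
unanchored = r"[0-9]{2}-[0-9]{3}"    # the code may appear anywhere in the line

line = "Ul.Klifowa 23 72-350"

anchored_hit   = occursin(anchored, line)    # false: the line has text before the code
unanchored_hit = occursin(unanchored, line)  # true: the code appears at the end

# match returns the first occurrence, which is handy for checking what was actually found.
found = match(unanchored, line).match        # "72-350"
```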

I really can’t recommend this site highly enough for sorting out all regex-related things:

https://regexr.com

v = ["201-215",
     "20-215",
     "Ul. Sobieskiego nr. jeszcze nie ma punkt",
     "Kamieńskiego 201-215 Brt",
     "Ul wczasowa 8 82-103",
     "Ul.Klifowa 23 72-350",
     "Kamieńskiego 201-215 Bud"]

rpost = r"^[0-9]{2}-[0-9]{3}$"
# This leaves no opportunity for "something else"
occursin.(rpost, v) # matches only 2nd

r1 = r"[0-9]{2}-[0-9]{3}"
# Also matches rows that contain nothing besides the pattern
occursin.(r1, v) # does not match 3rd

r2 = r".*[0-9]{2}-[0-9]{3}.+"
# Fails to match rows where "something else" is only on the left
occursin.(r2, v) # matches 4th and last

r3 = r".+[0-9]{2}-[0-9]{3}.*"
# Fails to match rows where "something else" is only on the right
occursin.(r3, v) # does not match 2nd and 3rd

r4 = r".+[0-9]{2}-[0-9]{3}.*|.*[0-9]{2}-[0-9]{3}.+"
# I think this is the answer
occursin.(r4, v) # does not match 2nd and 3rd but more robust

v2 = [v..., "12-234abc"]
occursin.(r3, v2) # fails to match new string
occursin.(r4, v2) # matches new string
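An alternative way to phrase the same idea, as a rough sketch rather than a definitive answer: test for "contains the code" and "is not only the code" separately instead of packing both into one alternation. The helper name `has_code_and_more` and the digit lookarounds are my own additions, and they assume that a code glued to other digits (as in "201-215") should not count, which the thread never settled:

```julia
# The line is exactly a code and nothing else.
code_only = r"^[0-9]{2}-[0-9]{3}$"

# A code anywhere in the line, not preceded or followed by another digit
# (assumption: "201-215" is a house-number range, not a code).
code_anywhere = r"(?<![0-9])[0-9]{2}-[0-9]{3}(?![0-9])"

# Hypothetical helper: true when the line holds a code plus some other text.
has_code_and_more(s) = occursin(code_anywhere, s) && !occursin(code_only, s)

lines = ["201-215",
         "20-215",
         "Ul wczasowa 8 82-103",
         "12-234abc",
         "Ul. Sobieskiego nr. jeszcze nie ma punkt"]

flags = has_code_and_more.(lines)  # true only for the 3rd and 4th lines
```

Unlike r4, this rejects "201-215", because r4 accepts it through the inner "01-215". If those rows should match, drop the lookarounds and use the plain pattern for `code_anywhere`.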

Thanks, big lesson!