Regular Expression


#1

Hello,

I am trying to convert my data management code from R to Julia and have run into some minor challenges. In particular, I am trying to use regular expressions to extract the text between two words. For example,

test = "Top Skills & Expertise Project Management Cooking Childcare Tutoring Most Recommended"
where I would like to extract the pattern “Project Management Cooking Childcare Tutoring”.

I am fairly confident using regular expressions in R, but I am having some trouble with Julia’s use of the PCRE library.

Any suggestions on pattern extraction and resources for PCRE would be most helpful.

Thanks,

James


#2

Can you post the R code you were using too? It’s not clear to me how you want to identify the pattern.


#3

I have used this site: www.regular-expressions.info, and found it to be useful.

If you had many strings that began “Top Skills & Expertise” and ended “Most Recommended”, one way to get the words in-between is to use replace:

 test = "Top Skills & Expertise Project Management Cooking Childcare Tutoring Most Recommended"
 prefix = "Top Skills & Expertise "
 suffix = " Most Recommended"

 result = replace(replace(test, prefix => ""), suffix => "")

The result is “Project Management Cooking Childcare Tutoring”.
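
For the "many strings" case, here is a minimal sketch of the same idea applied to a whole vector with broadcasting. It assumes Julia 1.x, where `replace` takes `pattern => replacement` pairs. The helper name `strip_markers` and the second entry in `tests` are made up for illustration.

```julia
# Hypothetical input: the second string is invented for illustration only.
tests = [
    "Top Skills & Expertise Project Management Cooking Childcare Tutoring Most Recommended",
    "Top Skills & Expertise Gardening Most Recommended",
]

prefix = "Top Skills & Expertise "
suffix = " Most Recommended"

# Hypothetical helper: delete the prefix, then the suffix, from one string.
strip_markers(s) = replace(replace(s, prefix => ""), suffix => "")

# The dot applies the helper to every element of the vector.
results = strip_markers.(tests)
```

Note that plain-string `replace` removes every occurrence of the prefix or suffix, not just the ones at the ends, so this is only safe if those phrases cannot also appear in the middle.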


#4

rex is a great R package for working with regular expressions in a sane way. I think that, except for minor differences, the generated regular expressions should copy over to Julia without too much modification. P.S. rex would be a great package to port to Julia; it is on my to-do list.


#5
replace(test, r"Top Skills & Expertise (.*) Most Recommended" => s"\1")
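
If you want the captured text without rewriting the string, `match` with a capture group may be closer to what you would do in R. This is a sketch assuming Julia 1.x; the variable names `skills` and `between` are just for illustration. `match` returns `nothing` when the pattern is not found, so it is worth checking for that before reading the capture.

```julia
test = "Top Skills & Expertise Project Management Cooking Childcare Tutoring Most Recommended"

# Capture group: everything between the two marker phrases (non-greedy).
m = match(r"Top Skills & Expertise (.*?) Most Recommended", test)
skills = m === nothing ? nothing : m.captures[1]

# Alternative with PCRE lookbehind/lookahead: the match itself is the text in between.
m2 = match(r"(?<=Top Skills & Expertise ).*?(?= Most Recommended)", test)
between = m2 === nothing ? nothing : m2.match
```

Both `skills` and `between` are `SubString`s that point into `test`; wrap them in `String(...)` if you need an independent copy.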