There are several problems with the Regex support in Base.
- It uses an old version of PCRE2 (10.30, from 8/14/2017 instead of the released version 10.31 from 2/12/2018)
- It only has the 8-bit PCRE2 library
- It is pretty much hard-coded to only work with UTF-8 strings that are not validated (i.e.
- It does not work with multithreading (both because it has some mutable global data that is not allocated per thread, and because for each Regex pattern, there is no synchronization to make sure that only one thread compiles the pattern or sets up the match data structures).
Currently, I define RegexStrMatchInterator and the macro @R_str, so as not to do any piracy. However, unless Base is fixed (or regex.jl and pcre.jl are moved out of Base and into stdlib), it seems I'd need to become a pirate in order not to leave people with inconsistent results (race conditions and memory corruption when used with threads).
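
To make the missing synchronization concrete, here is a minimal sketch of lazily compiling a pattern under a lock, so that only one thread ever performs the compile step. Everything in it is an assumption for illustration: `LazyPattern`, `compiled!`, and the `ncompiles` counter are hypothetical names, not part of Base or of my package, and `Regex(...)` merely stands in for the underlying PCRE2 compile call. It needs Julia 1.3 or later for `Threads.@spawn`.

```julia
# Hypothetical wrapper type; none of these names exist in Base.
mutable struct LazyPattern
    source::String
    compiled::Union{Nothing,Regex}  # stands in for a compiled PCRE2 handle
    lck::ReentrantLock              # guards `compiled` and `ncompiles`
    ncompiles::Int                  # demonstration only: counts compile steps
end

LazyPattern(source::AbstractString) =
    LazyPattern(String(source), nothing, ReentrantLock(), 0)

# Return the compiled pattern, compiling it at most once even when
# several threads ask for it at the same time.
function compiled!(p::LazyPattern)
    lock(p.lck) do
        if p.compiled === nothing
            p.compiled = Regex(p.source)  # assumed stand-in for the compile step
            p.ncompiles += 1
        end
        return p.compiled
    end
end

# Usage: request the compiled pattern from several tasks concurrently.
p = LazyPattern("[0-9]+")
@sync for _ in 1:8
    Threads.@spawn compiled!(p)
end
m = match(compiled!(p), "abc123")
```

Taking the lock on every access is the simplest correct choice. This sketch only covers the compile step; the match data structures would still need to be allocated per thread (or per task) rather than shared.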