A Pirate's Life For Me

strings
multithreading
regex

#1

There are several problems with the Regex support in Base.

  1. It uses an old version of PCRE2 (10.30, from 8/14/2017 instead of the released version 10.31 from 2/12/2018)
  2. It only has the 8-bit PCRE2 library
  3. It is pretty much hard-coded to only work with UTF-8 strings that are not validated (i.e. String)
  4. It does not work with multithreading (both because it has some mutable global data that is not allocated per thread, and because for each Regex pattern, there is no synchronization to make sure that only one thread compiles the pattern or sets up the match data structures).
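
The per-pattern synchronization mentioned in item 4 could look roughly like the sketch below. This is only an illustration under stated assumptions: `LazyPattern` and `compiled!` are hypothetical names, the "compile" step is a stand-in (`uppercase`), and none of this reflects how `Base.Regex` is actually laid out.

```julia
# Hypothetical stand-in for a pattern object; NOT the Base.Regex layout.
mutable struct LazyPattern
    source::String
    compiled::Union{Nothing,String}   # placeholder for a compiled handle
    lk::ReentrantLock                 # one lock per pattern
    ncompiles::Int                    # counts how often the expensive step ran
end

LazyPattern(source::AbstractString) =
    LazyPattern(String(source), nothing, ReentrantLock(), 0)

# Compile at most once, even if several tasks ask at the same time.
function compiled!(p::LazyPattern)
    lock(p.lk) do
        if p.compiled === nothing
            p.ncompiles += 1
            p.compiled = uppercase(p.source)   # stand-in for the real compile call
        end
        return p.compiled
    end
end

p = LazyPattern("a+b")
tasks = [Threads.@spawn compiled!(p) for _ in 1:8]
results = fetch.(tasks)
```

Match data would need similar care, for example one buffer per task rather than one shared global.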

Currently, I define RegexStr, RegexStrMatch, RegexStrMatchInterator, and the macro @R_str so as not to commit any piracy. However, unless Base is fixed (or regex.jl and pcre.jl are moved out of Base and into stdlib), it seems I’d need to become a pirate in order not to leave people with inconsistent results (race conditions and memory corruption when used with threads).

Any advice?

Thanks!


#2

I pirate freely if it is reasonable (according to my view of reasonableness) across the runtime domain.
For example, I do a lot of operations with 4x4 homogeneous projection matrices. For that purpose I defined * to handle a 4x4 matrix on the left and, on the right, any 3-element vector type that arises in the code.
The operation first makes the vector homogeneous, performs the *, renormalizes, and returns the result as a 3-vector.
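
A minimal sketch of the operation described above, written as a plain function rather than a new method on `*` so that it involves no piracy. The name `transform_point` is hypothetical, and it returns a tuple rather than any particular vector type.

```julia
# Hypothetical helper: apply a 4x4 homogeneous matrix to a 3-element point.
function transform_point(M::AbstractMatrix, v)
    size(M) == (4, 4) || throw(DimensionMismatch("expected a 4x4 matrix"))
    length(v) == 3 || throw(DimensionMismatch("expected a 3-element vector"))
    h = M * [v[1], v[2], v[3], one(eltype(M))]      # lift to homogeneous coordinates
    return (h[1] / h[4], h[2] / h[4], h[3] / h[4])  # divide by w and drop it
end

# A pure translation by (2, 3, 4).
T = [1.0 0 0 2; 0 1 0 3; 0 0 1 4; 0 0 0 1]
q = transform_point(T, (1.0, 1.0, 1.0))
```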

Maybe the best thing would be to keep the pirate definitions in a separate module, and not publish that module to the global METADATA.

Pirates have rights!
Pirates are people too!


#3

I don’t want to derail the topic (and hope it gets good feedback), but –
TsurHerman,
Hmm? Is this for your own homogeneous projection matrix type? If so, this isn’t piracy. It’s just adding methods for new types, even if some of the arguments are Base types or types from another library (e.g., a vector).
Type piracy is bad, because if someone runs your code it can suddenly change what other code was doing. If you extended methods to work on custom types, no other code would have had YourModule.YourType beforehand, so nothing will change by suddenly adding methods that include it (plus base and other library types) to the list of methods.
(On that note, if your change to what other people’s code does is “functionally equivalent, except it no longer has race conditions or memory corruption when running multithreaded”… odds are zero people wanted race conditions or memory corruption, and nobody will suddenly become flustered and frustrated when they aren’t getting the race conditions and memory corruption they wanted.)

Or are you actually adding new methods to base functions for base types? Like overwriting :*(::Matrix{T}, ::Matrix{T}) where T <: BLAS.BlasFloat?
In which case, that seems weird: why not just create a homogeneous projection matrix type that lets you enforce constraints (like being 4x4), and create convenient constructors?
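
A rough sketch of what such a type might look like. `Projective` and `translation` are hypothetical names chosen for illustration; because the left-hand argument type is defined in the same code as the method, extending `*` this way is not piracy.

```julia
# Hypothetical wrapper type; the name and layout are illustrative only.
struct Projective{T<:Real}
    m::Matrix{T}
    # Inner constructor enforces the 4x4 constraint.
    function Projective{T}(m::Matrix{T}) where {T<:Real}
        size(m) == (4, 4) || throw(DimensionMismatch("projective matrix must be 4x4"))
        return new{T}(m)
    end
end
Projective(m::Matrix{T}) where {T<:Real} = Projective{T}(m)

# Convenience constructor: a pure translation.
translation(dx::Real, dy::Real, dz::Real) =
    Projective([1.0 0 0 dx; 0 1 0 dy; 0 0 1 dz; 0 0 0 1])

# Not piracy: Projective is owned by this code.
function Base.:*(p::Projective, v::NTuple{3,Real})
    h = p.m * [v[1], v[2], v[3], 1]
    return (h[1] / h[4], h[2] / h[4], h[3] / h[4])
end

moved = translation(2, 3, 4) * (1.0, 1.0, 1.0)
```

Code that never constructs a `Projective` sees no change in what `*` does, which is the point being made above.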


#4

My 4x4 PM is a StaticArrays matrix, so size is enforced.
The vector is Point3, Vec3, ntuple, etc.
I didn’t define these types, but I extend .* with these types, so although harmless, I think this still counts as type piracy because it affects the behavior of these types for everyone, even for modules that do not “see” the module where this was defined.


#5