Hello,
This is identical to this.
In my case, I want to split a string (for example a fairly complex anthroponym, with dashes, apostrophes, etc.) and then apply capitalization rules that depend on these punctuation delimiters. I can't find an easy way to keep all the elements, delimiters included, in order, as I could in other languages.
MWE:
> split("123.456.789", r"\.")
["123", "456", "789"] # Current and expected behaviour.
> split("123.456.789", r"(\.)")
["123", ".", "456", ".", "789"] # Expected behaviour because the pattern is inside a capture group.
> split("AAA’s BBB-CCC", r"([’'\s-])") # More complex example.
["AAA", "s", "BBB", "CCC"] # Current behaviour.
["AAA", "’", "s", " ", "BBB", "-", "CCC"] # Expected behaviour.
> split("123.456.789", r"\."; keep=true) # Possible enhancement that stays backward compatible.
["123", ".", "456", ".", "789"]
This is quite similar to this ticket, but the main difference is that the delimiter is kept as a separate element among the others (though the keep argument could be a string such as "separate", "previous" or "next" to be useful for all cases…).
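To make the idea concrete, here is a minimal sketch of the behaviour I have in mind. It is written as a hypothetical helper, split_keep, which is not part of Base; the name, the keep keyword and the use of symbols instead of strings for the three modes are all my own assumptions. It only relies on eachmatch and on the match and offset fields of RegexMatch.

```julia
# Hypothetical helper (NOT part of Base): split `str` on `pattern`, keeping the delimiters.
#   keep = :separate -> each delimiter becomes its own element
#   keep = :previous -> each delimiter is appended to the element before it
#   keep = :next     -> each delimiter is prepended to the element after it
function split_keep(str::AbstractString, pattern::Regex; keep::Symbol = :separate)
    keep in (:separate, :previous, :next) ||
        throw(ArgumentError("keep must be :separate, :previous or :next"))
    parts = String[]
    pos = firstindex(str)  # index where the current piece starts
    carry = ""             # delimiter waiting to be prepended (only used by :next)
    for m in eachmatch(pattern, str)
        isempty(m.match) && continue  # skip empty matches
        # Text between the previous delimiter and this one (may be empty).
        piece = carry * str[pos:prevind(str, m.offset)]
        carry = ""
        if keep === :separate
            push!(parts, piece, m.match)
        elseif keep === :previous
            push!(parts, piece * m.match)
        else  # :next
            push!(parts, piece)
            carry = m.match
        end
        # First index after the delimiter (byte-based, so multi-byte characters are fine).
        pos = m.offset + ncodeunits(m.match)
    end
    push!(parts, carry * str[pos:end])  # trailing piece after the last delimiter
    return parts
end

split_keep("123.456.789", r"\.")                  # ["123", ".", "456", ".", "789"]
split_keep("AAA’s BBB-CCC", r"[’'\s-]")           # ["AAA", "’", "s", " ", "BBB", "-", "CCC"]
split_keep("123.456.789", r"\."; keep=:previous)  # ["123.", "456.", "789"]
split_keep("123.456.789", r"\."; keep=:next)      # ["123", ".456", ".789"]
```

With this sketch, adjacent delimiters produce empty strings between them, much like split does by default. A built-in keep argument might of course be designed differently.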
I initially planned to open a new ticket on GitHub, but maybe I missed something!
Sincerely.