What’s not fun is that it’s kind of broken… I would want chopsuffix and chopprefix to work according to my definition, i.e. to be able to do e.g.:
chopprefix(s, "http://", "https://") # here meaning drop/rename http to https
chopprefix(s, ["http://", "https://"]) # in case you want to just drop either
Neither works. Both could be added without a breaking change, but my issue is that a straightforward extension is problematic.
As is, I can drop “http”, but it will NOT drop “HTTP”, and I checked: neither will Python. But you can expect either. Things get worse.
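To make the two wished-for call forms concrete, here is a minimal sketch of what they could do. The names `replaceprefix` and `chopprefix_any` are hypothetical helpers made up for illustration, not anything in Base, and the sketch assumes Julia 1.8+ (where `chopprefix` exists). It keeps the current case-sensitive matching.

```julia
# Hypothetical helper (not in Base): swap one prefix for another,
# e.g. "http://" -> "https://". Case-sensitive, like chopprefix today.
replaceprefix(s::AbstractString, old::AbstractString, new::AbstractString) =
    startswith(s, old) ? string(new, chopprefix(s, old)) : String(s)

# Hypothetical helper (not in Base): drop the first prefix in the list that matches.
function chopprefix_any(s::AbstractString, prefixes)
    for p in prefixes
        startswith(s, p) && return chopprefix(s, p)
    end
    return SubString(s)  # nothing matched, return s unchanged
end
```

For example, `chopprefix_any(url, ["http://", "https://"])` would strip whichever scheme is present and leave other strings alone.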
chopprefix(s, "http://", "https://")
chopprefix(s, "HTTP://", "HTTPS://")
But this would miss some cases such as “hTtP”. I.e. there are exponentially many cases to check for, here 16, if the chop functions aren’t made case-insensitive.
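Rather than enumerating all 2^4 = 16 spellings of “http”, a case-insensitive chop can compare character by character. A rough sketch follows; `chopprefix_ci` is a hypothetical name, and the per-character `lowercase` comparison is a simplification I'm assuming is good enough here (it is not full Unicode case folding).

```julia
# Hypothetical helper (not in Base): chop `prefix` from `s`, ignoring case.
function chopprefix_ci(s::AbstractString, prefix::AbstractString)
    i = firstindex(s)
    for pc in prefix
        # Ran out of `s`, or the characters differ even after lowercasing:
        # no match, so return `s` unchanged.
        (i > lastindex(s) || lowercase(s[i]) != lowercase(pc)) && return SubString(s)
        i = nextind(s, i)
    end
    return SubString(s, i)  # everything after the matched prefix
end
```

Walking the original string (instead of lowercasing it as a whole) means the result is still a view into the original, with its original casing intact.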
I don’t work with strings much, […]
could instead be
if endswith(path, ".jl") path / ".jl" end
That looks like a really simple example. That, and the status quo, path[begin:end-3] (and the chop functions), would be a code smell. Because it’s actually a very non-trivial example, I believe it should be in the standard library, with my definition.
If endswith were case-insensitive, it would match “.JL” but would then not strip it, unless / were also case-insensitive, so would you want that? And since it’s an operator, it can’t take a keyword argument. Such a keyword should be added to the chop functions to get the old case-sensitive behavior.
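As a sketch of what such a keyword could look like on the function form: the name `chopsuffix_kw` and the keyword `ignorecase` are both made up here for illustration, nothing like them exists in Base as far as I know. Passing `ignorecase=false` falls back to today's `chopsuffix` (Julia 1.8+).

```julia
# Hypothetical helper (not in Base): chopsuffix with an opt-out keyword.
function chopsuffix_kw(s::AbstractString, suffix::AbstractString; ignorecase::Bool=true)
    ignorecase || return chopsuffix(s, suffix)  # current case-sensitive behavior
    i = lastindex(s)
    for sc in reverse(suffix)  # compare from the end of both strings
        (i < firstindex(s) || lowercase(s[i]) != lowercase(sc)) && return SubString(s)
        i = prevind(s, i)
    end
    return SubString(s, firstindex(s), i)  # everything before the matched suffix
end
```

An operator like `/` has no room for that switch, which is the asymmetry described above.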
On Linux/Unix the ending would most often be .jl (because people are used to the file system being case-preserving). You can actually have a .JL file ending there too, even two different files differing only in .jl vs. .JL… When you run either, Julia doesn’t care. The shebang controls that Julia is invoked, if it’s not invoked directly.
That’s unlike Windows, where you cannot have both files, only one of them. You would most likely use .jl there too, as on Linux, but you must support .JL as well, because the file ending actually controls which program is run, and on Windows both work. That’s mainly why I would want chopprefix to be case-insensitive.
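For the file-ending case specifically, one way to accept both .jl and .JL today is to go through `splitext` and compare the lowercased extension. This is only a sketch; `strip_jl` is a hypothetical name.

```julia
# Hypothetical helper: drop a ".jl" extension regardless of its case.
function strip_jl(path::AbstractString)
    stem, ext = splitext(path)  # e.g. ("script", ".JL")
    return lowercase(ext) == ".jl" ? stem : String(path)
end
```

It works, but it is more ceremony than a case-insensitive chopsuffix would need.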
One “problem”, if we change the definition to be wider, is that while that catches more endings and prefixes (good IMHO, so it should be done soon, i.e. before the 1.11 code freeze), it would not on 1.10, which likely becomes the next LTS. So you have a documented inconsistency. It’s not the end of the world; programs would work as they currently do, imperfectly… and while I would say it could be backported, I know that’s not wanted…
Another thing: arguably chop should also work across Unicode normal forms. [And maybe do a strip first, if you have a file name and it ends with a space for some reason… this is likely to be the most controversial, so maybe not, or at least not by default?]
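To illustrate the normal-form point: the same visible “é” can be one code point (NFC) or “e” plus a combining accent (NFD), and a plain chopsuffix will not match one against the other. A minimal sketch, assuming it is acceptable to normalize both sides to NFC first; `chopsuffix_nfc` is a hypothetical name, and note that it returns a piece of the normalized copy, not of the original string.

```julia
using Unicode

# Hypothetical helper (not in Base): compare after normalizing both sides to NFC.
chopsuffix_nfc(s::AbstractString, suffix::AbstractString) =
    chopsuffix(Unicode.normalize(s, :NFC), Unicode.normalize(suffix, :NFC))

decomposed = "cafe\u0301"  # "café" with a combining acute accent (NFD)
composed_e = "\u00e9"      # precomposed "é" (NFC)
```

Here `chopsuffix(decomposed, composed_e)` leaves the string untouched, while `chopsuffix_nfc(decomposed, composed_e)` strips the accented letter.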
[One objection I could see is that Julia strings are not just for text; they also work for binary data, i.e. arbitrary invalid UTF-8. I’m just unsure the chop* or startswith functions would be used then. Some files start with a magic cookie “string”, such as:
List of file signatures
SQLite format 3
Then you want to match only it exactly, not sqlite… or SQLITE (though it might not be the end of the world?). I just think people wouldn’t use chopprefix (for such file content), or even startswith, and, for now at least, I’m not suggesting changing the latter, just that arguably it should(?) be consistent.]
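For completeness, this is the kind of exact check I have in mind for a magic cookie, where case must matter. It is a sketch only; `is_sqlite` is a hypothetical name and it tests just the text shown above, using today's case-sensitive startswith.

```julia
# Hypothetical helper: exact, case-sensitive match on the file signature text.
is_sqlite(header::AbstractString) = startswith(header, "SQLite format 3")
```

With a case-insensitive startswith, “sqlite format 3” would wrongly pass this check, which is the inconsistency question raised above.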