String Colon String Would Fit The Language

What is the colon operator called?

Logically, using the colon on strings as an abstract operation would fit the language. The meaning is clear for ordered lists: given “a”, “b”, “c”, “d”, the expression “b”:“c” yields “b”, “c”. An ordered alphabet by itself leaves open which exact elements a list contains, but for any concrete ordered list they are known. If ranges and elements are kept separate, the meaning would be known for any array type; where the range is the elements themselves, the meaning of a subset along one dimension would be known for any dictionary.
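The ordered-list case can be sketched in a few lines. This is only an illustration: `between` is a hypothetical helper name (not part of Base), and it assumes both endpoints actually occur in the list.

```julia
# Hypothetical helper (not a Base function): slice an ordered list by
# its elements rather than by integer positions.
function between(list::AbstractVector, lo, hi)
    i = findfirst(==(lo), list)
    j = findfirst(==(hi), list)
    # Assumption: both endpoints must occur in the list.
    (i === nothing || j === nothing) && throw(ArgumentError("endpoint not in list"))
    return list[i:j]
end

letters = ["a", "b", "c", "d"]
@show between(letters, "b", "c")
```

The result depends entirely on the concrete list passed in, which is the point made above: the alphabet alone does not determine the elements.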

There are many cases where the order of a list matters. For strings, the simplest on the operational level would be dictionary pages, where values also have to be cut in the middle depending on the layout and the size of the string, so that strings are combined with numbers. I don’t know whether that should be “string”:12:“String”:13, or “string”:“String”, 12:13, or [“string”::n12]:[“String”::n13] (see Review of the Collins COBUILD Advanced Learner’s English Dictionary | Antimoon). Layouts that can return such data about their elements should be supported.

I wasn’t very serious about all of that, but it shows that simply having an object that contains two strings with a colon between them, and that knows its simpler use cases (where you give it to a container of ordered strings or a container of written numbers, maybe including simple syntax for mathematical formulas, even in abstract space), could make many common cases much clearer.

It is also an interesting question how to order a dictionary that has German and Estonian letters mixed. Is the “Ü” in the German alphabet the same as the “Ü” in the Estonian alphabet, or is it not? Is it the same word if a German and a Spanish word have exactly the same letters?
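One way to look at the “Ü” question: in Julia the character itself is the same Unicode code point whichever language the word comes from, and only the sort position depends on the alphabet you choose. Below is a rough sketch; `alphabet_key`, `estonian` and `german_like` are hypothetical names made up for this example, and the German ordering is a simplification (it just places “ü” directly after “u”).

```julia
# Hypothetical helper: rank each character by its position in a given alphabet.
function alphabet_key(word::AbstractString, alphabet::Vector{Char})
    # Characters missing from the alphabet sort after all known letters.
    return [something(findfirst(==(c), alphabet), length(alphabet) + 1)
            for c in lowercase(word)]
end

# Estonian alphabet order: z and ž follow š, and õ ä ö ü come before x y.
estonian = collect("abcdefghijklmnopqrsšzžtuvwõäöüxy")
# Simplified, assumed German-like order: ü placed right after u.
german_like = collect("abcdefghijklmnopqrstuüvwxyz")

words = ["tee", "üks", "uni", "zoo"]

@show sort(words)                                        # default: code point order
@show sort(words; by = w -> alphabet_key(w, estonian))
@show sort(words; by = w -> alphabet_key(w, german_like))
```

The three calls give three different orders for the same four words, so any string:string range over a mixed dictionary would need to state which ordering it uses.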

Allowing multidimensional string dictionaries as well would not break it. Say I take a table of chemical elements and express the compounds I can make in GChemCalc syntax. This gives me a whole infinity of mathematical expressions, but it also matters which ones I ever plan to use, so I can leave the syntax wholly abstract.

You can overload (::Colon)(x, y) if you want to. Here is how you can define string1:string2 to be an object “holding” all strings that are lexicographically between string1 and string2.

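# A value that stands for every string lexicographically between `first` and `last`.
struct StrRange
    first::String
    last::String
end

# Overload `:` for strings; `extrema` orders the two endpoints.
(::Colon)(s1::AbstractString, s2::AbstractString) = StrRange(extrema((s1, s2))...)

# Membership test via the default string comparison.
Base.in(s::AbstractString, sr::StrRange) = sr.first <= s <= sr.last

"hello" in "abracadabra":"zebra" # true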

Yes, this is a philosophical question, something about the citizenship of programming models. You make it a first-class citizen, but does it matter whether it is a large object, or a small element that works bloody fast and is often more or less removed by the compiler, leaving only a small rudiment? Is it a mathematical object worth several million computations, or is it worth less than one computation (objects that the compiler removes in many cases after checking only that they structurally fit, objects that the compiler sometimes leaves in place and at other times only asserts that they syntactically fit)?

I am not sure about this — I think it is a practical one. AbstractUnitRange and its subtypes were introduced to provide a compact representation for vectors that have a very regular structure, but in practice these are really just vectors.

In contrast, I don’t think that you want to treat e.g. "abracadabra":"zebra" as an <:AbstractVector, since it has infinitely many elements. If you find @Vasily_Pisarev’s StrRange implementation above useful, just use it in your own code, but note that providing a method for : on types you do not own is type piracy, so I would suggest a different constructor.
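A minimal sketch of what such a constructor could look like, assuming you are happy with a plain function instead of the `:` syntax. The name `strrange` is hypothetical; the struct is repeated here only so the snippet runs on its own.

```julia
# Same idea as StrRange above, but without adding a method to `:`.
struct StrRange
    first::String
    last::String
end

# `strrange` is a hypothetical name; any function you own would do.
# `minmax` puts the two endpoints in order.
strrange(a::AbstractString, b::AbstractString) = StrRange(minmax(a, b)...)

# Extending Base.in for a type you own is not piracy.
# Membership uses Julia's default lexicographic string comparison.
Base.in(s::AbstractString, r::StrRange) = r.first <= s <= r.last

@show "hello" in strrange("abracadabra", "zebra")
@show "hello" in strrange("zebra", "abracadabra")
@show "zzz" in strrange("abracadabra", "zebra")
```

Because both `strrange` and `StrRange` belong to your own code, nothing changes for other packages that use `:` on strings.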


Rather than ranges, perhaps a better analogy is to intervals, which have endpoints & membership but aren’t thought of as having an integer length. It looks like this just works:

julia> using IntervalSets

julia> 1.5 in 0..pi
true

julia> "hello" in  "abracadabra".."zebra"
true

(For the other parts, I will reply later. I definitely agree about the type piracy; my first thought was that this kind of operator must be described at the syntactic level, so that it acts like a nothing. The example override I would rather solve with long function names, which give some hint that there is something going on behind it.) Rather than using an override, I would describe in comments what the variables are, and give some math. This is basically what a primitive does: it knows its math. I mean, pi is pi, and you know that if you resize your screen to million five thousand times five thousand, you will need to check whether your pi still works with your programs, which is even worse if it happens late in the history. Definitely, at the beginning of history, pi just meant 3.14.

Well, this still assumes that string literals are infinite; in reality this is very often not the case. As an analogue, even the numbers in 0..pi form an infinite set.

In my free time, I have written thousands of little programs that know what a “syllable” is.

And thanks anyway that this discussion exists - I just worked out what to do with my complex number [Colons] / dimensions, so I think this one need not be unfruitful either.