Cleaner more efficient formatted strings for Julia


#1

This is in response to a comment on GitHub by @simonbyrne, https://github.com/JuliaLang/julia/issues/25178#issuecomment-352842624

I’m sad that you haven’t tried my https://github.com/JuliaString/StringLiterals.jl package yet.

I think it is what formatted printing in Julia really should be like. It is clean and extensible, and it doesn’t use any special characters except for the \. A legacy version of the macro can even support the old escape sequences such as $ for interpolation. That is there to stave off @jeff.bezanson’s concerns about having to type a few extra characters ( ), characters that $ interpolation saves at the expense of lots of extra \s for many strings, as well as compatibility with most languages with C-like string literals. That legacy version could be changed (with changes to the FemtoLisp parser) from F"..." (for formatted string) to $"...", which gives a clear indication that the string has $ interpolation.

using Printf, StringLiterals
str = "Speed improvement" ; n = 123 ; a = 0x1000000 ; b = 0x90000 ;
@printf "%-20s %-5d %7.3f%%" str n a/b*100
pr"\%-20s(str) \%-5d(n) \%7.3f(a/b*100)%"

will display:

Speed improvement    123   2844.444%
Speed improvement    123   2844.444%

It should also perform better than the @printf/@sprintf macros, because it doesn’t have to generate different code for every distinct string with multiple % sequences. It’s also cleaner in two ways: you don’t have to quote any %s in the format string, and you don’t have to count the positions of the items being formatted (something I always hated about C-style s/printf). Note that the quoting of % in a sprintf format string is %%, not \%, which is a bit of cognitive dissonance when you are used to \ to quote things that need to be quoted.
Another advantage is that there’s no separation between literal strings and formatted strings; @printf/@sprintf don’t allow any interpolation.
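
A minimal sketch of the two @sprintf points above (positional arguments and %% quoting), using only the Printf standard library so it runs without StringLiterals. The variable names line and pct are just for illustration; the other values are taken from the example above.

```julia
using Printf

str = "Speed improvement"; n = 123; a = 0x1000000; b = 0x90000

# Each argument is matched to a format specifier purely by its position,
# and a literal percent sign has to be written as %% (not \%).
line = @sprintf("%-20s %-5d %7.3f%%", str, n, a / b * 100)
println(line)

# The same %% quoting is needed even in a trivial format string.
pct = @sprintf("%d%%", 50)
println(pct)
```

With the pr"..." form the expression sits right next to its format code, so there is nothing to count and a bare % stays a bare %.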

You can see the differences in generated code here: https://gist.github.com/ScottPJones/2aa13df74e9c8432ef9b81cdc7db9f2e. I believe I can improve the code generation for the pr"…" macro a lot; currently I just use what @tbreloff (Tom Breloff) provided in https://github.com/JuliaIO/Formatting.jl/pull/10, updated to work with the world age changes.
A very nice feature of Tom’s work is that you can set type-specific defaults for formatting, so that \%(v) can use specific defaults for, say, BigFloats instead of Float64, or for Rational{Int} vs. Rational{BigInt}.
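
To illustrate the idea of type-specific defaults, here is a rough sketch built on multiple dispatch and the Printf standard library. It is not the Formatting.jl or StringLiterals implementation: default_spec and fmt_default are hypothetical names invented for this example, and the chosen format strings are arbitrary.

```julia
using Printf

# Hypothetical helper: one default format per type, selected by dispatch.
default_spec(::Type{Float64})   = Printf.Format("%.4f")
default_spec(::Type{BigFloat})  = Printf.Format("%.20f")
default_spec(::Type{<:Integer}) = Printf.Format("%d")

# Hypothetical helper: format a value with the default for its type.
fmt_default(x) = Printf.format(default_spec(typeof(x)), x)

println(fmt_default(1.5))   # uses the Float64 default
println(fmt_default(42))    # uses the Integer default
```

A package could add a default_spec method for its own types in the same way, which is the sort of extensibility described above.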

The modern Unicode escape, \u{...}, avoids the problem of a Unicode escape followed by 0-9, A-F or a-f, and of having to choose \u or \U depending on whether or not the character is in the BMP.
As an example:

julia> "\U1f596For Spock!"
ERROR: syntax: invalid escape sequence
julia> "\U0001f596For Spock!"
"🖖For Spock!"
julia> f"\u{1f596}For Spock!"
"🖖For Spock!"
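
The same pitfall can be reproduced with standard Julia string literals alone. This is a small sketch; the variable names are only for illustration.

```julia
# \u reads up to 4 hex digits and \U up to 8, so a hex-like letter that
# follows the escape is absorbed into the code point.
s1 = "\ue9a"       # one character, U+0E9A, not "é" followed by "a"
s2 = "\u00e9a"     # zero-padding to 4 digits gives "é" followed by "a"

# Outside the BMP you need \U, padded to 8 digits when text follows...
s3 = "\U0001f596For Spock!"
# ...or you have to split the literal so the escape ends at the quote.
s4 = "\U1f596" * "For Spock!"

println(length(s1), " ", length(s2))
println(s3 == s4)
```

The braces in \u{1f596} mark where the code point ends, so neither the padding nor the splitting is needed.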

Finally, I added a number of convenience features that can easily be made extensible (so they could be loaded from packages, and don’t need to be in a Base or stdlib form of this).

julia> pr"LaTeX name: \<dagger>\nEmoji name: \:smile:\nHTML name:  Jos\&eacute;\n"
LaTeX name: †
Emoji name: 😄
HTML name:  José
julia> pr"Unicode:    \N{COUNTING ROD UNIT DIGIT THREE}\n"
Unicode:    𝍢

This makes it easier to have your code display Unicode characters, Emoji, etc., without having to look up the Unicode codepoint, or requiring the use of an editor that has easy ways of entering those characters.
(note: I don’t know of any that has the tables for the whole set of Unicode characters, such as many of the ones > 0xffff)


#2

Thanks for pointing that out, I hadn’t seen it. That does look something like what I had in mind. I personally would prefer to get rid of the C-style format specifiers (e.g. %-+0.4f) as well.


#3

Well, I wouldn’t get rid of them, but also provide other options.
If you read the README.md on my package, you’ll see it also has some support for a more Python-like style, as well as simply doing \%(value; keywordargs...) (something I borrowed 🙂 from Tom’s very nice pull request for Formatting.jl… it bothered me that nobody merged it…). You might prefer that style.