Unicode: a bad idea, in general

image

Can you find the \rho? And not confuse it with p?

4 Likes

As far as I can tell, using l is, in general, also a bad idea. Maybe the ascii alphabet should be pruned?

17 Likes

Haha, your crusade continues!

Perhaps this is a font problem, not a Unicode problem…

38 Likes

I suggest removing a and o and l and I from the list of acceptable symbols. Probably best to also disallow O, 0 and 1, too, just to be sure.

14 Likes

Not sure I agree. It also depends on what font you use.

I can clearly tell where the rho is. Also I would avoid l like others have said. Use \ell instead.

3 Likes

Precisely, and there is very little control people have in general (github, vs your local editor, vs …).
Meaning, potential for confusion is considerable.

1 Like

We could introduce reasonable replacements that are easier to distinguish. 0 gets 🦄, 1 gets 🌈…? I am 🌈🦄🦄% in for that.

I think it is really a font problem, and maybe one solution is to use either ρ or p in a given part of your code, but not both (they are too close to each other).
Or one uses descriptive variable names instead of those from the paper. Though for points p,q I also prefer to use p and q. But maybe the rho has a nice, distinct constant name?
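
When the font does not help, the two characters can still be told apart by code point. A minimal sketch, assuming a hypothetical helper name nonascii_chars (it is not part of Julia), that lists every non-ASCII character in a piece of source text:

```julia
# Hypothetical helper (not a Base function): list each distinct non-ASCII
# character in `src` together with its Unicode code point.
function nonascii_chars(src::AbstractString)
    return [(c, "U+" * uppercase(string(codepoint(c), base = 16, pad = 4)))
            for c in unique(src) if !isascii(c)]
end

println(nonascii_chars("p = ρ * g * h"))
```

Running something like this over a file would at least show where a ρ hides among the p's, whatever the font does.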

13 Likes

I usually write cue for q so as to disambiguate from b,p,Γ,9,d,σ,6,ρ,∂. Although when writing multi-threaded code I am sure to instead use queue to stay on-theme.

4 Likes

For my case, UTF-8 is just amazing! I use it all the time with very good results.

For example, take a look at this equation used in the SGP4 orbit propagator:

This is the FORTRAN implementation:

This is the Julia implementation full of UTF-8 symbols:

You can see how close we can make the programming language to the mathematical notation. It turns out that this greatly simplified debugging and reduced errors. I try to avoid, however, UTF-8 symbols in the API (like in keywords), but I am starting to change my mind even on this point.

34 Likes

Why aren’t you using nll₀ instead of n₀′′ or n′′₀? (′ is \prime)

5 Likes

I think your last point is the important one:

For a script you are running, this is totally fine. Maybe provide a reference for where the notation comes from (i.e. the paper), then it should be fine.

For an API, e.g. keyword arguments, I think this is not really recommended, unless really well documented. Otherwise one tends to end up with “magic symbols” that users have to learn. Descriptive keywords are then (often, not always) better.
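
One way to get both, as a minimal sketch: keep the paper's symbols inside the function body and expose descriptive ASCII keywords to callers. The function name, keyword names and default values below are hypothetical and only serve as an illustration.

```julia
# Hydrostatic pressure p = ρ g h, used only as an illustration.
# The function and keyword names are hypothetical.
function hydrostatic_pressure(depth; density = 1000.0, gravity = 9.81)
    ρ, g, h = density, gravity, depth   # paper-style symbols stay internal
    return ρ * g * h
end

hydrostatic_pressure(10.0)                     # uses the default density
hydrostatic_pressure(10.0; density = 1025.0)   # callers never need to type ρ
```

The body still reads like the equation, while nobody calling the function has to know how to enter a Greek letter.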

9 Likes

This seems like style guide material

For me, being able to write \rho and have it render as ρ in Julia code is really a godsend. When working with equations based on physical models, where each variable has a symbolic but also physical meaning, equations just stand out so much clearer.

Being able to look at the equation in the paper and find it very faithfully represented in Julia code is so helpful for not losing touch with the physics. And I also like 🌊 for when I write some code using waves etc. 🙂

I think it is really a nice to have and there is no reason to remove Unicode.

Of course bugs, issues and misunderstandings can occur, but they can also happen without it. So far I have not heard of any terrible example where not being able to spot a rho led to a month of debugging.

Kind regards

5 Likes

The problem described here is not inherent to the Julia language but to physics and math in general, where you often cannot easily distinguish certain letters, especially in handwriting. Just as an example, we would often write Poisson brackets {} with the Greek letter Xi. If you have ever written that by hand, you will quickly see how fast the Xi deteriorates into some weird squiggle. Similar things happen with i and j when written as subscripts.

There is no inherent solution to this problem that does not go against the established practices in physics and maths. So the best way is imo to avoid such look-alike pairs where reasonable, and otherwise just always make sure you know that there are easily mistakable variables. Possibly use a subscript on one of them.
Otherwise @Ronis_BR has the best explanation for why this is a good feature in general.

6 Likes

I think the issue has little to do with Unicode symbols, and more to do with single-letter variable names. It is of course tempting to write (say) \gamma to indicate that stepsize parameter: it’s so close to the math equations in that one paper describing the algorithm! Except, maybe, that other paper uses \alpha… then maybe stepsize is a better name after all 😕

I am guilty of these mistakes myself, but the more I read code, the more I’m convinced that it should not necessarily look like “the math from the paper”: the math from the paper has the whole paper around it, providing context to the symbols. Code usually doesn’t, and expecting it to be readable when sprinkled with a's or \alpha's is a bit too much for me.

7 Likes

I know what you are saying, but part of that mess in Fortran is just Fortran being Fortran (2.0d0 instead of 2). And it still takes five lines in Julia compared to six in Fortran. Breaking up the expression might be a good idea anyway.

In short, I believe there is virtue in avoiding fancy symbols, which may or may not end up being displayed correctly. Just a short while ago, there were still places in the PDF documentation of Julia where Unicode symbols were either unrecognizable or entirely missing…

1 Like

I am not convinced. Just to type the thing I have to employ four keystrokes instead of three. And auto-completion does not work for \rho…

That would actually make a nice feature for the REPL and VSCode extension, if one could toggle the unicode and \symbol representation of code.
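
A rough sketch of one direction of such a toggle, from Unicode back to \name. It relies on REPL.REPLCompletions.latex_symbols, the table behind tab completion in the REPL standard library; that table is internal and unexported, so treat its name and location as an assumption that may change between Julia versions. The helper names reverse_table and to_backslash are hypothetical.

```julia
using REPL  # standard library

# Internal, unexported table behind REPL tab completion, e.g. "\\rho" => "ρ".
# Being internal, its name and location are an assumption that may change.
const LATEX_SYMBOLS = REPL.REPLCompletions.latex_symbols

# Reverse table: symbol => \name. Several names can share a symbol, so keep
# the shortest (then alphabetically first) to make the result deterministic.
function reverse_table(table)
    rev = Dict{String,String}()
    for (name, sym) in table
        old = get(rev, sym, nothing)
        if old === nothing || (length(name), name) < (length(old), old)
            rev[sym] = name
        end
    end
    return rev
end

const SYMBOL_NAMES = reverse_table(LATEX_SYMBOLS)

# Hypothetical helper: replace each non-ASCII character that has a known
# completion name with that name; everything else passes through unchanged.
function to_backslash(src::AbstractString)
    pieces = map(collect(src)) do c
        isascii(c) ? string(c) : get(SYMBOL_NAMES, string(c), string(c))
    end
    return join(pieces)
end

println(to_backslash("E = ρ * c^2"))
```

The output is meant for reading, not for running: the \name form is what you type before pressing Tab, not valid Julia source by itself.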

2 Likes

if that’s bad, I invite you to see:

image

(idk if it’s bad, but I certainly don’t want to make my pkg impossible for outsiders to contribute to because of this; it’s not a hill worth dying on)

3 Likes

But this is picking one of the best possible cases for using fully-written-out variable names. stepsize is short and sweet, and likely to be used in only a single place in your expression.

Often, variables have no clear ‘meaning’ as such, with any well-recognized, concise name. Which do you prefer of these two, for example:

tanh(β*d/2) / tanh(α*d/2) - 4α*β*k^2 / (k^2 + β^2)^2

or

tanh(square_root_of_wavenumber_square_minus_angular_frequency_squared_over_shear_velocity_squared * thickness/2) / tanh(square_root_of_wavenumber_squared_minus_angular_frequency_squared_over_longitudinal_velocity_squared * thickness/2) - 
4*square_root_of_wavenumber_squared_minus_angular_frequency_squared_over_longitudinal_velocity_squared*square_root_of_wavenumber_square_minus_angular_frequency_squared_over_shear_velocity_squared * wave_number^2 / (wave_number^2 + square_root_of_wavenumber_square_minus_angular_frequency_squared_over_shear_velocity_squared^2)^2

There are some other ways of doing this, but without unicode, most would probably opt for

tanh(beta*d/2) / tanh(alpha*d/2) - 4alpha*beta*k^2 / (k^2 + beta^2)^2

and that is less readable than the first, and no more explicit.
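
To make the comparison runnable, here is a minimal sketch that defines α and β the way the long names above spell them out (square root of wavenumber squared minus angular frequency squared over the longitudinal or shear velocity squared). The function names and the keyword names c_long and c_shear are hypothetical, and the input values are arbitrary.

```julia
# α and β follow the long names in the post:
#   α = sqrt(k^2 - ω^2 / c_long^2),  β = sqrt(k^2 - ω^2 / c_shear^2)
# Function and keyword names are hypothetical. Assumes both square-root
# arguments are non-negative; otherwise sqrt throws a DomainError.
function wave_expression(k, ω, d; c_long, c_shear)
    α = sqrt(k^2 - ω^2 / c_long^2)
    β = sqrt(k^2 - ω^2 / c_shear^2)
    return tanh(β*d/2) / tanh(α*d/2) - 4α*β*k^2 / (k^2 + β^2)^2
end

# Same computation with ASCII names, for comparison.
function wave_expression_ascii(k, omega, d; c_long, c_shear)
    alpha = sqrt(k^2 - omega^2 / c_long^2)
    beta = sqrt(k^2 - omega^2 / c_shear^2)
    return tanh(beta*d/2) / tanh(alpha*d/2) - 4alpha*beta*k^2 / (k^2 + beta^2)^2
end

wave_expression(2.0, 1.0, 0.5; c_long = 2.0, c_shear = 1.0)
```

Both functions compute the same value; the only difference is how the last line of each body reads.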

I think this is really quite simple: Use unicode symbols when it makes your code more readable.

14 Likes