I had a very frustrating debugging session yesterday. My code was running without errors but producing incorrect results. It took at least two hours for me to find this problem:
x = a + really * long +
+ (and + complex) / expression +
- that * (spans - several)
+ lines * of / code
Can you spot the error? The problem is that the expression terminates after the third line because I forgot to hang a trailing + sign there. The fourth line evaluates without error but doesn’t do anything.
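To see the failure mode in isolation, here is a minimal sketch with plain numbers instead of variables (the function names are made up for illustration). In the first function the third line of the expression lacks its trailing +, so the assignment stops there and the last line becomes a separate statement whose value is discarded.

```julia
# Hypothetical reproduction of the bug with literals instead of variables.
function with_missing_plus()
    x = 10 +
        + 3 +
        - 4      # no trailing operator: the assignment ends here
        + 5      # separate statement (unary plus on 5); its value is thrown away
    return x
end

# Same expression with the trailing + restored on the third line.
function with_trailing_plus()
    x = 10 +
        + 3 +
        - 4 +
        + 5
    return x
end

println(with_missing_plus())   # evaluates 10 + 3 - 4
println(with_trailing_plus())  # evaluates 10 + 3 - 4 + 5
```

Both functions run without any error or warning, which is what makes the bug hard to spot.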
If you’re wondering why I use the convention of prefixing new lines with the operator, the reason is readability. In my opinion it is much easier on the eyes to parse this:
10
+ 3
- 4
+ 5
… than this:
10 +
3 -
4 +
5
Sadly, Julia’s line break rules encourage the latter style. As a workaround, I usually use the prefix style but add dummy + signs at the end of each line when the expression needs to continue.
My questions for the forum:
Is there a better convention for long expressions which is just as readable as my prefix style but avoids silent line break bugs? Ideally, a missed operator should produce a syntax error.
Would it be possible to have Julia produce a warning for “hanging expressions”, i.e. expressions that do not store results in a variable or return a value from a function?
Out of curiosity, why are hanging expressions allowed in the first place? Is there a use case?
Your editor should indent correctly, and that can serve as a warning.
That said, the best practice is to avoid expressions spanning multiple lines if possible. Group calculations, and make your code more readable. The compiler won’t care; its job is to put it together.
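One way to read "group calculations" is to give each term its own name and then combine the names on a single short line. A sketch with invented variable names and values (nothing here comes from the original code):

```julia
# Hypothetical inputs; the names only mirror the shape of a long expression.
a, b, c, d, e, f = 1.0, 2.0, 3.0, 4.0, 5.0, 6.0

base_term   = a + b * c
ratio_term  = (d + e) / f
offset_term = c * (e - d)

# Every statement fits on one line, so no line break can split the expression.
x = base_term + ratio_term - offset_term
println(x)
```

Since each assignment is complete on its own line, there is no operator left to forget at a line end.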
Is there a better convention for long expressions which is just as readable as my prefix style but avoids silent line break bugs? Ideally, a missed operator should produce a syntax error.
Yep:
x = (a + really * long
     + (and + complex) / expression
     - that * (spans - several)
     + lines * of / code)
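A quick check of why this works, using literals: inside parentheses the parser keeps reading across newlines until it reaches the closing parenthesis, so a line that starts with an operator continues the expression. The second part is a sketch of the "missed operator" case. It assumes that Meta.parse with raise = false returns an error expression rather than throwing, and the exact error message depends on the Julia version.

```julia
# Prefix style inside parentheses: the newlines do not end the expression.
x = (10
     + 3
     - 4
     + 5)
println(x)

# Forgetting an operator between two operands inside the parentheses
# is rejected by the parser instead of being silently accepted.
broken = Meta.parse("(10\n 3)"; raise = false)
println(broken isa Expr && broken.head in (:error, :incomplete))
```

With this style a forgotten operator shows up when the file is parsed, not as a wrong number at run time.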
Would it be possible to have Julia produce a warning for “hanging expressions”, i.e. expressions that do not store results in a variable or return a value from a function?
I don’t think so. Operators aren’t really different from functions, and functions may have side effects. So when you write:
a + b
depending on the operand types, + may be a function that, for example, prints its arguments to the screen, adds b in-place to a, or does whatever else.
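As an illustration of the point, here is a sketch with a hypothetical Counter type (not from any library) whose + method mutates its left operand. With such a method, a bare a + b line does real work even though its result is never stored.

```julia
# Hypothetical type for illustration only.
mutable struct Counter
    n::Int
end

# A + method with a side effect: it updates the counter in place
# and returns the same object.
Base.:+(c::Counter, k::Integer) = (c.n += k; c)

c = Counter(0)
c + 5          # "hanging" expression: the result is unused, yet c has changed
println(c.n)
```

Whether writing such a method is good style is a separate question; the sketch only shows that the language permits it.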
I just tried Sublime Text and VScode with Julia syntax. Neither of them auto-indented a multiline expression, nor de-indented a manually indented expression when a line ended with something other than an operator. Do you know of an editor that helps with this?
Good point. But I wonder if anyone has a practical use case for operators with side effects? After all, Julia has a very strong convention for using exclamation!() to warn when side effects happen in ordinary functions, so I imagine that they are quite rare in operators.
exclamation!() is just a convention, not a compiler requirement. Also, it’s mostly used for functions that mutate their arguments, but there are other kinds of side effects: almost all IO functions, such as println(x), write(io, x) or imshow(im), clearly have side effects but don’t use an exclamation mark.
If you want a realistic example of operators with side effects, consider C++ style IO:
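A minimal sketch of what such an operator might look like in Julia, assuming a hypothetical OutStream wrapper type (Base does not define << for output streams, and this is not taken from any package):

```julia
# Hypothetical wrapper type around any IO object.
struct OutStream{T<:IO}
    io::T
end

# Print the right operand and return the stream so that calls can be chained.
Base.:<<(s::OutStream, x) = (print(s.io, x); s)

buf = IOBuffer()
out = OutStream(buf)

# The value of this expression is never used; the output is the whole point.
out << "x = " << 42 << '\n'

written = String(take!(buf))
print(written)
```

Because << is left-associative, the chain is evaluated from left to right and each call hands the stream on to the next.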
I am using Emacs, with julia-mode, and if I press RET after the +, the next line starts indented properly. I am pretty surprised that other editors don’t do this.
I always wrap multiline expressions in parentheses to avoid errors like the one you mentioned. I developed this habit after reading PEP8, which says
The preferred way of wrapping long lines is by using Python’s implied line continuation inside parentheses, brackets and braces. Long lines can be broken over multiple lines by wrapping expressions in parentheses. These should be used in preference to using a backslash for line continuation.
This is also a common recommendation in JavaScript style guides, as JavaScript has “automatic semicolon insertion” which some people find unintuitive.
Experiments suggest that the compiler does not (yet?) optimize this sort of construct very well for nontrivial x. This differs from, e.g., C++ and Fortran compilers, which are designed to spend more effort on optimization. So I beg to differ with @Tamas_Papp: the compiler does want help here.
So I vote for lots of parentheses. (Presumably that’s what Tamas means by grouped calculations; I’d be shocked if a Lisp aficionado like himself would suggest otherwise.)
At this point, my priority is to make my code readable, and wait for the compiler to catch up. Recently I have had multiple occasions of staring at v0.4 code that I mangled to make it a bit faster in performance-critical parts, rendering it very difficult to read. So nowadays I just write my code as I would like, and optimize the occasional critical part.
Compared to the Lisp family, Julia favors breaking expressions up into subexpressions because assignments to new variables do not add a level of indentation. Compare
Thank you everyone, there are several useful suggestions in this thread. I’ll probably wrap my long expressions in parentheses from now on. Special thanks to @dfdx for disarming my gotcha questions with great examples.
This thing just bit me. I had no idea what was happening and invested the same hour that @flcong did. I finally figured out what the deal was and then found this thread via Google. I have learned my lesson and will probably do as @Tamas_Papp suggests and write cleaner code.
A colleague had written some prototype VBA code, one monolithic function of >500 lines, and was considering porting it to C++ for better speed. I rashly suggested I port it to Julia in “a couple of hours”.
The port was easy: replace End If with end, <> with != etc., and the Julia code ran without error the first time. 28 times faster too. But the results were wrong.
My first thought was to step through the code in VBA and in Julia, to see where things diverged. But I came unstuck since the debugger in VSCode kept crashing (I need to file a report). So I was back to debugging via println and it took me half a day to find what turned out to be a mis-translation of:
VBA has an explicit line-continuation marker of space underscore <some code> _. I face-planted when I realised that my semi-automatic translation had simply deleted those two characters without considering whether Julia would see the current line as already complete.
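A sketch of how this kind of mis-translation can play out, with made-up names (the VBA fragment in the comment is only a hypothetical stand-in, not the code from the port):

```julia
# Hypothetical VBA original, shown for reference only:
#     total = first_term _
#           + second_term

# Naive translation: the " _" marker was deleted and nothing else changed.
function naive_translation(first_term, second_term)
    total = first_term
          + second_term    # separate statement; its value is discarded
    return total
end

# One possible repair: parentheses keep the expression open across the newline.
function fixed_translation(first_term, second_term)
    total = (first_term
             + second_term)
    return total
end

println(naive_translation(1, 2))
println(fixed_translation(1, 2))
```

The naive version returns only the first term, without any error or warning from Julia.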
Not sure if anyone would vote for adopting a line continuation marker (say space underscore) in Julia?
There are applications where long lines are hard to avoid and are certainly not a code smell. This is the main water balance constraint from a JuMP model of a hydropower system:
@constraints hydromodel begin
    Water_Balance[t in TIME, p in PLANT],
        Reservoir_content[t,p] ==
            Reservoir_content[shift(TIME, t-1), p] +
            + water_inflow[t,p] +
            + sum(up.flow * Water_discharge[shift(TIME, t - delay_d[up.name]), up.name, j]
                    for j in LINESEGMENT, up in upstream_d[p]) +
            + sum(up.flow * Water_spillage[shift(TIME, t - delay_s[up.name]), up.name]
                    for up in upstream_s[p]) +
            - sum(Water_discharge[t,p,j] for j in LINESEGMENT) +
            - Water_spillage[t,p]
    [...]
end
(In the context of JuMP models we break Julia’s ordinary style conventions in favor of our own conventions more suitable for optimization modeling: e.g. Model_variable, parameter_name, Constraint_name, MODELSET.)
Introducing helper variables to break this equation up would increase the size of the model. Although these helpers might be eliminated in the presolve phase of the solver, they would certainly increase model generation time and memory requirements. I also think the model is easier to read when you see all the terms of the balance at once.
But then you are talking about a DSL (JuMP), not Julia. I don’t know about JuMP, but Julia of course has referential transparency so it is not an issue.
VS Code does this, has decent support for Julia debugging, and can even run Julia in Jupyter inside a VS Code window.
Visual Studio also does this, and is generally more powerful (i.e., more programmable, with a larger extension/app market), but has a steeper learning curve.
I would break that into terms, and in any case use LogExpFunctions.logsumexp.
Implementing nontrivial formulas you are “given” without at least a tiny bit of investment into thinking about their numerical properties is usually a recipe for disaster, or at least preventable loss of accuracy.
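As a generic sketch of the kind of problem a dedicated logsumexp avoids, compare a literal transcription of log(sum(exp(x))) with a version that subtracts the maximum first. Both functions below are hand-written stand-ins for illustration, not the LogExpFunctions implementation, and the input values are made up.

```julia
# Literal transcription of the formula: exp overflows to Inf for large inputs.
naive_logsumexp(x) = log(sum(exp.(x)))

# Hand-written stable variant: shift by the maximum before exponentiating,
# then add the maximum back. Mathematically the two formulas are identical.
function stable_logsumexp(x)
    m = maximum(x)
    return m + log(sum(exp.(x .- m)))
end

v = [1000.0, 1000.0]
println(naive_logsumexp(v))    # Inf, because exp(1000.0) overflows in Float64
println(stable_logsumexp(v))   # finite, equal to 1000 + log(2) up to rounding
```

In real code the package function is the better choice; the sketch only shows why the textbook formula cannot be typed in as given.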