I had a very frustrating debugging session yesterday. My code was running without errors but producing incorrect results. It took me at least two hours to find this problem:
    x = a + really * long +
        + (and + complex) / expression +
        - that * (spans - several)
        + lines * of / code
Can you spot the error? The problem is that the expression terminates after the third line because I forgot to hang a trailing + sign there. The fourth line evaluates without error but doesn’t do anything.
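To make the failure mode concrete, here is a minimal sketch with made-up integer values (the names `a`, `b`, `c`, `d` are placeholders, not the variables from my real code). The first assignment silently stops after its second line; the second one shows what was intended.

```julia
a, b, c, d = 1, 2, 3, 4

# The assignment ends on the `+ b` line, because that line has no trailing operator.
x = a +
    + b
    - c    # separate statement: unary minus applied to c, result discarded
    + d    # separate statement: unary plus applied to d, result discarded

# Intended version: every line that continues ends with an operator.
y = a +
    + b +
    - c +
    + d

println((x, y))
```

With these values `x` ends up as `a + b`, while `y` includes all four terms.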
If you’re wondering why I use the convention of prefixing new lines with the operator, the reason is readability. In my opinion it is much easier on the eyes to parse this:
… than this:
Sadly, Julia’s line break rules encourage the latter style. As a workaround, I usually use the prefix style but add dummy + signs at the end of each line when the expression needs to continue.
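As a hypothetical illustration of the two styles (all names and values are invented for this sketch), the first expression uses trailing operators only, and the second uses the prefix style with the dummy trailing + workaround:

```julia
a, b, c, d = 1.0, 2.0, 3.0, 4.0

# Trailing-operator style: the sign of each term sits at the end of the previous line.
x_suffix = a * b +
           c / d -
           a * d

# Prefix style with the workaround: the "real" sign leads each line,
# and a dummy trailing + keeps the expression open.
x_prefix = a * b +
           + c / d +
           - a * d

println(x_suffix == x_prefix)
```

Both forms compute the same value here; the difference is only in where the eye finds the sign of each term.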
My questions for the forum:
Is there a better convention for long expressions which is just as readable as my prefix style but avoids silent line break bugs? Ideally, a missed operator should produce a syntax error.
Would it be possible to have Julia produce a warning for “hanging expressions”, i.e. expressions that do not store results in a variable or return a value from a function?
Out of curiosity, why are hanging expressions allowed in the first place? Is there a use case?
I just tried Sublime Text and VS Code with Julia syntax. Neither of them auto-indented a multiline expression, nor de-indented a manually indented expression when a line ended with something other than an operator. Do you know of an editor that helps with this?
Good point. But I wonder if anyone has a practical use case for operators with side effects? After all, Julia has a very strong convention for using exclamation!() to warn when side effects happen in ordinary functions, so I imagine that they are quite rare in operators.
exclamation!() is just a convention, not a compiler requirement. Also, it’s mostly used for functions that mutate the state of their arguments, but there are other kinds of side effects. For example, almost all IO functions such as println(x), write(io, x) or imshow(im) clearly have side effects but don’t use an exclamation mark.
If you want a realistic example of operators with side effects, consider C++ style IO:
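A hypothetical Julia rendering of that idea follows. The `EchoStream` wrapper and its `<<` method are invented here for illustration and are not part of Base or any library; the point is only that the line using `<<` is a "hanging" expression that exists purely for its side effect.

```julia
# Hypothetical wrapper type, so that we do not add methods for types we do not own.
struct EchoStream
    io::IO
end

# `<<` prints its right-hand side and returns the stream, so calls can be chained.
Base.:(<<)(s::EchoStream, x) = (print(s.io, x); s)

buf = IOBuffer()
out = EchoStream(buf)

# Nothing is assigned here; the expression is evaluated only for its side effect.
out << "hello, " << "world"

written = String(take!(buf))
println(written)
```

A warning on every expression whose value is unused would flag code like the `out << ...` line even though it does exactly what was intended.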
I always wrap multiline expressions in parentheses to avoid errors like the one you mentioned. I developed this habit after reading PEP8, which says
The preferred way of wrapping long lines is by using Python’s implied line continuation inside parentheses, brackets and braces. Long lines can be broken over multiple lines by wrapping expressions in parentheses. These should be used in preference to using a backslash for line continuation.
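The same habit carries over to Julia. A minimal sketch with placeholder values: inside parentheses a newline does not end the expression, so the operator-first style can be used without dummy trailing operators.

```julia
a, b, c, d = 1, 2, 3, 4

# The parentheses keep the expression open until the closing `)`.
x = (a
     + b
     - c
     + d)

println(x)
```

As far as I can tell, a line inside the parentheses that is missing its operator should then fail to parse instead of silently ending the statement, which is the behaviour asked for at the top of the thread.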
Experiments suggest that the compiler does not (yet?) optimize this sort of construct very well for nontrivial x. This differs from, e.g., C++ and Fortran compilers which are designed to spend more effort on optimization. So I beg to differ with @Tamas_Papp: the compiler does want help here.
So I vote for lots of parentheses. (Presumably that’s what Tamas means by grouped calculations; I’d be shocked if a Lisp aficionado like himself would suggest otherwise.)
At this point, my priority is to make my code readable, and wait for the compiler to catch up. Recently I have had multiple occasions of staring at v0.4 code that I mangled to make it a bit faster in performance-critical parts, rendering it very difficult to read. So nowadays I just write my code as I would like, and optimize the occasional critical part.
Compared to the Lisp family, Julia favors breaking expressions up into subexpressions because assignments to new variables do not indent. Compare
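For the Julia side, a hedged sketch with invented names and values: each intermediate result gets its own flat assignment, with no extra nesting level per binding (whereas a Lisp would typically wrap each new binding in a `let` form).

```julia
# Hypothetical inputs for illustration only.
r, theta, h = 2.0, pi / 3, 5.0

# One long expression.
volume_long = (r^2 * theta / 2 - r^2 * sin(theta) / 2) * h

# The same computation broken into named subexpressions; the code stays flat.
sector   = r^2 * theta / 2
triangle = r^2 * sin(theta) / 2
volume   = (sector - triangle) * h

println(volume ≈ volume_long)
```

Each line is now short enough that the line-break question never arises.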
Thank you everyone, there are several useful suggestions in this thread. I’ll probably wrap my long expressions in parentheses from now on. Special thanks to @dfdx for disarming my gotcha questions with great examples.
This thing just bit me. I had no idea what was happening and invested the same hour that @flcong did. I finally figured out what the deal was and then found this thread via Google. I have learned my lesson and will probably do as @Tamas_Papp suggests and write cleaner code.
A colleague had written some prototype VBA code, one monolithic function of >500 lines, and was considering porting it to C++ for better speed. I rashly suggested I port it to Julia in “a couple of hours”.
The port was easy: replace End If with end, <> with !=, etc., and the Julia code ran without error the first time. It was 28 times faster, too. But the results were wrong.
My first thought was to step through the code in VBA and in Julia, to see where things diverged. But I came unstuck since the debugger in VSCode kept crashing (I need to file a report). So I was back to debugging via println and it took me half a day to find what turned out to be a mis-translation of:
VBA has an explicit line-continuation marker: a space followed by an underscore at the end of the line (<some code> _). I face-planted when I realised that my semi-automatic translation had simply deleted those two characters without considering whether Julia would see the current line as already complete.
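A translation script could guard against this by asking Julia's parser whether a line already stands on its own. The helper below is a hypothetical sketch (the name `parses_as_complete` is mine); it assumes that `Meta.parse` with `raise=false` returns an expression with head `:incomplete` or `:error` for input that cannot be parsed as a whole.

```julia
# Hypothetical helper: true if `line` parses as a complete Julia expression by itself.
function parses_as_complete(line::AbstractString)
    ex = Meta.parse(line; raise=false)
    return !(ex isa Expr && ex.head in (:incomplete, :error))
end

# A line that ends in an operator is still waiting for more input.
open_line = parses_as_complete("x = a + b +")

# A line without a trailing operator is already a full statement,
# so whatever followed the deleted continuation marker becomes a separate statement.
closed_line = parses_as_complete("x = a + b")

println((open_line, closed_line))
```

Any VBA line that carried a continuation marker but whose translation reports as complete is a candidate for exactly the bug described above.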
Not sure if anyone would vote for adopting a line continuation marker (say space underscore) in Julia?
There are applications where long lines are hard to avoid and are certainly not code smell. This is the main water balance constraint from a JuMP model of a hydropower system:
    @constraints hydromodel begin
        Water_Balance[t in TIME, p in PLANT],
            Reservoir_content[shift(TIME, t-1), p] +
            + water_inflow[t,p] +
            + sum(up.flow * Water_discharge[shift(TIME, t - delay_d[up.name]), up.name, j]
                    for j in LINESEGMENT, up in upstream_d[p]) +
            + sum(up.flow * Water_spillage[shift(TIME, t - delay_s[up.name]), up.name]
                    for up in upstream_s[p]) +
            - sum(Water_discharge[t,p,j] for j in LINESEGMENT) +
(In the context of JuMP models we break Julia’s ordinary style conventions in favor of our own conventions more suitable for optimization modeling: e.g. Model_variable, parameter_name, Constraint_name, MODELSET.)
Introducing helper variables to break this equation up would increase the size of the model. Although these helpers might be eliminated in the presolve phase of the solver they would certainly increase model generation time and memory requirements. I also think the model is easier to read when you see all the terms of the balance at once.
I would break that into terms, and in any case use LogExpFunctions.logsumexp.
Implementing nontrivial formulas you are “given” without at least a tiny bit of investment into thinking about their numerical properties is usually a recipe for disaster, or at least preventable loss of accuracy.
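To illustrate the kind of numerical issue meant here, the sketch below compares a naive log-sum-exp with a shifted version. The function `stable_logsumexp` is a hand-written stand-in for illustration only, not the implementation in LogExpFunctions, and the input values are made up.

```julia
# Naive version: exp overflows to Inf for large Float64 inputs.
naive_logsumexp(x) = log(sum(exp, x))

# Shifted version: subtract the maximum first, so the largest term is exp(0) = 1.
function stable_logsumexp(x)
    m = maximum(x)
    isfinite(m) || return m   # all -Inf, or an Inf/NaN present: nothing to rescale
    return m + log(sum(xi -> exp(xi - m), x))
end

x = [1000.0, 1000.0]

naive  = naive_logsumexp(x)    # overflows
stable = stable_logsumexp(x)   # finite, equal to 1000 + log(2) up to rounding

println((naive, stable))
```

Both functions implement the same formula on paper; only the second survives inputs of this size, which is why it pays to use the library function rather than transcribe the formula term by term.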