Digression on short variable names

Great extension! I think that, between Module and let, short names are also okay locally within a function (see sim::Simulation above, or maybe even just s). That only holds for not-too-long functions, of course, but those are something one should aim for anyway :slight_smile:

Annotating arguments with concrete types is very descriptive and tells users of your function almost exactly what they can use it for. The drawback, if you’re just using the annotation to hint at what sort of object something is, is that you’re now locked into a specific type of object, possibly before you’ve thought about what the type hierarchy should be. So it can sometimes be a premature optimization.

If you haven’t figured out the type hierarchy yet, it can often be better to avoid trapping yourself like this; and that’s what Julia’s philosophy of method genericism is about. In those circumstances I think it’s good to use verbose names for function arguments, then internally the name can be shorter. For example:

run!(simulation) = let s=simulation
    while !completed(s)
        step!(s)
    end
    s
end

This way, external users can still get a good idea how to use your function from the argument names, but internally you’re not forced to repeat over and over a long name.

Either that, or default to declaring functions of abstract types such as run!(s::AbstractSimulation), so that when you inevitably decide a month later that you want to fill out the type family (with BatchedSimulation and SequentialSimulation and every other simulation type you didn’t think about the first time), you can just add them as subtypes and not break everything.
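
To make the abstract-type approach concrete, here is a minimal sketch. The struct fields (steps_done, steps_total, batch_size) and the bodies of completed and step! are assumptions invented for illustration; only the names AbstractSimulation, SequentialSimulation, BatchedSimulation, run!, completed, and step! come from the discussion above.

```julia
# Hypothetical type family; the fields are made up for this example.
abstract type AbstractSimulation end

mutable struct SequentialSimulation <: AbstractSimulation
    steps_done::Int
    steps_total::Int
end

mutable struct BatchedSimulation <: AbstractSimulation
    steps_done::Int
    steps_total::Int
    batch_size::Int
end

# Shared behaviour is written once against the abstract type.
completed(s::AbstractSimulation) = s.steps_done >= s.steps_total

# Each concrete subtype only supplies what differs.
step!(s::SequentialSimulation) = (s.steps_done += 1; s)
step!(s::BatchedSimulation) = (s.steps_done += s.batch_size; s)

# run! never mentions a concrete type, so adding a new subtype later
# does not require touching it.
function run!(s::AbstractSimulation)
    while !completed(s)
        step!(s)
    end
    return s
end

sequential = run!(SequentialSimulation(0, 5))
batched = run!(BatchedSimulation(0, 5, 2))
```

A new simulation type added a month later only needs its own step! method (and a completed method if its notion of completion differs).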

7 Likes

I love that!
I’ve been experimenting with let for internal aliases lately
(re also: Brian Chen’s mention in Slack of Swift’s ‘argument labels’, where you can have different names for the public API and for internal use).

Does anyone know if multiline variable bindings are possible with let ?
i.e. something like

simulate(model, parameters, and, more, names) = let (
    m = model,
    p = parameters,
    ⋮
)
    …
end

I’ve been trying different possibilities with ( ) and ;, but they all give a syntax error :smile:


Edit: ofc you can just write

function simulate(model, parameters, and, more, names)
    m = model
    p = parameters
    ⋮
    …
end
1 Like

You can separate bindings with commas:

julia> let a=1,
           b=2,
           c=3
           a+b+c
       end
6

Note that there must be at least one binding on the same line as let; this doesn’t work:

julia> let
           a=1,
           b=2,
           c=3
           a+b+c
       end
ERROR: syntax: invalid assignment location "2" around REPL[181]:2

Because they evaluate left-to-right, later bindings can depend on earlier bindings:

julia> let a=1, b=a+1, c=b+1
           a+b+c
       end
6

See the docs for a short description.

Indeed. However, if you make a habit of this style, locally-declared functions can sometimes run into trouble with captured variables, so you have to be a bit more careful about managing your local namespace. Declaring bindings with let means you don’t have to think about that. You can also prefix each local assignment with local (which has meaning within scoped blocks like let, for, and functions, but not begin...end), but I think let is usually easier.
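
Here is a small sketch of the captured-variable trouble with locally-declared functions, and of how let or local avoids it. The function names (outer_plain, outer_let, outer_local, helper) and the string values are hypothetical and exist only for this illustration.

```julia
# Plain assignment inside a locally-declared function: the nested function
# inherits the enclosing local scope, so `s = ...` overwrites the outer `s`.
function outer_plain()
    s = "outer value"
    function helper(simulation)
        s = simulation
        return s
    end
    helper("inner value")
    return s
end

# `let s = ...` always creates a fresh binding, so the outer `s` is untouched.
function outer_let()
    s = "outer value"
    helper(simulation) = let s = simulation
        s
    end
    helper("inner value")
    return s
end

# `local` declares a new variable in the nested function's own scope.
function outer_local()
    s = "outer value"
    function helper(simulation)
        local s = simulation
        return s
    end
    helper("inner value")
    return s
end

outer_plain()   # the outer `s` was overwritten by the helper
outer_let()     # the outer `s` is unchanged
outer_local()   # the outer `s` is unchanged
```

For a function declared at global scope the plain version is harmless, because there is no enclosing local s to collide with.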

2 Likes

But here you might as well write

function run!(simulation)
    s = simulation
    while !completed(s)
        step!(s)
    end
    return s
end
4 Likes

I agree. There’s no reason to add an additional nested local scope in that example.

1 Like

This is true for globally-declared functions, but can be hazardous for locally-declared functions. I’ve just made it a habit so my style can be consistent across scopes, since there’s no harm in it.

I also tend to write my named function definitions with f(x) = ... instead of function f(x)...end anyway, so maybe that’s why it feels natural for me.

2 Likes

Yes, thank you for the warning, this feels like something that can bite you and cause befuddlement long down the line.

I didn’t know function f(x) … end and f(x) = begin … end are not the same :open_mouth:

So as a takeaway of that linked julialang issue: I should @code_warntype my closures to make sure no Boxes are introduced?

No, they’re the same, but instead of saying f(x) = begin ... end you can also write f(x) = let; ... end, which is [slightly] shorter and allows you to declare local variables without worrying about scope.

Yeah, I think that’d be a good takeaway. Boxes get introduced because the compiler can’t prove from static analysis that the variable will not be changed during the closure’s lifetime, so when it compiles the function it does the safe thing and boxes the value, and then your performance goes down the drain. To get around that, you can use a let block. For example, you might have this:

map(k->m[k], o)

and somewhere in the parent scope, m gets changed so the closure puts m in a box. If you know that you really just wanted to take a snapshot of m at that very instant, you can wrap it in a let block like this:

map(let m=m; k->m[k] end, o)

There are other, more ergonomic ways to achieve this goal, such as:

let m=m; map(k->m[k], o) end
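
As a rough illustration of the boxing behaviour, the sketch below compares a closure that captures a reassigned variable with one that captures a let snapshot. The function names scale_boxed and scale_snapshot are made up for this example, and inspecting the closure with fieldtypes relies on how current Julia versions represent closures, which is an implementation detail rather than a documented guarantee.

```julia
# Hypothetical example functions; the names are not from the thread.

# `r` is assigned twice (as an argument and again in the branch) and is
# captured by the closure, so it ends up stored in a Core.Box.
function scale_boxed(r::Int)
    if r < 0
        r = -r
    end
    return x -> x * r
end

# `let r = r` creates a fresh binding that is assigned exactly once,
# so the closure can capture the value directly.
function scale_snapshot(r::Int)
    if r < 0
        r = -r
    end
    return let r = r
        x -> x * r
    end
end

boxed = scale_boxed(-3)
snapshot = scale_snapshot(-3)

# Both closures compute the same result...
results = (boxed(2), snapshot(2))

# ...but they store the captured variable differently.
boxed_fields = fieldtypes(typeof(boxed))        # expected to hold Core.Box
snapshot_fields = fieldtypes(typeof(snapshot))  # expected to hold Int
```

Running @code_warntype on a call such as boxed(2) is the more usual way to spot the Box, as suggested earlier in the thread.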

Anyway, point is that let blocks are awesome and I’m not shy to use them :sweat_smile:

2 Likes

I just got recommended a video on Youtube. I think it’s absolutely spot on, so I wanted to share it with you: Naming Things in Code - YouTube
Let me know what you think!

I’m quite saddened by the general sentiment in this thread, seemingly preferring short or single letter variable names over just the slight inconvenience of thinking about a better name. The purpose of a longer, more descriptive name has everything to do with readability. Sure, short variable names make sense “in context”, IFF you know that context already and haven’t forgotten it since the last time you touched that code. However, if you have lost that context (or maybe you’re reading a codebase for the first time?), longer variable names are incredibly useful for rediscovering that context. That’s what it means to have self-documenting code and that is the purpose of longer names.

I also heavily disagree with the notion that somehow, it’s more ok to have short variable names in libraries and only need longer ones in applications - how is anyone supposed to maintain a (perhaps critical) library if they can’t easily read & dive into the code?

8 Likes

As the (inadvertent) OP I’d like to point out that the ideal is ‘conciseness’, not terseness. There’s a balance to be struck between scrutability and legibility, and conciseness is the name of the balancing point.

In fact, one should expend a lot of energy coming up with short, pregnant names; it’s one of the important challenges in programming.

The linked thread showcased ‘the worst of both worlds’ imo: long, repetitive names, very similar, visually noisy, and with nearly zero explanatory information. If you don’t already know what they mean, you won’t be able to glean it from reading, and if you do, you’d be better off just calling them k and k'.

Not always, but sometimes, library code is significantly more abstract. Variables don’t necessarily carry much meaning. When you look through Base, it is rife with xs, and quite rightly so: there’s no inherent meaning in the inputs to sqrt or sin or max. Meaningful descriptive names should be in the code that calls these functions, like max(number_of_cats, number_of_dogs), i.e. in the application code, not the library code.

12 Likes

You’re conflating “descriptive names are better” with “longer names are better”. Sure, having similar looking long variable names is no better than having similar looking, short variable names, but I’d say that is not an argument for shorter variable names, but rather against similar looking names. Those are not the same thing.

I think that those names are just as bad for those functions as they are in “longer” or “application” code. They still do carry meaning, especially since those library functions are very often using implicit assumptions for their correctness (which is a bad thing in itself), reverse engineering of which is significantly easier if the name for a variable is descriptive of what its purpose is. That doesn’t mean that every generic loop needs to have a name describing what it exactly is in the greater context of the program as a whole (that’s impossible for libraries), just that it shouldn’t be “some x” if it’s at all possible to avoid. E.g. for sin, call the input at least num to be clear that you’re expecting something that at least superficially resembles a number.

For example, there’s this section in the fallback implementation for hypot:

which I think would be much, MUCH more readable with something like this:

# Pythagoras - a² + b² = c², then sqrt(c²)
hypotenuse = sqrt(muladd(abs_x, abs_x, abs_y*abs_y))

# This branch is correctly rounded but requires a native hardware fma.
if Core.Intrinsics.have_fma(typeof(hypotenuse))
    hypotenuse² = hypotenuse * hypotenuse
    abs_x² = abs_x * abs_x
    # error per side
    error_y          = fma(-abs_y, abs_y, hypotenuse² - abs_x²) 
    error_hypotenuse = fma(hypotenuse, hypotenuse, -hypotenuse²)
    error_x          = fma(abs_x, abs_x, -abs_x²)
    
    # correct the error
    # /2 due to each side being used twice, /hypotenuse due to the errors being relative to that
    hypotenuse -= (error_y + error_hypotenuse - error_x)/(2*hypotenuse)
[...]

Having even slightly longer names makes “clever” one-liners much less enticing to write, precisely because the line would end up taking up way too much space. To be clear, I’m not 100% certain that I’ve refactored the code correctly, in part because reading the original gives no clear signal whatsoever of what any one of the intermediate results even calculates. That is the entire point we’re discussing here: saving some time for people who either haven’t touched the code in a long time or just want to understand how it works in general.

I’m not talking about trivial oneliner library functions like max, and I don’t think anyone here is. I’m talking about some longer functions that are rife with x, y, i, j, k, tmp (WHAT is stored here temporarily?!).

We’re not just writing code to have the computer be as fast as possible, we’re also writing code for others to read, which I’d argue is, generally speaking, the most important factor when writing code. ESPECIALLY writing scientific code that is supposed to be easily understood, reproducible and easy to check for subtle mistakes by people other than the original author (who possibly are used to different conventions than the author…!).

5 Likes

Not really, since I am very much criticizing long names, not descriptive names. (Though, I will admit to disliking descriptive names in cases where I think abstractness is better.)

Descriptive names are good. Long names are bad. That’s the trade-off, and the ideal answer is “conciseness”, hard as it might be.

3 Likes