Recommended style guide for naming functions is not practical

I’ve been thinking about this a lot lately, as I’m getting closer to a Genie v1.0 release and I want to finalize and stabilize the API. And I have a big, big, big problem with the recommended guidelines for naming functions. Namely, they involve a large amount of subjectivity, and this doesn’t work well in practice.

functions are lowercase (maximum, convert) and, when readable, with multiple words squashed together (isequal, haskey). When necessary, use underscores as word separators. Underscores are also used to indicate a combination of concepts (remotecall_fetch as a more efficient implementation of fetch(remotecall(...))) or as modifiers.

If a function name requires multiple words, consider whether it might represent more than one concept and might be better split into pieces.

Words like “when readable”, “when necessary”, “combination of concepts”, “consider whether”, “it might represent”, “might be better” are riddled with subjectivity and ambiguity. They will inevitably lead to different expectations between package authors and package users, and between beginners and seniors.


I can tell you from personal experience of over a decade of programming in PHP that mixing squashedcase with snake_case is one of the biggest problems for beginners. It leads to wasted time and energy and introduces countless bugs. These beginners might be your teammates soon. It’s for this reason that the various successful PHP frameworks introduced well-designed style guides.

It’s also for this reason that in many professional work environments, style guides are designed, adopted and enforced. Enforcement is done through a style checker that is part of the CI stack, and basically every language has at least a few good ones. Which brings me to my point: Julia’s lax naming convention can’t be automatically enforced. If one wants to design clear rules, one has to go with either squashedcase or snake_case.
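To make the "automatically enforced" part concrete, here is a minimal sketch of what a single-style check could look like. The helper names `is_snake_case` and `is_squashedcase` and both regular expressions are assumptions made for illustration; they are not taken from any existing Julia linter.

```julia
# Hypothetical helpers; not taken from any existing style checker.

# Lowercase words separated by single underscores, optional trailing `!`.
is_snake_case(name::AbstractString) =
    occursin(r"^[a-z][a-z0-9]*(_[a-z0-9]+)*!?$", name)

# Lowercase with no underscores at all, optional trailing `!`.
is_squashedcase(name::AbstractString) =
    occursin(r"^[a-z][a-z0-9]*!?$", name)

for name in ["install_kernel", "installkernel", "flush_all", "loadResources"]
    println(rpad(name, 16), " snake: ", is_snake_case(name),
            "  squashed: ", is_squashedcase(name))
end
```

Note that a pattern check alone accepts `installkernel` as valid snake_case, because it cannot know that the name contains two words. Catching that would need a word list on top of the pattern.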

And although I honestly do like the simplicity of isnull, isdir, etc., the problem with these is that they don’t scale. I happily started to move the Genie API to squashedcase and was very happy with newapp and even newmodel, but I was horrified by the idea of loadresources and setupwindowsbinfiles.

Obviously, longfunctionnames would be a cognitive nightmare, but it’s not even a problem of length. Names like readfile are OK, while readdir is not so much, because of the dd (if you don’t agree, you’re actually reinforcing my point that this is 100% subjective).

We could come up with extra rules, but they would be too complicated for humans, for checkers, or for both. For example:

  • if the name is longer than 10 letters, use snake_case. Easy for the style checker to enforce, a nightmare for humans.
  • if two joined words lead to the same letter being doubled, use snake_case. Easy for humans, difficult for style checkers (e.g. islessthan, which would require a dictionary to understand the squashed words).
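The difference between the two rules can be sketched in a few lines. The function names `needs_underscores` and `doubled_at_join` are hypothetical, and the second function assumes the caller already knows the word boundaries, which is exactly the information a checker does not have when it only sees the squashed name.

```julia
# Rule 1: purely length-based, so it is trivial to check mechanically.
# Underscores are not counted as letters.
needs_underscores(name::AbstractString; maxlen::Int = 10) =
    length(replace(name, "_" => "")) > maxlen

# Rule 2: a doubled letter where two words meet. The boundary is not visible
# in the squashed name, so the words must be supplied (hypothetical input).
function doubled_at_join(words::Vector{String})
    any(last(a) == first(b) for (a, b) in zip(words, words[2:end]))
end

needs_underscores("readfile")               # false: 8 letters
needs_underscores("setupwindowsbinfiles")   # true: 20 letters
doubled_at_join(["read", "dir"])            # true: "d" meets "d"
doubled_at_join(["read", "file"])           # false
```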

I am reluctant to make a recommendation in order to avoid the classic pitfall of the code styles flame wars. But it seems to me that the only style that works in any situation is snake_case. I also see this style widely used by package authors.


I am not sure we have the power of enforcing anything — people are free to create their own packages and name functions the way they want.

is true because these things are very subjective and require judgement from the programmer. When to separate function names is part of naming, which is part of the API design, which is more of an art than a science. Enforcing a particular way just to resolve a minor ambiguity does not help much: style recommendations can be ambiguous precisely when there are multiple reasonable ways of doing something.

I am not sure about this — when I review questions in the First steps category, this does not stand out as one of the major problems.

So, hopefully, this problem is self-regulating: as packages mature, authors periodically revise the API and make it more consistent.

That said, I think the right solution is to create small, consistent APIs, and document them well — then how snake_case is used is a minor side issue. But good API design is really hard.


A shared style guide is a default best practice for development teams in professional environments. The team lead or the CTO has the power to and does enforce the style guide. Style checkers are extensively used and are part of standard CI funnels.


Sure, in your own team/company/…, you do what you like.

But I was under the impression that you were proposing something for the whole Julia community. Did I misunderstand, and are you looking for guidance on formulating an internal style guide for some team?

I can tell you from personal experience of over a decade of programming in PHP that mixing squashedcase with snake_case is one of the biggest problems for beginners. It leads to wasted time and energy and introduces countless bugs.

One thing I don’t quite understand is why it’s bad if naming is subjective – ok, sure, sometimes you have to live with a name you don’t like as a user, but it shouldn’t ever mean that you can’t find a name.
Every sane autocompletion/search engine shouldn’t particularly care if there’s an underscore somewhere in the function name. And to be fair, that isn’t completely true for the REPL completions, but that just means they could/should be improved.


I would like my coding style to be consistent with Julia’s recommendations, sure. This would ensure a consistent style when combined with Julia’s base and with 3rd party packages. My point is that the current style recommendation doesn’t work with automated parsers.

That being said, obviously, I would prefer it if there were a commonly used style guide, regardless of what that is. Mixing multiple styles between base, 3rd party and my own code in one line does not make for a great coding experience.


The problem is that you’re typing something and you’re not sure if the name of the function is installkernel or install_kernel. Or parseok or parse_ok. Or flushall or flush_all.

In case you’re curious, it’s installkernel, parseok and flush_all and they are all from IJulia (nothing particular about it, just a random example of a package I had installed).
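The underscore-insensitive lookup suggested earlier in the thread could be sketched roughly as follows. `KernelDemo` is a hypothetical stand-in module that only reuses the three names above (it is not IJulia itself), and `squash` and `find_name` are made-up helper names.

```julia
# Hypothetical stand-in module using the three names mentioned above.
module KernelDemo
export installkernel, parseok, flush_all
installkernel() = "installed"
parseok() = true
flush_all() = nothing
end

# Drop underscores so `install_kernel` and `installkernel` compare equal.
squash(name) = replace(string(name), "_" => "")

# Return the exported names of `mod` that match `query`, ignoring underscores.
find_name(mod::Module, query::AbstractString) =
    [n for n in names(mod) if squash(n) == squash(query)]

find_name(KernelDemo, "install_kernel")   # matches :installkernel
find_name(KernelDemo, "flushall")         # matches :flush_all
```

A completion engine built this way would find the function regardless of which spelling the user guessed, though it would not remove the inconsistency from the API itself.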


I’ve got no experience of professional coding environments, but from my limited knowledge visiting Slack, Discourse, and Stack Overflow, beginners’ problems are mostly caused by:

  • not reading the documentation

  • not reading up-to-date documentation

  • not finding the right function or package due to the less than optimal search tools

  • not getting simple things working before trying harder things

  • not using the latest versions of packages

  • expecting coding styles and examples from Python/Matlab etc to just work in Julia

I’ve not seen squashedcase vs snake_case mentioned as one of the biggest issues for Julia beginners though. Is it a major problem?


For what it’s worth, naming in my packages follows this approximate path:
I start with small, short functions with shortish names (stuff like getparameters). As the package grows, I find the need for more complicated names and start using snake_case. Then I try to improve things, find that some of the snake_case functions can be split into multiple parts, and return to my short names.

So while it is a good thing that I am reminded and encouraged to avoid snake_case (because it improves the quality of my code), in some cases it is impossible. Which I think is the gist of what the style guide says: “When necessary, use underscores as word separators”.

If I understood you correctly @essenciary (and btw, awesome news about the new Genie!), you’d suggest (without inflaming a war) a style where all words are separated by underscore? And to retain the pull for “separated functions for separated concepts”, you’d just have a detailed recommendation in the style guide?

I just want to add my two cents as a Python developer. Python devs love PEP 8 (the Style Guide for Python Code on peps.python.org), and “we” treat it more or less as a bible. I’d say most Python devs use this style and take it as it is. To me, there is no real right or wrong; there is just one agreement, which creates uniformity.

In my opinion there are simply too many “good solutions” for a perfect coding style (naming conventions, number of spaces to indent, etc.), and I have never had problems sticking to one (C++, Haskell, Python, you name it).

I personally would love if Julia had something similar to PEP 8, so I don’t have to think about it myself, since all in all, it’s just a style convention. This indeed helps with getting things memorised…


I’ve been thinking about posting a similar concern lately. Not because I think it is a great hole in the ecosystem, but because I often come back to a function after a day of other stuff and think “Why did I put an underscore in there?!” Then I find/replace it with how I feel that day.

I admit it takes less than 20 seconds for me to deal with and is a poor practice that reflects bad planning. However, I have often thought that something like PEP 8 would be nice, as long as we maintain the friendly environment here and don’t berate people for posting poorly formatted code their first time.

Well, I also find this a bit frustrating, and would prefer if there were a lot more underscores in function names in general for consistency, and I think as the language matures, the community will gather around something more consistent. With that said, I think you’re going out on a limb with this statement:

I think the problems associated with squashed / snake case are dwarfed by other issues which can’t easily be captured by style checking. To mention a few:

  • repeated (WET) logic
  • poorly structured code
  • re-implementing things that already exist in libraries
  • premature optimization
  • unnecessarily complicated/verbose code
  • bad choice of data structures
  • mutable structures which should be immutable
  • not making defensive copies where needed
  • not using abstract / concrete types properly
  • not minimizing the scope of variables
  • code prone to race conditions
  • code where internal structures are exposed
  • usage of strings for things like keys where proper types should be used
  • using exceptions for non-exceptional conditions
  • not using proper exception types
  • bad variable naming in general
  • lack of documentation
  • poor unit testing (ok, this one is more easily enforced)

I would put casing towards the bottom of that list in terms of importance and introduction of bugs. To be honest, when developers use casing, or other very easily detected style issues, to determine the quality of a piece of code, I often consider that a sign of a beginner programmer.


you’re not sure if the name of the function is installkernel or install_kernel . Or parseok or parse_ok . Or flushall or flush_all

I found out that in Julia
parseok = parse_ok
flushall = flush_all
Problem solved. You could even automate that in your package build.

Oh - my coat? Thankyou. Taxi for johnh!

Here is the JuMP style guide that we have adopted.

Compared to the Julia style guide we

  • always use snake_case for local variables
  • use ! sparingly
  • begin top-level private functions and constants with _

@cormullion @bennedich While the problems of beginners and how to help them are very important, I’d prefer to discuss them in a separate topic. If the word “beginner” has some special connotation for you, please replace it with “junior developer”. My point, in other words, was this: PHP’s API is a mishmash of squashedcase and snake_case functions, and for the first few years developers keep wondering “is this function with or without an underscore?”. After a couple of years of hands-on experience, (more senior) developers usually just memorize which is which. I wouldn’t want Julia to suffer from the same issue.

To summarize:

  • Julia has a style guide
  • which completely defines the rules for macros and modules
  • but provides vague rules for function names, to the point that the output is indeterminate (one cannot tell how a function will be named by following the guide alone)
  • I have yet to see an argument why not having a deterministic set of rules for naming functions is a good thing

Actually, that’s what I’m doing as I’m “migrating” the API: const foobar = foo_bar. This, however, has the annoying effect that it ruins autocompletion in the REPL - if you type foo\TAB it won’t work as now there are 2 functions starting with foo.

@odow Thanks for sharing. I find myself using the _ prefix as well lately. I’m now starting to realize that having all of the API exposed is a pain when refactoring. Since there’s no way for users to know what is meant to be consumed, I have to assume that any change in the API can break a user’s implementation. Interestingly, it’s yet another problem which plagued the PHP world some 10 years ago, when this convention was the standard before proper private qualifiers were introduced.
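As a rough sketch of that convention, a module can export only what users are meant to call and mark everything else with a leading underscore. The module and function names below (`ResourceDemo`, `load_resources`, `_resource_path`) are hypothetical and chosen only for illustration; the underscore is a naming convention, not something the language enforces.

```julia
module ResourceDemo

export load_resources          # public API: meant for users to rely on

# Leading underscore marks an internal helper that may change without notice.
_resource_path(name) = joinpath("resources", name * ".jl")

load_resources(names) = [_resource_path(n) for n in names]

end # module

using .ResourceDemo
load_resources(["users", "orders"])
```

Users can still reach `ResourceDemo._resource_path` by qualifying the name, so the prefix only signals intent.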

🙂

julia> flush_all_external_handles(x) = sqrt(x);

julia> s = "flush_all_external_handles"; p = split(s, '_'); n = length(p);

julia> eval(Meta.parse("const " * join(join.(collect(Iterators.product(ntuple(i -> [p[i], p[i]*'_'], n-1)..., [p[n]]))[:]), " = ")));

julia> varinfo()
name                            size summary
–––––––––––––––––––––––––– ––––––––– ––––––––––––––––––––––––––––––––––––––
flush_all_external_handles   0 bytes typeof(flush_all_external_handles)
flush_all_externalhandles    0 bytes typeof(flush_all_external_handles)
flush_allexternal_handles    0 bytes typeof(flush_all_external_handles)
flush_allexternalhandles     0 bytes typeof(flush_all_external_handles)
flushall_external_handles    0 bytes typeof(flush_all_external_handles)
flushall_externalhandles     0 bytes typeof(flush_all_external_handles)
flushallexternal_handles     0 bytes typeof(flush_all_external_handles)
flushallexternalhandles      0 bytes typeof(flush_all_external_handles)
n                            8 bytes Int64
p                          304 bytes 4-element Array{SubString{String},1}
s                           34 bytes String

julia> flushall_externalhandles(25)
5.0

@essenciary I agree with you regarding that behaviour in the REPL.
Look at this behaviour too. Let us call a package Mypackage.
Enter help mode by typing ?, then Mypackage: nothing happens.
But type Mypackage. and you get a lot of the functions.
A minor annoyance, but it took me a while to figure that out.

I haven’t decided what I think about “snake_case” vs “squashedcase”, but this I don’t get. Why would anyone need to memorize this? You simply type snake\TAB and you automatically get the correct one.

You will get a dropdown list with two (or more) alternatives.

I don’t see how mixing styles makes any trouble when it comes to remembering anything, because you don’t have to. It is, however, a matter of aesthetics, which I agree is important.


Maybe my Atom editor is not properly configured, but no, I don’t automatically get anything. For example, creating a new .jl file and typing printst triggers no autocomplete and printst\TAB results in a TAB being appended (so no printstyled completed). If you have a better configuration, please share.

Same goes for the REPL: there’s no dropdown. You do get a list of options in the REPL, it’s true, but it still requires extra typing in order to pick the one you want.

But even so, I hope nobody actually considers that it’s a good idea to duplicate function names to provide APIs with both underscores and without. This solution makes sense during a short period of time for deprecations and for migrating the users - but as a standard, just to avoid having an actual standard, not so much.
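For the short deprecation window described here, Julia's `Base.@deprecate` macro can forward an old name to the new one. The following is only a sketch: the module `MigrationDemo` is hypothetical, and `foobar` / `foo_bar` are the placeholder names used earlier in the thread.

```julia
module MigrationDemo

export foo_bar

# New, canonical name.
foo_bar(x) = 2x

# Old name kept temporarily; calls are forwarded to `foo_bar`.
# The trailing `false` asks the macro not to export the old name.
Base.@deprecate foobar(x) foo_bar(x) false

end # module

MigrationDemo.foo_bar(21)
MigrationDemo.foobar(21)   # same result; may warn when Julia runs with --depwarn=yes
```

Because the old name is not exported, it stays out of the way of users who have already migrated, and it can be deleted in a later release.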