There’s nothing automatic about this, of course, but I at least take this convention as a way of forcing myself to think once or twice more, about how I can avoid long ugly compound names, without resorting to underscores.
Perhaps many of those just don’t have the best possible names yet?
I guess you can never have just short, beautiful expressive names, there will always be warts. But by choosing this convention, the devs are at least consciously pushing the language in a particular direction, and I think that the balance of advantage will ultimately prove to be on the positive side.
To clarify, we don’t insist that everyone write Julia code without underscores. It is a discipline, beautifully explained by DNF, to which Base Julia strives to adhere in its public API. In internal functions and in my own code I use underscores all over the place, because the factoring of the code doesn’t matter so much. Not all code needs to be reduced to its most atomic elements of design, but that’s what we aim for in the Base language.
Okay my last post as these kinds of things are always a pit. It has been very helpful for me to understand how people see this, even if I fully disagree. I don’t want to just end with the impression that I have not found this discussion interesting!
The thing I really cannot understand is how removing underscores is seen as somehow making the variables more “atomic” / “minimal” or pure. I see it simply syntactically: I see no “purity” change between is_dense, isdense or isDense; they are simply different ways of separating words in a variable name.
If you move from is_dense to isdense you have not found a more atomic expression of your code, you simply removed an underscore. If you need to use an underscore, don’t cheat, use it. If you somehow find a more atomic name, great! You have found real beauty. Base has done this latter kind of reduction many times, but I think we are getting into a state where we are cheating instead. Separate words should have underscores; removing them does not magically get rid of the need (because you are combining distinct words).
You can do things like multiplying, adding and dividing matrices and vectors without using LinearAlgebra. The things that are exported from LinearAlgebra and not available in Base are things like special matrix types, factorization types, and specialized linear algebra operations like axpy!. Really basic linear algebra operations still don’t require any imports.
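A quick way to check the point above at the REPL. This is a minimal sketch, assuming a Julia 1.x release in which the basic array arithmetic is available without any imports; the variable names are just for illustration.

```julia
# No `using LinearAlgebra` needed for the basics.
A = [2.0 1.0; 1.0 3.0]
x = [1.0, 2.0]

b = A * x        # matrix-vector product
S = A + A        # elementwise addition
H = A / 2        # division by a scalar
z = A \ b        # solve A * z = b

# Specialized operations such as axpy! do need the import.
using LinearAlgebra: axpy!
y = [1.0, 1.0]
axpy!(2.0, x, y) # y is overwritten with 2.0 * x + y
```

Only the last three lines depend on the standard library; everything before them runs in a fresh session.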
The thought was maybe we could standardize on “using underscores consistently”. The trouble with this turns out to be that there are many very well established function names – especially in math– that do not have underscores. This leaves three options:
1. Adopt a policy of using underscores consistently to separate words, but be the only language around that spells abspath as abs_path, dirname as dir_name, basename as base_name, lcfirst as lc_first, linspace as lin_space, logdet as log_det, logabsdet as log_abs_det, nullspace as null_space, ordschur as ord_schur, pinv as p_inv, randn as rand_n, readline as read_line, repmat as rep_mat, sortrows as sort_rows, etc. What about acosd? Should that be spelled a_cos_d? If not, then why not? What exactly constitutes a word?
2. Use underscores except when there’s a strong legacy reason for not doing so. However, this is a pretty subjective judgment call, and leaves the language with a wildly inconsistent experience where – much like English spelling – you have to know the provenance of a name in order to guess how to spell it.
3. Avoid underscores and try to factor APIs as much as possible so they aren’t missed. This is what we do.
I think it’s easy to focus on a few ugly examples of the “no underscores” approach that are not great (e.g. nbytesavailable), but it’s much harder to actually come up with a consistent policy that doesn’t lead to even more absurd cases (criticism is easier than design). Mathematica takes the approach of spelling out absolutely everything, which is very predictable and very characteristic. The Mathematica-esque names of some of the above names would be something like DirectoryName, FileBaseName, LinearSpace, RepeatMatrix, SortRows and so on. That’s a consistent policy, and a respectable choice, but being equally explicit and verbose (with underscores or camelcase) would lead to a very different feeling language than Julia.
Me too! A number of people tried to come up with best practice suggestions / recommendations for naming, indentation, etc. in Julia, at Julia Praxis · GitHub.
I do think the whole “let’s get rid of underscores” movement for v1.0 has gone a bit far.
If something cannot be refactored using multiple dispatch and composition, then readability is still important, and shouldn’t be sacrificed in this manner.
1. First see if the functionality can be put in an easy-to-use API using multiple dispatch and/or keyword arguments, without performance penalty. (The changes in master seem to be shaping up nicely to help out with that.)
2. Otherwise use underscores to provide longer, readable names (sparse_ones instead of spones, for example). (Hopefully, in most cases No. 1 could make this rarely necessary.)
3. Create packages that provide legacy names (Matlabisms such as cumsum, cumprod etc. come to mind), so that legacy names don’t pollute the Base namespace. That would help resolve the problem you identified about the language feeling inconsistent. (It already feels inconsistent; this could improve things.)
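To make the last suggestion concrete, here is a rough sketch of what a legacy-name package could look like. The module name MatlabNames is made up for illustration, and repmat is used as the example because Base’s repeat already covers the same functionality.

```julia
# Hypothetical compatibility module; nothing with this name is assumed to exist.
module MatlabNames

export repmat

# Legacy spelling forwarded to the Base function that does the work.
repmat(A::AbstractArray, m::Integer, n::Integer) = repeat(A, m, n)

end # module

# Users who want the old name opt in explicitly.
tiled = MatlabNames.repmat([1 2; 3 4], 2, 3)
```

The legacy name then lives in its own namespace, and code that never loads the module never sees it.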
Yes!
Because of that, there are names that look ambiguous - isqrt, is that supposed to be a test if something is qrt? (it’s the only one that starts with is that doesn’t mean is something).
Instead, Integer sqrt could be sqrt(Integer, val), for example.
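Something along these lines, as a sketch only. To avoid adding methods to Base.sqrt from outside Base, the example uses a made-up function name, root; it is not an existing or proposed API.

```julia
# Hypothetical function `root`, standing in for the proposed sqrt(Integer, val).
root(x::Real) = sqrt(x)                        # ordinary floating-point square root
root(::Type{Integer}, x::Integer) = isqrt(x)   # integer square root, selected by the type argument

r_float = root(16)          # falls through to sqrt
r_int = root(Integer, 10)   # dispatches to isqrt
```

The type passed as the first argument picks the method, so no separate isqrt-style name is needed at the call site.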
Is it far fetched to have Julia installed with a .juliarc file that imports all the packages in the standard library by default? This will be convenient for completely new users, perhaps even learning programming for the first time in a scientific context, plus the file can be modified by more experienced programmers to suit their own interests if they don’t like all these modules to be automatically imported. I think this will make everyone happy.
Here’s a crazy idea we’re floating in LightGraphs: we have a bunch of functions that start with is_ and has_ (e.g., is_connected, has_edge) that return a boolean. We are considering removing the is_ / has_ and appending a unicode question mark-like glyph (ʔ or ⁇, maybe) so that we have connectedʔ and/or edge⁇ as functions.
The nice thing about this is that since the unicode is at the end, tab-completion would allow users to avoid having to type the characters.
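Roughly what that could look like, as a toy sketch. TinyGraph and its fields are invented here and are not the LightGraphs API; the sketch also assumes the glottal stop ʔ (U+0294) is accepted in identifiers, which it should be since it is a Unicode letter.

```julia
# Toy graph type for illustration only (not LightGraphs).
struct TinyGraph
    edges::Set{Tuple{Int,Int}}
end

# Current style: predicate with a has_ prefix.
has_edge(g::TinyGraph, u::Integer, v::Integer) = (u, v) in g.edges

# Proposed style: same predicate, named with a trailing ʔ.
edgeʔ(g::TinyGraph, u::Integer, v::Integer) = has_edge(g, u, v)

g = TinyGraph(Set([(1, 2), (2, 3)]))
found = edgeʔ(g, 1, 2)
missing_edge = edgeʔ(g, 2, 1)
```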
(The reason we want to do this is that we have a few functions that are very long and can be misinterpreted: has_self_loops becomes hasselfloops which is very close to this.)
Soliciting thoughts on this idea – please provide input.
@anon94023334, I like the idea of ending with a question mark, but it needs to be something with a \symbolname so that people can type it at the REPL: ⍰ \APLboxquestion, or what about U+0294 ʔ \Elzglst Latin Letter Glottal Stop?
Too bad ¿ can’t be used (it’s not an identifier character - at least not currently)!
It’s an ANSI Latin 1 character (i.e. one of the first 256 codepoints of Unicode) (0xbf)
I do prefer is_ and has_ though, quite frankly.
I think what @yuyichao meant was that people will then write their Julia programs expecting the stdlibs to be loaded, which will then break when the global .juliarc file is changed.