Naming: Remove all underscores no matter what?


#1

Looking over the latest 1.0 milestone, I see that the underscore apocalypse in Base is beginning to build steam. Am I correct in thinking that Julia now wants everything to just be mashed together? Length doesn’t seem to be a real issue anymore, as we have beginindex and nbytesavailable coming. I increasingly find names in Julia really hard to read, and I am resigning myself to getting used to it and updating my own code so it at least feels consistent. Is this how the rest of the ecosystem is going? I haven’t checked in a while.

The aesthetics of 1.0 are bumming me out; there is going to be a lot of using <basic science stuff> and weird names :frowning:


#2

I love it! For the most part I find the mashed-together names readable (multiple dispatch is amenable to short names, so this usually works pretty well) and everything looks so nice and succinct. I like how relatively small the gap is between writing things symbolically and writing them in Julia. I am a little nervous about how many using statements we’ll need, but I haven’t tried it out yet. I think as long as all the basic math is visible by default I’ll be ok. (Definitely do not want Julia to be one of those languages that needs some sort of using Math to do sin(π).)

Also, for what it’s worth, there weren’t that many underscores before 0.7, at least not that I was seeing.


#3

nbytesavailable is a terrible name but it’s a pretty obscure function that very few people will use. Fwiw, I preferred the name bytesavailable but got overruled by people wanting that n.

The name beginindex is also not great, but the function is mostly not for public consumption. It will typically be accessed by writing something like v[begin+1:end-1] which will lower to

v[beginindex(v)+1:endindex(v)-1]

Of course, some generic code will want to call the function, and if you want to extend it, you’ll need to use the name. There was discussion of calling them firstindex and lastindex, but it seems odd for those to be the names for begin and end in indexing expressions. We could call it beginof to match the old name endof.
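For reference, here is a minimal sketch of what extending these index functions looks like. It assumes the names that Julia 1.0 actually shipped with, firstindex and lastindex, rather than the beginindex/endindex spelling discussed above. ZeroBased is a hypothetical wrapper type invented purely for illustration.

```julia
# Hypothetical wrapper type whose valid indices run from 0 to length-1.
struct ZeroBased{T}
    data::Vector{T}
end

# Extending the index functions requires spelling out their names.
Base.firstindex(z::ZeroBased) = 0
Base.lastindex(z::ZeroBased) = length(z.data) - 1

# Scalar and range indexing, shifted by one onto the wrapped vector.
Base.getindex(z::ZeroBased, i::Integer) = z.data[i + 1]
Base.getindex(z::ZeroBased, r::AbstractUnitRange) = [z[i] for i in r]

z = ZeroBased([10, 20, 30, 40, 50])

# `end` inside brackets lowers to lastindex(z).
last_item = z[end]

# Explicit form of what a begin/end range expression lowers to.
inner = z[firstindex(z)+1:lastindex(z)-1]
```

Most user code only ever writes the bracket form; the function names matter when defining a new indexable type like the one sketched here.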

The splitting of Base Julia into standard library packages is absolutely necessary for the language not to collapse under its own weight or stagnate. You’ll note that what’s left in Base now is much more like what other programming languages provide instead of the extreme kitchen sink approach we started out with. The standard library now includes the following:

  • Base64
  • CRC32c
  • Dates
  • DelimitedFiles
  • Distributed
  • FileWatching
  • Future
  • IterativeEigensolvers
  • Libdl
  • LinearAlgebra
  • Logging
  • Mmap
  • Printf
  • Profile
  • REPL
  • Random
  • Serialization
  • SharedArrays
  • SparseArrays
  • SuiteSparse
  • Test
  • Unicode

Which of these do you feel is really essential to have available without doing any import at all?

As a matter of historical interest, there was a period of time where sticking code into the JuliaLang/julia contrib extras directory was how packages were developed. It may have been a mild inconvenience when we introduced a package manager to split that code out, but it’s clearly something we needed to do. This is not fundamentally any different.


#4

The only thing on this list that scares me is LinearAlgebra. Linear algebra is basic math; if it were up to me, I’d never hide something basic like eig behind a using LinearAlgebra. On the other hand, we don’t have the problem that some other languages have where we need to import operators such as * and + and (oh god, we don’t have to import do we?) so I guess it’s not that bad.

I’d also personally like at least rand to be available without using Random.

There’s no way of objectively justifying any of this, because there’s nothing stopping you from doing using; this is just how I feel about it.
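A small sketch of where that boundary sits. It assumes the standard library layout as released in Julia 1.0, where rand and the arithmetic operators stay in Base and eigenvalues come from eigvals in LinearAlgebra (eig itself was replaced by eigen and eigvals). The matrix is made up for illustration.

```julia
# Available without any `using`: arithmetic operators and rand.
A = [2.0 0.0; 0.0 3.0]
B = A * A + A
x = rand()          # uniform Float64 in [0, 1)

# Eigenvalue routines need the stdlib package.
using LinearAlgebra
vals = eigvals(A)   # eigenvalues of the diagonal matrix A
```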


#5

You can.


#6

Oh that makes me so happy! :smile:


#7

Thanks so much for this. At times I struggle to understand the changes that happen on GitHub, but this makes a lot more sense to me now. Looking at that list of possible imports is far less scary than I had made it out to be in my mind! I still want more _ but I feel less like the choices are being made in an arbitrary way just to remove spaces between words :slight_smile:


#8

Removing underscores represents an aesthetic preference that, to the decision-makers, is more important than readability.

My approach is to use underscores except when there is a direct analogue in Base (like the is* functions).


#9

I dispute whether this is really the case. I certainly agree that you wouldn’t want stuff like besselfunctionofthesecondkind(a, x), but there’s nothing in Base that looks like that, which is the whole point. Multiple dispatch is particularly conducive to short function names, hence the huge library of methods in Base, the vast majority of which have names of no more than 2 words. I agree that particularly verbose function names, when necessary, should have underscores.


#10

I do not understand why the Julia core team decided to sacrifice readability for beauty. Beautiful code is important to me, but not separating words, with either camel case or underscores, sacrifices readability. I don’t understand this decision. :cry:


#11

OT X-ref for the interested: Run the half-life of code analysis on Julia?


#12

I look at it differently. I don’t think it reflects a preference for mashing words together – instead it is a convention that encourages you to consider how you name identifiers in your code, and, more generally, how you organize the concepts you’re working with.

Strive to choose concise names, not too short, not too long. If you need underscores, it most likely means that you can work harder to find a better name, or perhaps that you are mixing together two concepts that should be separated.

It reminds me of the principle (which has been mentioned many times) of implementing as much of the Julia language as possible in Julia itself, even when dropping down to C or something else would be easier and more efficient. It sounds like a potential handicap, since Julia could be faster with more C code in it, but in the long run it’s better. Analogously, discouraging underscores forces you to work harder and do better.

I think that if you stick to the convention, and put in the extra work, your code will in the end be more readable.


#13

I don’t think that ANY language can become more readable by omitting spaces and putting everything in one word. In science, you have subscripts and superscripts. You don’t write everything in one word. While programming, having at least subscripts (by using an underscore) is essential for me. Nothing improves if you remove this degree of freedom.


#14

Nobody is forcing you to name your own functions in any particular way; you can name them however you like. I hope this helps?


#15

But that’s the point. Don’t put everything in one word.

In science, you have single-letter identifiers, mostly. You need a lot of subscripts to make up for that.


#16

I think Julia is really the wrong language to attack when it comes to degrees of freedom when writing code:


#17

I don’t think anybody is saying that you shouldn’t add underscores to function names that you feel are too verbose; you can and should do so. The point is that the vast majority of what’s in Base are short names like isbits, randstring, eigvals. This occurs precisely because Julia has multiple dispatch as a core paradigm. The argument is that these types of short names are better (and, in my opinion at least, more readable) if they stay short. If you have a variable with a subscript, by all means, add it as either a Unicode subscript or an underscore. I’m sure there are a few cases where the devs got overzealous for the sake of consistency, but for the vast majority of what’s in Base I really don’t see readability as being an issue.


#18

At the risk of fanning the flames that I have started: if the convention were to ALWAYS put underscores between the words of a multi-word name, why would that not also lead people to choose names composed of fewer words? If the name is find_shortest_path instead of findshortestpath, are you saying that because the latter is far harder to read it leads to better names? Do people make API choices simply because they are scared of spaces and need to be cruel to end users?

Generally I do find that the core devs are careful with the names they pick, but as a rule the current naming conventions and the taste for removing all underscores lead to a real ad hoc mix of really nice short names, mashed-together names of varying readability, and names with underscores. We are not just moving towards the great short names, but will ultimately have all 3, which I think is the worst outcome.


#19

Why is multiple dispatch related to this? I would bet that most of the names we are dealing with effectively use single dispatch, and as such are really no different from most OO languages (which as a rule have longer names) or from languages with no real dispatch, such as MATLAB, where many of these mashed-together names come from. I would argue that mashing as a taste simply comes from the DOS-like days when we had limited characters, not because the lack of underscores makes anything more readable.

In my experience, most newer languages (Java onward) avoid this kind of artificial constraint. I think it is simply that people hate underscores; you would never have this issue with camelCase, because people are happy to change the capitalization as long as they don’t have to add a character.


#20

Because what a function does is specified not only by its name but also by the types of its arguments. The opposite extreme would be Python, where function arguments are a free-for-all and you need incredibly verbose names. The intermediate ground would be Java or C++, where you have some dispatch and most functions belong to a class.
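A minimal sketch of that point, using hypothetical shape types and function names that are not from Base. One short name, area, covers every case because the argument types select the method, so there is no need for something like circle_area or rect_area.

```julia
# Hypothetical types for illustration only.
struct Circle
    r::Float64
end

struct Rect
    w::Float64
    h::Float64
end

# One short generic function; the argument type picks the method.
area(c::Circle) = pi * c.r^2
area(r::Rect) = r.w * r.h

# Dispatch on two arguments: the same name again, no type-encoding suffix.
fits(inner::Circle, outer::Rect) = 2 * inner.r <= min(outer.w, outer.h)
fits(inner::Rect, outer::Rect) = inner.w <= outer.w && inner.h <= outer.h

shapes = [Circle(1.0), Rect(2.0, 3.0)]
total = sum(area, shapes)
```

Whether this keeps names readable once they grow past two words is exactly what the thread is arguing about; the sketch only shows why the types can carry part of the meaning.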