Too many ways to do the same thing in Julia

Sure, it gives freedom. But if you are reading someone else’s code, it means you have to learn all the possible ways.
And it may lead to unexpected inconsistencies.

Some examples.

Use the new package manager (using Pkg):

  1. pkg> add DataFrames
  2. julia> pkg"add DataFrames"
  3. julia> Pkg.add("DataFrames")

But:

  1. pkg> add https://github.com/JuliaData/DataFrames.jl.git
  2. julia> pkg"add https://github.com/JuliaData/DataFrames.jl.git"
  3. julia> Pkg.add("https://github.com/JuliaData/DataFrames.jl.git")
    ERROR: https://github.com/JuliaData/DataFrames.jl.git is not a valid packagename

Concatenate strings:

a="one"; b="two":

  1. c = string(a,b)
  2. c ="$a$b"
  3. c = a*b

But a=1; b=2:

  1. c = string(a,b) # "12"
  2. c ="$a$b" # "12"
  3. c = a*b # 2

Create a column vector:

  1. a = [1;2;3]
  2. a = [1,2,3]

Create an empty (zero-elements) array:

  1. a = T[]
  2. a = Array{T,1}()
  3. a = Vector{T}()

So * isn’t the same thing as string in generic code?

Personally I think it’s fine to have some aliases like Vector{T} = Array{T,1}. Most of the other examples come apart in practice. T[] is the same as Vector{T}(), but not once you put any elements in there: the no-element case is just the corner case where they are the same. a = [1;2;3] is the same as a = [1,2,3] only for numbers; a case like a = [[1],[2],[3]] gives , and ; different meanings. Similarly, the Pkg module, the pkg string macro, and the Pkg REPL mode are similar but different entities. In the Pkg REPL mode you get tab completion and other goodies not available from the plain commands, while the function calls are required if you are writing a script that runs package manager commands. The only case I think is actually equivalent here is string(a,b) == "$a$b".
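A minimal sketch of where these spellings diverge, assuming a Julia 1.x session; the variable names are only illustrative:

```julia
# Commas build a vector of the listed elements; semicolons concatenate them.
nested = [[1], [2], [3]]      # a 3-element vector whose elements are vectors
flat   = [[1]; [2]; [3]]      # same as vcat([1], [2], [3])

# T[...] is a typed literal; Vector{T}() only builds the empty vector.
typed_literal = Float64[1, 2]        # elements are converted to Float64
empty_literal = Float64[]
empty_ctor    = Vector{Float64}()

# string(a, b) and interpolation agree whatever the argument types are.
a, b = 1, "two"
joined = string(a, b)
interpolated = "$a$b"
```

Only the empty-array pair and the string/interpolation pair stay interchangeable here; the comma and semicolon forms stop agreeing as soon as the elements are themselves arrays.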

There should be one-- and preferably only one --obvious way to do it.
:slight_smile::slight_smile:


I honestly don’t see why this is problematic. These are different functionalities (even if they share an interface): one adds a package from the registry, one from the repo directly.


I don’t find that problematic. Only that there are three ways to run a package command (issuing the command directly in the pkg mode, using the pkg string macro, using the functions in the Pkg package), but as I showed, these three ways are not fully equivalent, and one has to learn their subtle differences…

This one seems like a bug to be honest (or maybe it’s a compatibility thing).

Well, The Zen of Python is probably one of the reasons why Julia has ended up as a much nicer and more powerful language than Python :slight_smile:

To address your concerns one would either have to remove syntactic sugar, or artificially restrict the language in other ways. Perhaps removing multiple dispatch would help with your * problem?


Unfortunately, we have to wait for 2.0 to do that, as it would be breaking. :smile:


They are not the same thing.

julia> @which [1;2;3]
vcat(X::T...) where T<:Number in Base at abstractarray.jl:1202

julia> @which [1,2,3]
vect(X::T...) where T in Base at array.jl:130

I admit this is frustrating, but it is not a problem with the language; it is a problem of lacking good tutorials.

Hmm… in this specific case I am of the opinion that it would have been better to remove * for strings, as there is already the string() function and interpolation. The added convenience may not be justified by the extra complexity of the language (for example, a=1; b=2; c=a*" "*b behaves differently in Julia 0.6 and 1.0). It’s fine for those of you who are inside the language, but a bit frustrating for newcomers; at least that’s my opinion…
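To make the difference concrete, here is a small sketch. It assumes a Julia 1.x session (where, as far as I know, there is no * method for an integer and a string) and makes no claim about 0.6:

```julia
a, b = 1, 2

# string and interpolation accept any mix of argument types.
via_string = string(a, " ", b)
via_interp = "$a $b"

# * dispatches on the argument types: Int*Int multiplies, String*String concatenates.
product = a * b
concatenated = "one" * " " * "two"

# Mixing an Int and a String is expected to raise a MethodError on Julia 1.x.
mixed_fails = try
    a * " "
    false
catch err
    err isa MethodError
end
```

So string and interpolation are the type-agnostic spellings, while * only concatenates when every operand is already a string or character.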

The [1;2;3] syntax is for concatenation; it is a special case of the vcat shortcut syntax, as in [[1]; [2]; [3]]. This can be convenient if you are writing generic code that concatenates some inputs whether they are scalars, vectors, or a mix. The * syntax is really convenient if you want to concatenate 2 strings without writing the long word string. The ^ works as expected when writing something like "*"^10, which simply makes "**********"; writing that manually would be a pain. The $ is for interpolation, so using it to concatenate 2 strings is your choice, but it is much more general. Also, Julia is not the only language with these functions. You can argue that all programming languages have some sort of redundancy, but each feature really shines in some place where the others don’t, even if only for aesthetic reasons.
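A short sketch of the generic-concatenation point, assuming Julia 1.x; glue is a hypothetical helper written for this example, not something from Base:

```julia
# [x; y] lowers to vcat, so it accepts scalars, vectors, or a mix of both.
glue(x, y) = [x; y]     # hypothetical helper, not part of Base

scalars = glue(1, 2)
mixed   = glue(1, [2, 3])
vectors = glue([1, 2], [3, 4])

# ^ on a string repeats it; * concatenates.
stars = "*"^10
both  = "ab" * "cd"
```

With the comma form, glue(1, [2, 3]) would instead give a two-element vector holding a number and a vector, which is why the semicolon form is the one to reach for in this kind of generic code.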

Finally, T[] is a convenient way of constructing a vector of element type T. But if you want to make a matrix or, more generally, any array, you can call the Array{T, N} constructor directly. Vector{T} is an alias for Array{T,1}, so it really comes for free since we already have a definition for the general Array constructor. I guess what I am getting at here is that each feature has a good reason to be there. Some may be more commonly useful than others, but having them all there enriches the language and gives programmers more options. Just my 2 cents.
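As a rough illustration, assuming Julia 1.x (where uninitialized arrays are requested with undef), the alias and the general constructor look like this:

```julia
# Vector and Matrix are aliases for Array with N fixed to 1 and 2.
same_vector_type = Vector{Int} === Array{Int,1}
same_matrix_type = Matrix{Int} === Array{Int,2}

# T[] covers the empty 1-d case; the Array constructor covers any shape.
empty_vec = Int[]
matrix    = Array{Float64,2}(undef, 2, 3)   # 2×3, contents uninitialized
cube      = Array{Int,3}(undef, 0, 4, 2)    # zero elements, three dimensions
```

The values in matrix are whatever happened to be in memory, so real code would fill it before reading from it.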


No, because you won’t always call string. * is just the standard operator for non-commutative multiplication in algebra. String concatenation fits that (it is associative but not commutative), so I think it’s fine for strings to work in generic functions which are designed for that set of actions.
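One way to picture such a generic function is sketched below; combine is a hypothetical name chosen for the example, and the matrices are only there to show another type where the order of * matters:

```julia
using LinearAlgebra  # standard library; loaded explicitly for the matrix products

# Hypothetical generic helper: folds a collection with *, whatever * means for its elements.
combine(xs) = reduce(*, xs)

numbers  = combine([2, 3, 4])                 # ordinary multiplication
words    = combine(["ab", "cd", "ef"])        # string concatenation
matrices = combine([[1 1; 0 1], [1 0; 1 1]])  # matrix product
swapped  = combine([[1 0; 1 1], [1 1; 0 1]])  # same factors, opposite order
```

The same one-line definition covers all three cases because each element type supplies its own method of *.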


Thinking about the issue in general, I am not sure that avoiding multiple ways to do the same specific thing is a viable design principle for a language with a rich type system, (informal) interfaces tied to generic functions, and multiple dispatch.

For example, think about

z = 1:3
map(x -> x + 1, z)
[x + 1 for x in z]
z .+ 1

which more or less do the “same thing”, but each is subtly different, because it is a special case of a more general interface:

  1. map can operate on multiple collections and provide specific result types, eg map(x -> x + 1, tuple(z...)),
  2. comprehensions can provide multiple dimensions and filtering,
  3. broadcasting is again a generalization along other lines.

Now, if we had fully overlapping designs (eg two ways to broadcast), that would indeed be bad language design, but examples which are special cases of broader but different behaviors aren’t necessarily.
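The three generalizations listed above can be sketched as follows, assuming Julia 1.x; the variable names are illustrative:

```julia
z = 1:3

# map: several collections at once, and the result follows the input container.
pairwise = map(+, z, 4:6)
as_tuple = map(x -> x + 1, (1, 2, 3))

# Comprehensions: extra dimensions and filtering.
table    = [x + y for x in z, y in 1:2]
odd_only = [x + 1 for x in z if isodd(x)]

# Broadcasting: shapes are expanded against each other.
shifted = z .+ 1
grid    = z .+ [10 20]
```

Each line beyond the plain "add one" case uses a capability the other two forms do not offer in the same way.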


I agree that, in general, having more than one obvious way to do things is bad.
I also agree that most of these are things that happen to be identical for those cases,
but are more generally different.

Consider that even Python has the same things, to a rough approximation.

String concatenation

In [20]: a="one"; b="two"

In [22]: a + b
Out[22]: 'onetwo'

In [23]: "%s%s" % (a,b)
Out[23]: 'onetwo'

Those are basically the same operations you highlight for Julia.

Except string.
I kind of think removing the multiple-argument version of string would be good,
so that it only converts its argument to a string representation.
Having it also do concatenation is not great, since that seems unrelated to me.

In [24]: a=1; b=2

In [26]: a + b
Out[26]: 3

In [27]: "%s%s" % (a,b)
Out[27]: '12'

Python even has the same kinda behavior here.

Create a row vector:

(I say row since NumPy is row-major.)

In [30]: np.asanyarray([1,2,3])
Out[30]: array([1, 2, 3])

In [32]: np.block([1,2,3])
Out[32]: array([1, 2, 3])

np.block is actually the equivalent of how [a;b] and [a b; c d] work.
So these really are the direct translations.

Create an Empty array

In [37]: np.empty((0))
Out[37]: array([], dtype=float64)

In [40]: np.ndarray((0))
Out[40]: array([], dtype=float64)


In [38]: np.asanyarray([])
Out[38]: array([], dtype=float64)

I think these are about the same,
though I am not 100% sure about them.
The point is that there are multiple ways.


The first two are equivalent and are supposed to be used interactively; the second is only provided because you don’t always have access to a REPL.

The third one is completely different and is used from within source code. Here you want to be more expressive and not mix up things like the name of the package and a URL, which is why it is more stringent.
This is no different from the difference there might be between the CLI of a library and the API of that library.
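For the functional API, the usual way to say "this is a URL, not a name" is a PackageSpec. The sketch below assumes Julia 1.x with the Pkg standard library; add_from_url is a hypothetical wrapper that is defined but deliberately not called, since calling it would download and install the package:

```julia
using Pkg

url = "https://github.com/JuliaData/DataFrames.jl.git"

# In the functional API a bare string is read as a registered package name,
# so a URL has to be labeled as one through a PackageSpec.
spec = Pkg.PackageSpec(url = url)

# Hypothetical wrapper; calling it would download and install the package.
add_from_url(u) = Pkg.add(Pkg.PackageSpec(url = u))
```

Calling add_from_url(url) in a script would then be the counterpart of the pkg> add https://... command from the first post.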

a + b
"%s%s" % (a, b)
"{}{}".format(a, b)
''.join([a,b])

etc.

and

f"{a}{b}"