Why does Julia prefix column names and colors with a colon :?

Juan · December 1, 2018, 12:42am

Why does Julia prefix variables and colors with a colon : ?

For example:
iris[:SepalWidth]

plot!(collect(1:10), rand(10), color=:red, label="red")

alejandromerchan · December 1, 2018, 12:45am

That’s the representation for symbols. See Metaprogramming · The Julia Language

kristoffer.carlsson · December 1, 2018, 12:47am

Just writing red would mean the variable red.
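A minimal sketch of the difference, assuming a fresh session (the value 42 is arbitrary and only for illustration):

```julia
# A symbol is a value in its own right; a bare name is a variable lookup.
s = :red
@show typeof(s)            # Symbol
@show s == Symbol("red")   # true

red = 42                   # `red` is now a variable bound to an Int
@show red                  # 42, unrelated to the symbol :red
@show :red == red          # false
```

So `color=:red` passes the symbol itself, while `color=red` would pass whatever the variable `red` happens to hold.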

Juan · December 1, 2018, 1:40am

And what would this mean:

iris[SepalWidth]

PetrKryslUCSD · December 1, 2018, 2:06am

That depends on whether SepalWidth is defined or not. It would work if it was a variable with value

SepalWidth = :SepalWidth

You can think of symbols as convenient values that have a special role in Julia programs. Presumably here iris is a dictionary with keys that are symbols.
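As a rough illustration, here is that idea with a plain Dict standing in for iris (an assumption; the real iris is a DataFrame, and only the first four SepalWidth values shown later in this thread are used):

```julia
# Hypothetical stand-in for `iris`: a Dict whose keys are symbols.
iris = Dict(:SepalWidth => [3.5, 3.0, 3.2, 3.1])

by_symbol = iris[:SepalWidth]    # index with the symbol literal

SepalWidth = :SepalWidth         # a variable whose value is that symbol
by_variable = iris[SepalWidth]   # same lookup, going through the variable

@show by_symbol == by_variable   # true
```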

stevengj · December 1, 2018, 2:08am

See also: Symbol (programming) - Wikipedia

Juan · December 1, 2018, 4:57pm

I meant if it’s not a variable but a column name.

PetrKryslUCSD · December 1, 2018, 5:00pm

If the item (SepalWidth) isn’t recognized as a value (such as integer, string, or a symbol), it will be looked up as a variable name. If the variable doesn’t exist, an error results.
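A small sketch of that error, assuming a fresh session in which no variable named SepalWidth exists (the Dict is again only a stand-in for iris):

```julia
# Stand-in container; the error occurs before indexing is even attempted.
iris = Dict(:SepalWidth => [3.5, 3.0, 3.2, 3.1])

# `SepalWidth` without a colon is looked up as a variable, which is undefined here.
err = try
    iris[SepalWidth]
catch e
    e
end

@show err isa UndefVarError   # true
```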

StefanKarpinski · December 1, 2018, 7:29pm

Note that you can also write iris.SepalWidth:

julia> iris.SepalWidth
150-element Array{Float64,1}:
 3.5
 3.0
 3.2
 3.1
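The dot syntax is itself built on symbols: `a.b` is lowered to `getproperty(a, :b)`. A sketch using a NamedTuple of columns as a stand-in for the DataFrame (an assumption made so the example runs without any packages):

```julia
# Stand-in for a table: a NamedTuple holding one column vector.
iris = (SepalWidth = [3.5, 3.0, 3.2, 3.1],)

via_dot      = iris.SepalWidth                 # dot syntax
via_property = getproperty(iris, :SepalWidth)  # what the dot syntax lowers to
via_index    = iris[:SepalWidth]               # NamedTuples also accept symbol indexing

@show via_dot === via_property   # true
@show via_dot === via_index      # true
```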

y4lu · December 2, 2018, 8:18am

They are actually a nice feature. It’s probably a bit unfortunate that DataFrames doesn’t also let you use the more ordinary indexing style with Ints just as easily. I’m pretty sure that is still the case, so it’s tempting to guess it was a design decision.

xplot(x,y, c, l) = plot!(x, y, color=Symbol(c), label=String(l));
xplot(collect(1:10), rand(10), "red", :red)
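The two conversions used inside xplot can be checked on their own, without loading a plotting package:

```julia
c = "red"    # what xplot receives as `c`
l = :red     # what xplot receives as `l`

color = Symbol(c)   # String -> Symbol, gives :red
label = String(l)   # Symbol -> String, gives "red"

@show color label
```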

Tamas_Papp · December 2, 2018, 8:22am

Since column positions can be pretty accidental, I don’t think it is robust practice to use them for indexing in a dataframe.

y4lu · December 2, 2018, 8:31am

That is true, but you could still easily opt in by using symbols, and have Ints as a fallback.
Edit: ?DataFrame does have a few examples with Integer indexing.

Tamas_Papp · December 2, 2018, 9:20am

Note that the claim that DataFrames doesn’t let you index with Ints is not correct: it works just fine, see the relevant methods in the source.

I was merely pointing out that while it works for DataFrames, I don’t think it is good practice for working with data. Merely regenerating data with a slightly different column order or an extra column can break code very easily.
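A sketch of that failure mode, using NamedTuples of columns as a stand-in for DataFrames (the RowId column is made up for illustration):

```julia
# Original data: SepalWidth happens to be the first column.
old = (SepalWidth = [3.5, 3.0],)

# Regenerated data: an extra column now comes first.
new = (RowId = [1, 2], SepalWidth = [3.5, 3.0])

@show old[1] == new[1]                      # false: position 1 now means RowId
@show old[:SepalWidth] == new[:SepalWidth]  # true: the name still finds the same column
```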

y4lu · December 2, 2018, 9:38am

Yes, I just spotted that, and it’s mostly normal too. df[1] and df[:,1] still behave a little differently (or are subject to change?) compared to operations on df2 = reduce(hcat, getfields(df, :columns))

Tamas_Papp · December 2, 2018, 9:43am

I don’t quite understand what you mean here.

I don’t know what getfields is. Did you misspell getfield? In any case, there is a columns accessor.

y4lu · December 2, 2018, 9:49am

That would be correct

There was a df.columns; I’m not sure if there are others.

df[:,1] gives a copy warning, and df2[1] is scalar

y4lu · December 2, 2018, 10:09am

So how do you do matrix multiplication?

Tamas_Papp · December 2, 2018, 10:41am

I mean columns(::DataFrame), which is exported and part of the API. You should avoid accessing fields of a composite type unless that is explicitly declared as the way to work with them.

Generally, for e.g. OLS, you form a design matrix. That is not a direct hcat of the columns though, e.g. for categorical variables, etc.
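A hand-rolled sketch of why a design matrix differs from a plain hcat, assuming one numeric column and one categorical column with made-up values; this is only an illustration of dummy coding, not how any particular package builds it:

```julia
# Made-up data: one numeric predictor and one categorical predictor.
x       = [3.5, 3.0, 3.2, 3.1]
species = ["a", "b", "a", "c"]

# Dummy-code the categorical column, dropping the first level as the baseline.
lvls    = sort(unique(species))                            # ["a", "b", "c"]
dummies = [s == l for s in species, l in lvls[2:end]]      # 4×2 Bool matrix

# Intercept, numeric column, and indicator columns side by side.
X = hcat(ones(length(x)), x, dummies)

@show size(X)   # (4, 4)
```

The categorical column contributes one indicator column per non-baseline level, so the result has more columns than the original data.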

y4lu · December 2, 2018, 11:03am

This one? https://juliastats.github.io/StatsModels.jl/latest/formula.html

I think I can more or less sum up my view as being ‘Documentation is great when you need it, but not needing it is still kind of preferable’, fwiw.
I’m quite probably terrible at API design myself though =) I’m singling out DataFrames because it’s commonly one of the first packages people might use, and that is when user friendliness really either shines or hurts.

y4lu · December 3, 2018, 5:30am

columns() is in the unexported names on DataFrames v0.14.1
