I understand that I need to use the : operator to represent a variable in an unevaluated expression in Julia. But I am confused as to why I am using : to get a column name here; the column name does not appear to be an unevaluated expression.
using RDatasets, MLJ
iris = dataset("datasets", "iris")
iris2 = coerce(iris, :Species=> OrderedFactor) # This line
Is it possible to refer to a column name here without using an unevaluated expression?
: doesn’t create an unevaluated expression, it creates a Symbol, which is a more widely-used type. You can learn more about it here:
The type of object used to represent identifiers in parsed julia code (ASTs). Also often used as a name or label to
identify an entity (e.g. as a dictionary key). Symbols can be entered using the : quote operator:
julia> x = 42
Symbols can also be constructed from strings or other values by calling the constructor Symbol(x...).
Symbols are immutable and should be compared using ===. The implementation re-uses the same object for all Symbols
with the same name, so comparison tends to be efficient (it can just compare pointers).
Unlike strings, Symbols are "atomic" or "scalar" entities that do not support iteration over characters.
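To make the difference between a Symbol and an unevaluated expression concrete, here is a minimal sketch using only Base Julia. The variable names `name` and `ex` are just illustrative choices, not anything from MLJ or RDatasets.

```julia
# Base Julia only; no packages needed.
name = :Species            # quoting a bare identifier gives a Symbol
ex = :(2 + 2)              # quoting a compound expression gives an Expr

println(typeof(name))      # Symbol
println(typeof(ex))        # Expr

# A Symbol is an ordinary value; nothing is evaluated when it is passed around.
println(name === Symbol("Species"))   # true
println(String(name))                 # Species

# An Expr is a tree whose parts can be Symbols (here, the function name +).
println(ex.head)                      # call
println(ex.args[1] === :+)            # true
println(eval(ex))                     # 4
```

So `:Species` on its own is just a lightweight label, much like a string, while `:(2 + 2)` is the thing that actually waits to be evaluated.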
julia> dump(:(2+2))
head: Symbol call
1: Symbol +
2: Int64 2
3: Int64 2
From the last lines it appears that a Symbol can be part of an expression.
Symbols are replaced with the value they refer to when evaluated. But why does the column name need to go through parsing and evaluation? Why can’t we use a string or iris.Species to denote the column name?
I’m not familiar with MLJ. But the reason passing iris.Species doesn’t work comes down to the types involved and the way Julia evaluates expressions.
iris.Species is just a vector. There is no name attached to it; it has completely forgotten that it ever came from a data frame. So the function coerce is given a data frame and then a vector with no name. That’s not a lot to work with.
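Here is a rough sketch of that point using only Base Julia. A NamedTuple of vectors stands in for the data frame, and `transform_column` is a hypothetical helper written for this illustration; it is not MLJ's coerce and says nothing about how MLJ is implemented.

```julia
# Assumption: a NamedTuple of vectors stands in for the data frame.
table = (SepalLength = [5.1, 4.9, 6.3],
         Species = ["setosa", "setosa", "virginica"])

# Extracting a column gives a plain vector; nothing in it records the name "Species".
col = table.Species
println(col isa Vector{String})    # true

# `:Species => uppercase` is evaluated before any call and is an ordinary Pair.
spec = :Species => uppercase
println(spec isa Pair)             # true
println(first(spec))               # Species

# Hypothetical helper: apply a function to the column named by the Symbol in the pair.
function transform_column(tbl::NamedTuple, spec::Pair{Symbol})
    name, f = spec
    newcol = map(f, getproperty(tbl, name))
    # Replace the named column, keeping the others as they are.
    return merge(tbl, NamedTuple{(name,)}((newcol,)))
end

result = transform_column(table, :Species => uppercase)
println(result.Species)            # ["SETOSA", "SETOSA", "VIRGINICA"]
```

The Symbol gives the function a name it can look up in the table, which a bare vector like `table.Species` cannot do.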