I have used pandas in Python to typeset a table in a Jupyter notebook (standard Anaconda installation, nothing fancy). The data consisted of a simple list of dictionaries: using pandas, this resulted in a table with the dictionary keys as the header row and the dictionary values in the rows below.
Question: what is the command for doing the same with data frames in IJulia?
Example of data: sr_org.getQuantities()[0:2]
leading to:
[{'Changeable': 'false', 'Description': 'Initializing temperature in reactor, K', 'Name': 'T', 'Value': None, 'Variability': 'continuous', 'alias': 'noAlias', 'aliasvariable': None}, {'Changeable': 'false', 'Description': 'Initializing concentration of A in reactor, mol/L', 'Name': 'cA', 'Value': None, 'Variability': 'continuous', 'alias': 'noAlias', 'aliasvariable': None}]
This looks as follows in Python/Jupyter (import pandas as pd):
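using DataFrames

# Build a DataFrame from a vector of Dicts with Symbol keys.
# Keys missing from a given Dict are filled with `missing_value`.
function df_from_dicts(arr::AbstractArray; missing_value=missing)
    # Collect the union of all keys; these become the column names.
    cols = Set{Symbol}()
    for di in arr
        union!(cols, keys(di))
    end
    df = DataFrame()
    for col in cols
        # On DataFrames 0.19 and later, write df[!, col] = ... instead.
        df[col] = [get(di, col, missing_value) for di in arr]
    end
    return df
end

df_from_dicts(your_vector_of_dicts)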
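To see the two passes of df_from_dicts in isolation (first the union of all keys, then one column per key with a default for absent entries), here is a minimal sketch that needs no package. The name columns_from_dicts and the sample rows are hypothetical, and it assumes Julia 1.x, where missing is part of Base:

```julia
# Hypothetical helper: same logic as df_from_dicts, but returns a plain
# Dict mapping each column name to its vector of values.
function columns_from_dicts(arr; missing_value=missing)
    colnames = Set{Symbol}()
    for di in arr
        union!(colnames, keys(di))   # first pass: collect every key
    end
    # Second pass: one vector per key, filling gaps with missing_value.
    return Dict(c => [get(di, c, missing_value) for di in arr] for c in colnames)
end

rows = [Dict(:Name => "T", :Value => 1.0), Dict(:Name => "cA")]
cols = columns_from_dicts(rows)
```

The second row has no :Value key, so that column gets a missing entry for it.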
Sorry that my example was a bit terse; I was short on time.
DataFrames in Julia use symbols for column names (Pandas uses strings), so it’s logical to use symbols in the input to df_from_dicts. Technically, Dicts can map from anything to anything.
You mean, in general? AFAIK, symbols in Julia are “interned”, which means that comparing/hashing them is really fast (O(1): like comparing two numbers), whereas string comparison is, I believe, O(N).
You could use df[Symbol(col)] = [get(di, col, missing_value) for di=arr] in the loop above (and use a Set{String}()). As much as possible, I would favour manipulating symbols over strings.
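Another option, if the dictionaries arrive with String keys, is to convert the keys once up front. This is only a sketch: symbolize_keys is a hypothetical helper name, and AbstractDict assumes Julia 0.7 or later:

```julia
# Hypothetical helper: return a copy of `d` whose keys are Symbols.
symbolize_keys(d::AbstractDict) = Dict(Symbol(k) => v for (k, v) in d)

row = Dict("Name" => "T", "Variability" => "continuous")
sym_row = symbolize_keys(row)
```

The converted dictionaries can then be used wherever Symbol keys are expected.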
julia> String(sy4)
"der(T)"
julia> String(sy5)
MethodError: Cannot `convert` an object of type Expr to an object of type String
This may have arisen from a call to the constructor String(...),
since type constructors fall back to convert methods.
Stacktrace:
[1] String(::Expr) at .\sysimg.jl:77
[2] include_string(::String, ::String) at .\loading.jl:522
The quoting syntax :( ... ) returns objects of various types.
Both :a and :(a) return the symbol :a.
:(1) returns an Int with value 1.
:(der(T)) returns an expression, an object of type Expr.
julia> dump(:(der(T)))
Expr
  head: Symbol call
  args: Array{Any}((2,))
    1: Symbol der
    2: Symbol T
  typ: Any
These are exactly the objects that Julia returns when parsing code into a syntax tree.
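The types can be checked directly. A small sketch, assuming Julia 0.7 or later, where the parser is available as Meta.parse:

```julia
# Types produced by the quoting syntax for the examples above.
quoted_types = [typeof(:a), typeof(:(a)), typeof(:(1)), typeof(:(der(T)))]

# Parsing the same text yields the same expression object.
parsed = Meta.parse("der(T)")
same_expr = parsed == :(der(T))
```

Here quoted_types holds Symbol, Symbol, Int and Expr, in that order.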
This
julia> sy4 = Symbol("der(T)")
Symbol("der(T)")
creates a Symbol. Julia displays objects in a form that can be parsed and evaluated to reconstruct them. Since :(der(T)) gives an expression rather than a symbol, Julia displays this Symbol as Symbol("der(T)").
Finally, you can convert an expression to a string:
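A minimal sketch of such a conversion, assuming the lowercase string function from Base is what is meant (unlike the String constructor, it accepts an Expr):

```julia
ex = :(der(T))          # an Expr, not a Symbol
s = string(ex)          # the text of the expression
sym = Symbol(s)         # a Symbol built from that text
```

The resulting Symbol is the same object as Symbol("der(T)"), so it can serve as a column name.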
Both of them are valid Dict keys, but neither is typeset as a table heading in IJulia by the df_from_dicts() function. However, simpler :(...) objects are typeset as table headings…
Yes, it takes a little bit of study. Maybe the best place to start is the “Metaprogramming” section of the Julia manual. In particular, it says
The : character has two syntactic purposes in Julia. The first form creates a Symbol, an interned string used as one building-block of expressions.
This does not seem quite correct to me. The two purposes are really almost the same. For example, :a and :(a) both return the Symbol a, and :1 and :(1) both return the integer 1. It’s just that, if an expression is complicated enough, it must be enclosed in parens in order to be parsed correctly.
Any object can be a Dict key in Julia. But only Symbols are allowed as the names of columns in DataFrames.
A couple of years ago, IIUC, column names were required to be valid identifiers. Apparently it’s possible to use any symbol now.
If you want to be sure you are constructing a Symbol and not an expression, use Symbol("..."). For example, Symbol("a") and Symbol("1") return Symbols, in contrast to the examples using : above.
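The difference is easy to check side by side; the variable names in this short sketch are only illustrative:

```julia
from_colon = (:a, :1)                    # the second element is the integer 1
from_ctor = (Symbol("a"), Symbol("1"))   # both elements are Symbols

colon_types = map(typeof, from_colon)
ctor_types = map(typeof, from_ctor)
```

Only the Symbol("...") form guarantees a Symbol regardless of what the text looks like.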