Hi all
got a string vector
col_names = ["start","interval","goal","num_hedgeHogs"]
I want to create an empty dataframe using col_names as the column names and a data type of Any.
I thought this might do the trick
df_test = DataFrame( Symbol.(col_names))
but it gets me
ERROR: ArgumentError: 'Vector{Symbol}' iterates 'Symbol' values, which doesn't satisfy the Tables.jl `AbstractRow` interface
Stacktrace:
[1] invalidtable(#unused#::Vector{Symbol}, #unused#::Symbol)
@ Tables ~/.julia/packages/Tables/PxO1m/src/tofromdatavalues.jl:42
[2] iterate
@ ~/.julia/packages/Tables/PxO1m/src/tofromdatavalues.jl:48 [inlined]
[3] buildcolumns
@ ~/.julia/packages/Tables/PxO1m/src/fallbacks.jl:197 [inlined]
[4] columns
@ ~/.julia/packages/Tables/PxO1m/src/fallbacks.jl:260 [inlined]
[5] DataFrame(x::Vector{Symbol}; copycols::Nothing)
@ DataFrames ~/.julia/packages/DataFrames/MA4YO/src/other/tables.jl:58
[6] DataFrame(x::Vector{Symbol})
@ DataFrames ~/.julia/packages/DataFrames/MA4YO/src/other/tables.jl:49
[7] top-level scope
@ REPL[43]:1
and
df_test = DataFrame(Any[],Symbol.(col_names))
gets
julia> df_test = DataFrame(Any[],Symbol.(col_names))
ERROR: DimensionMismatch("Number of columns (0) and number of column names (9) are not equal")
Stacktrace:
[1] DataFrame(columns::Vector{AbstractVector}, colindex::DataFrames.Index; copycols::Bool)
@ DataFrames ~/.julia/packages/DataFrames/MA4YO/src/dataframe/dataframe.jl:178
[2] DataFrame(columns::Vector{Any}, cnames::Vector{Symbol}; makeunique::Bool, copycols::Bool)
@ DataFrames ~/.julia/packages/DataFrames/MA4YO/src/dataframe/dataframe.jl:322
[3] DataFrame(columns::Vector{Any}, cnames::Vector{Symbol})
@ DataFrames ~/.julia/packages/DataFrames/MA4YO/src/dataframe/dataframe.jl:319
[4] top-level scope
@ REPL[5]:1
julia>
Any help appreciated.
You would need
DataFrame([Any[] for i in 1:length(col_names)], col_names)
This has the unfortunate consequence of making your columns of type Any, which is not ideal.
First of all, thank you for the speedy reply. I want to set myself a decent foundation in Julia, so if there is another approach which is more rational I am more than happy to listen.
Secondly, I understand that defining all the columns as Any is a bad move, but how do I stipulate the type of the columns? So in my case (for example)
"start" is a timestamp
"interval" is an Int64
"goal" is a string
"num_hedgeHogs" is an integer.
You can declare empty arrays of a type like String[], or Int[] or DateTime[].
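As a rough sketch of what that looks like in practice (the variable names and sample values below are made up for illustration, and DateTime comes from the Dates standard library):

```julia
using Dates  # standard library; provides DateTime

# Empty vectors whose element type is fixed up front.
starts    = DateTime[]
intervals = Int[]
goals     = String[]

# Values of the declared type can be appended later.
push!(starts, DateTime(2021, 6, 1, 12, 0))
push!(intervals, 15)
push!(goals, "feed the hedgehogs")

# A value of the wrong type is rejected rather than silently stored.
rejected = try
    push!(intervals, "fifteen")
    false
catch err
    err isa MethodError
end
```

This is the main practical gain over Any[]: the element type is checked when you add data.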
But in my case I am creating the dataframe from a string vector of column names. How would I define the column types using
DataFrame([Any[] for i in 1:length(col_names)], col_names)
Is there a better approach?
First, you can take advantage of the fact that the DataFrame constructor copies columns by default, so you can write:
DataFrame(col_names .=> Ref([]))
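To see why the copying matters, here is a small sketch in plain Julia (no DataFrames needed) of what the broadcast itself produces. The names pairs_any and shared are only illustrative:

```julia
col_names = ["start", "interval", "goal", "num_hedgeHogs"]

# Ref([]) is treated as a scalar by broadcasting, so every name is paired
# with the very same empty Vector{Any}.
pairs_any = col_names .=> Ref([])

# All four pairs point at one shared vector object.
shared = all(p -> p.second === pairs_any[1].second, pairs_any)
```

Here shared is true, so without the default column copying all columns would be the same vector.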
Now if you want different element types for columns you can write e.g.:
DataFrame(col_names .=> [T[] for T in [Int, String, Bool, Char]])
or (via a small hack):
DataFrame(col_names .=> rand.([Int, String, Bool, Char], 0))
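Applied to the column names from the question, a sketch might look like the following. The col_types vector is my reading of the types described earlier in the thread (timestamp, Int64, string, integer) and the pushed row is invented sample data, so treat both as assumptions:

```julia
using DataFrames
using Dates  # standard library; provides DateTime

col_names = ["start", "interval", "goal", "num_hedgeHogs"]
# Assumed mapping of names to types, based on the description in the question.
col_types = [DateTime, Int, String, Int]

# One empty, typed vector per column name.
df = DataFrame(col_names .=> [T[] for T in col_types])

# Rows can then be added positionally; the values here are made up.
push!(df, (DateTime(2021, 6, 1, 12, 0), 15, "feed", 3))
```

Checking eltype.(eachcol(df)) afterwards is a quick way to confirm each column got the intended type.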
thanks for the reply. As always lashings to build on. I didn’t know anything about Ref( ) so that’s something to look into.
Regarding
DataFrame(col_names .=> rand.([Int, String, Bool, Char], 0))
Is this a trap? To my noob eye this seems to randomize the allocation of a type to a column. Admittedly this would add a bit of fun to the process but…
anon69491625:
Is this a trap
No, it is not a trap and it is fully non-random (that is why I say it is a hack). We create 0-length vectors of types specified by the first argument:
julia> rand.([Int, String, Bool, Char], 0)
4-element Vector{Vector}:
Int64[]
String[]
Bool[]
Char[]
The trick is that, since the length of every vector is 0, it works even for data types that are not supported by the random number generator.
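A minimal way to check this yourself, using only Base Julia (the variable names are just for illustration):

```julia
# String has no random sampler in Base, yet asking for zero elements succeeds
# because no element is ever generated.
empty_strings = rand(String, 0)

# Asking for one element does have to sample a value, and that fails.
one_fails = try
    rand(String, 1)
    false
catch
    true
end
```

So the call only ever allocates an empty, correctly typed vector; nothing random happens.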
just checking after the “get us a bubble for the spirit level” incident
thanks again for the considered reply and the feast of food for thought.