Suppose I have a DataFrame with two fields: idx and date. The date field has missing values (in the DataFrames sense) and is currently stored in the DataFrame as a string. Is there a query statement I can write that parses the string into a date? I tried something like this:
df2 = @from i in df begin
    @select {i.idx, date = Date.(i.date, "mm/dd/yyyy")}
    @collect DataFrame
end
but got an error like this:
ERROR: type UnionAll has no field parameters
Stacktrace:
[1] column_types at /Users/tcovert/.julia/v0.6/IterableTables/src/utilities.jl:20 [inlined]
[2] _DataFrame(::Query.EnumerableSelect{NamedTuples._NT_idx_date{DataValues.DataValue{Int64},_} where _,Query.EnumerableIterable{NamedTuples._NT_idx_date{DataValues.DataValue{Int64},DataValues.DataValue{String}},IterableTables.DataFrameIterator{NamedTuples._NT_idx_date{DataValues.DataValue{Int64},DataValues.DataValue{String}},Tuple{DataArrays.DataArray{Int64,1},DataArrays.DataArray{String,1}}}},##11#13}) at /Users/tcovert/.julia/v0.6/IterableTables/src/integrations/dataframes.jl:105
[3] DataFrames.DataFrame(::Query.EnumerableSelect{NamedTuples._NT_idx_date{DataValues.DataValue{Int64},_} where _,Query.EnumerableIterable{NamedTuples._NT_idx_date{DataValues.DataValue{Int64},DataValues.DataValue{String}},IterableTables.DataFrameIterator{NamedTuples._NT_idx_date{DataValues.DataValue{Int64},DataValues.DataValue{String}},Tuple{DataArrays.DataArray{Int64,1},DataArrays.DataArray{String,1}}}},##11#13}) at /Users/tcovert/.julia/v0.6/IterableTables/src/integrations/dataframes.jl:128
[4] collect(::Query.EnumerableSelect{NamedTuples._NT_idx_date{DataValues.DataValue{Int64},_} where _,Query.EnumerableIterable{NamedTuples._NT_idx_date{DataValues.DataValue{Int64},DataValues.DataValue{String}},IterableTables.DataFrameIterator{NamedTuples._NT_idx_date{DataValues.DataValue{Int64},DataValues.DataValue{String}},Tuple{DataArrays.DataArray{Int64,1},DataArrays.DataArray{String,1}}}},##11#13}, ::Type{DataFrames.DataFrame}) at /Users/tcovert/.julia/v0.6/Query/src/sinks/sink_type.jl:2
I also tried a version with no dot-broadcasting:
df2 = @from i in df begin
    @select {i.idx, date = Date(i.date, "mm/dd/yyyy")}
    @collect DataFrame
end
and got this error:
ERROR: MethodError: Cannot `convert` an object of type DataValues.DataValue{String} to an object of type Int64
This may have arisen from a call to the constructor Int64(...),
since type constructors fall back to convert methods.
Stacktrace:
[1] next at /Users/tcovert/.julia/v0.6/Query/src/enumerable/enumerable_select.jl:41 [inlined]
[2] macro expansion at /Users/tcovert/.julia/v0.6/IterableTables/src/integrations/dataframes.jl:91 [inlined]
[3] _filldf(::Tuple{DataArrays.DataArray{Int64,1},Array{Date,1}}, ::Query.EnumerableSelect{NamedTuples._NT_idx_date{DataValues.DataValue{Int64},Date},Query.EnumerableIterable{NamedTuples._NT_idx_date{DataValues.DataValue{Int64},DataValues.DataValue{String}},IterableTables.DataFrameIterator{NamedTuples._NT_idx_date{DataValues.DataValue{Int64},DataValues.DataValue{String}},Tuple{DataArrays.DataArray{Int64,1},DataArrays.DataArray{String,1}}}},##15#16}) at /Users/tcovert/.julia/v0.6/IterableTables/src/integrations/dataframes.jl:79
[4] _DataFrame(::Query.EnumerableSelect{NamedTuples._NT_idx_date{DataValues.DataValue{Int64},Date},Query.EnumerableIterable{NamedTuples._NT_idx_date{DataValues.DataValue{Int64},DataValues.DataValue{String}},IterableTables.DataFrameIterator{NamedTuples._NT_idx_date{DataValues.DataValue{Int64},DataValues.DataValue{String}},Tuple{DataArrays.DataArray{Int64,1},DataArrays.DataArray{String,1}}}},##15#16}) at /Users/tcovert/.julia/v0.6/IterableTables/src/integrations/dataframes.jl:119
[5] DataFrames.DataFrame(::Query.EnumerableSelect{NamedTuples._NT_idx_date{DataValues.DataValue{Int64},Date},Query.EnumerableIterable{NamedTuples._NT_idx_date{DataValues.DataValue{Int64},DataValues.DataValue{String}},IterableTables.DataFrameIterator{NamedTuples._NT_idx_date{DataValues.DataValue{Int64},DataValues.DataValue{String}},Tuple{DataArrays.DataArray{Int64,1},DataArrays.DataArray{String,1}}}},##15#16}) at /Users/tcovert/.julia/v0.6/IterableTables/src/integrations/dataframes.jl:128
[6] collect(::Query.EnumerableSelect{NamedTuples._NT_idx_date{DataValues.DataValue{Int64},Date},Query.EnumerableIterable{NamedTuples._NT_idx_date{DataValues.DataValue{Int64},DataValues.DataValue{String}},IterableTables.DataFrameIterator{NamedTuples._NT_idx_date{DataValues.DataValue{Int64},DataValues.DataValue{String}},Tuple{DataArrays.DataArray{Int64,1},DataArrays.DataArray{String,1}}}},##15#16}, ::Type{DataFrames.DataFrame}) at /Users/tcovert/.julia/v0.6/Query/src/sinks/sink_type.jl:2
Is what I am trying to do possible? If so, what am I doing wrong?
Thanks in advance for any suggestions you can offer.
Here is some example data to apply the code above to: query_example.csv (on Dropbox)
(also posted here: https://github.com/davidanthoff/Query.jl/issues/134)