Query.@orderby destroys DataFrame

hmmueller · August 2, 2019, 2:05pm

This code does not work:

using DataFrames, Query
df = DataFrame(A=[1,2,3], B=[4,5,6]);
fromdata = @from i in df begin
    @where i.B > 3
    @select {i.A}
    @orderby i.A
    @collect DataFrame
end
println(typeof(fromdata))
println(nrow(fromdata))
println(typeof(fromdata.A))

However, when you comment out the @orderby line (put a # in front of it), it works. Is this intended, and if so, why? (Is an ordered DataFrame no longer a DataFrame?) Or is it a bug?

// Edit: Creating an explicit DataFrame again makes it work:

...
fromdata = DataFrame(@from ... @orderby ... end)
...

// Edit: Rewrote the question so that a naive copy/paste of the code shows the error.

jling · August 2, 2019, 2:14pm

Where are these macros coming from?

hmmueller · August 2, 2019, 2:17pm

Query (I added the using in my question).

cormullion · August 2, 2019, 3:16pm

Odd, it works for me. (DataFrames v0.19.1, Query v0.12.0.)

julia-1.3> using DataFrames, Query

julia-1.3> df = DataFrame(A=[1,2,3], B=[4,5,6]);

julia-1.3> fromdata = @from i in df begin
                      @where i.B > 3
                      @select {i.A}
                      @orderby i.A
                      @collect DataFrame
                   end
3x1 query result
A
─
1
2
3

First suggestion: a restart. I see many world age errors when I make mistakes in Query…

I’m not sure, but is this alternative syntax equivalent?

df |> @filter(_.B > 3) |> @select(:A) |> @orderby(_.A) |> collect |> DataFrame

hmmueller · August 24, 2019, 5:58pm

Thanks. I went down another track (using sort!). Maybe it’s a version problem: I was using Julia 1.1 (and am now on 1.2), and you tried it in 1.3. I also did not check which package versions I use(d).
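
For reference, the sort! route mentioned here can be sketched with DataFrames alone. This is a minimal sketch and not the poster's actual code, which is not shown in the thread; the sample data and the name result are made up for illustration.

```julia
using DataFrames

# Hypothetical sample data: unsorted in A, with one row that fails B > 3.
df = DataFrame(A=[3, 1, 2, 4], B=[6, 4, 5, 2])

# Keep the rows with B > 3 and only column A. Indexing with a Bool mask
# and a vector of column names returns a new DataFrame.
result = df[df.B .> 3, [:A]]

# Sort the new DataFrame in place by column A.
sort!(result, :A)

println(typeof(result))
println(nrow(result))
println(result.A)
```

Because no Query macros are involved, result is a DataFrame throughout, so nrow(result) and result.A behave as usual.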

xin-jin · August 24, 2019, 7:37pm

I just tried your code in 1.2, and it works.

davidanthoff · August 25, 2019, 6:12pm

Yeah, I also can’t replicate that error.

hmmueller · August 25, 2019, 6:21pm

Even with the @orderby line commented in? I get it “reliably” in my Julia 1.1.1 installation (I have not upgraded yet; edit: but let me upgrade, maybe I’ll be happy then!). The output of the first two println()s is:

julia> println(typeof(fromdata))
QueryOperators.EnumerableMap{NamedTuple{(:A,),Tuple{Int64}},QueryOperators.EnumerableFilter{NamedTuple{(:A, :B),Tuple{Int64,Int64}},QueryOperators.EnumerableIterable{NamedTuple{(:A, :B),Tuple{Int64,Int64}},Tables.DataValueRowIterator{NamedTuple{(:A, :B),Tuple{Int64,Int64}},Tables.RowIterator{NamedTuple{(:A, :B),Tuple{Array{Int64,1},Array{Int64,1}}}}}},getfield(Main, Symbol("##2384#2386"))},getfield(Main, Symbol("##2385#2387"))}

julia> println(nrow(fromdata))
ERROR: MethodError: no method matching nrow(::QueryOperators.EnumerableMap{NamedTuple{(:A,),Tuple{Int64}},QueryOperators.EnumerableFilter{NamedTuple{(:A, :B),Tuple{Int64,Int64}},QueryOperators.EnumerableIterable{NamedTuple{(:A, :B),Tuple{Int64,Int64}},Tables.DataValueRowIterator{NamedTuple{(:A, :B),Tuple{Int64,Int64}},Tables.RowIterator{NamedTuple{(:A, :B),Tuple{Array{Int64,1},Array{Int64,1}}}}}},getfield(Main, Symbol("##2384#2386"))},getfield(Main, Symbol("##2385#2387"))})
Closest candidates are:
  nrow(::DataFrame) at C:\Users\h.mueller\.julia\packages\DataFrames\VrZOl\src\dataframe\dataframe.jl:278
  nrow(::SubDataFrame) at C:\Users\h.mueller\.julia\packages\DataFrames\VrZOl\src\subdataframe\subdataframe.jl:114
Stacktrace:
 [1] top-level scope at none:0


hmmueller · August 25, 2019, 6:31pm

No, it also does not work with Julia 1.2.0 (DataFrames is 0.19.2, Query is 0.12.1).

(Edit: I have now rewritten the original question so that the code as shown produces the error. People who copy it without reading the earlier instruction to remove the # will then “hopefully” get the error as well.)
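
A possible explanation, offered as a sketch and not as a confirmed diagnosis: the type printed in the output above is an EnumerableMap wrapped around an EnumerableFilter, with no sorting or collecting step in it, which suggests that the clauses after @select never took effect. The examples in the Query.jl documentation place @orderby before @select, with @select as the last clause ahead of @collect. Assuming that clause order is the cause (nobody in this thread confirms it), the reordered query would look like this:

```julia
using DataFrames, Query

df = DataFrame(A=[1, 2, 3], B=[4, 5, 6])

# Same query as in the question, but with @orderby moved ahead of @select,
# so that @select is the last clause before @collect.
fromdata = @from i in df begin
    @where i.B > 3
    @orderby i.A
    @select {i.A}
    @collect DataFrame
end

println(typeof(fromdata))
println(nrow(fromdata))
println(typeof(fromdata.A))
```

If the assumption holds, typeof(fromdata) should report a DataFrame and nrow(fromdata) should no longer throw a MethodError.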
