Whenever I try to use more than two @let statements in my query I get an error (Julia version 0.6.0, Query.jl 0.6.0).
DataFrame sink:
type TypeofBottom has no field parameters
Array sink:
TypeError: getfield: expected Symbol, got Expr
Code that generates error:
```julia
using DataFrames, Query

df = DataFrame(name=["John", "Sally", "Kirk"], age=[23., 42., 59.], children=[3, 2, 2])

x = @from i in df begin
    @let count = length(i.name)
    @let kids_per_year = i.children / i.age
    @let isjohn = i.name == "John" ? 1 : 0
    @where count > 4
    @select {i.name, Count=count, KPY=kids_per_year}
    @collect DataFrame
end
```
Commenting out the third @let avoids the issue. Any ideas?
Thanks!
This appears to be a bug in Query.jl. AFAICT, the `find_names_to_put_in_scope` expansion of `transparentidentifier` seems to be fairly hairy, but has never been tested with that many variables, and does not normalize them correctly to `a.b` expressions if it recurses. There’s an insufficient amount of comments and whitespace in that code base for me to dig further just now, though.
I recommend opening an issue on Query.jl with your repro bug report. I think it should be enough for David to go on to fix the issue.
Thanks for looking into this, I just posted the issue: #133
Thanks, I’ll try to take a look soon, but it might be a couple of days. This week is a bit hectic with other stuff.
I get the same error with:
```julia
data = @from u in userdata begin
    @join s in screens on u.Screen_Id equals s.screen_id
    @group u by u.Date into a
    @select {a.screen_name, Count=length(a)}
    @collect DataFrame
end
```
`userdata` and `screens` are both `CSV.Source`s with `weakrefstrings` set to false.
I think the main issue here is that `a.screen_name` is not valid in this query. The `@group` clause will create a stream of `Grouping`s, i.e. each `a` will be an instance of `Grouping`. `Grouping` won’t have a field `screen_name`, though. It will have one field `key` that holds the value of `u.Date`, and otherwise it behaves like a vector in which each element is one of the `u` rows that fell into that group.
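As a rough illustration of what a `Grouping` gives you, here is a minimal sketch. It uses the toy `df` from the first post, not the `userdata`/`screens` tables, since their columns are not known here. It assumes the Query.jl version discussed in this thread, where the group key is read as a field; newer releases are believed to expose it through a `key(g)` function instead.

```julia
using DataFrames, Query

# Toy table copied from the first post in this thread (an assumption
# for illustration; it is not the userdata/screens data).
df = DataFrame(name=["John", "Sally", "Kirk"], age=[23., 42., 59.], children=[3, 2, 2])

x = @from i in df begin
    @group i by i.children into g
    # g is a Grouping: g.key holds the value grouped on (i.children here),
    # and g itself can be treated like a vector of the rows in that group.
    # On newer Query.jl releases, key(g) may be needed instead of g.key.
    @select {Key=g.key, Count=length(g)}
    @collect DataFrame
end
```

Only the key and things computed over the group's elements (such as `length(g)`) are available in the `@select`, which is why a per-row column cannot be read directly off the group variable.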
Could you maybe post the structure of the input tables? I.e. what columns they have? That would make it easier to understand what you are trying to do. I’ll be offline until Tue, though, so will take a while to respond.