Metaprogramming Queryverse/Query Pipelines

Hello,

I haven’t done any real metaprogramming yet (other than tinkering around with it), and I have a situation that I think could be a good fit for a metaprogramming solution. I’m working with a dataset in Queryverse and running the same pipeline numerous times for different variables, so I’d like to wrap it in a function and pass in the variable that I want:

# For example, to group by age, I would do this:

by_age = naws |>
    @groupby(_.AGE) |>
    @map( { AGE= key(_), Count = sum(_.PWTYCRD) } ) |>
    @orderby(_.AGE) |>
    DataFrame

# And to group by gender, I would do this:

by_gender = naws |>
    @groupby(_.GENDER) |>
    @map( { GENDER= key(_), Count = sum(_.PWTYCRD) } ) |>
    @orderby(_.GENDER) |>
    DataFrame

# I tried to do this, but I get an error:

function group_map(user_var::String)
    return @eval :(naws |>
    @groupby(_.$user_var) |>
    @map( { $user_var = key(_), Count = sum(_.PWTYCRD) } ) |>
    @orderby(_.$user_var) |>
    DataFrame)
end

group_map("AGE")

# Output:

ERROR: UndefVarError: user_var not defined

Does anyone know how to make this work?

So I have another clue, which is shining a light on how much of a train wreck the above code is :dizzy_face:. When I define a user_var variable in the global scope, and then run the function, I get:

user_var = "AGE"
group_map("AGE")

# Output:

:((((naws |> #= REPL[69]:3 =# @groupby(_.:("AGE"))) |> #= REPL[69]:4 =# @map({"AGE" = key(_), Count = sum(_.PWTYCRD)})) |> #= REPL[69]:5 =# @orderby(_.:("AGE"))) |> DataFrame)

So it appears that @eval is looking for a global variable when it evaluates user_var and it’s not going to be able to get this value from the function argument. It also appears that my function is just returning an expression without actually evaluating it…
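To make that diagnosis concrete, here is a minimal sketch in plain Julia with no Queryverse involved. The names `row`, `col`, `returns_expr`, and `returns_value` are made up for illustration, and the NamedTuple is toy data standing in for a table row. It only aims to show the difference between `@eval :( ... )` and `@eval ...` inside a function:

```julia
# Toy stand-ins (not from the original post): a NamedTuple plays the role of a row.
row = (AGE = 14.0, GLOBAL_COL = -1.0)
col = :GLOBAL_COL   # a global, like the global `user_var` defined above

function returns_expr(col)
    # `@eval` evaluates the quote itself at global scope, so `$col` is filled in
    # from the GLOBAL `col`, and the result is an unevaluated Expr.
    return @eval :(row.$col)
end

function returns_value(col)
    # Without the extra `:( )`, `$col` is spliced in from the function argument
    # first, and the resulting code `row.AGE` is then evaluated.
    return @eval row.$col
end

ex = returns_expr(:AGE)
println(typeof(ex))           # an Expr, not a value
println(eval(ex))             # value of row.GLOBAL_COL, not row.AGE
println(returns_value(:AGE))  # value of row.AGE
```

If the global `col` were not defined, `returns_expr(:AGE)` would be expected to fail with an `UndefVarError`, which matches the first error in this thread.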

Metaprogramming for me is kind of this elusive Shangri-La that I just can’t seem to nail down :grin:

Try

function group_map(user_var::Symbol)
    return :(naws |>
    @groupby(_.$user_var) |>
    @map( { $user_var = key(_), Count = sum(_.PWTYCRD) } ) |>
    @orderby(_.$user_var) |>
    DataFrame)
end
eval(group_map(:AGE))

I’m getting closer to the Shangri-La of metaprogramming…I’ve just had a glimpse of it:

user_var = :(AGE)

@eval begin
    naws |>
        @groupby(_.$user_var) |>
        @map( { $user_var = key(_), Count = sum(_.PWTYCRD) } ) |>
        @orderby(_.$user_var) |>
        DataFrame
end

# Output:

77×2 DataFrame
│ Row │ AGE      │ Count    │
│     │ Float64⍰ │ Float64  │
├─────┼──────────┼──────────┤
│ 1   │ 14.0     │ 234.899  │
│ 2   │ 15.0     │ 408.661  │
│ 3   │ 16.0     │ 882.28   │
│ 4   │ 17.0     │ 1399.91  │
│ 5   │ 18.0     │ 2210.2   │
│ 6   │ 19.0     │ 2410.58  │
│ 7   │ 20.0     │ 2475.35  │
│ 8   │ 21.0     │ 2553.29  │
│ 9   │ 22.0     │ 2291.98  │
│ 10  │ 23.0     │ 2426.32  │
⋮

Now, I just need to figure out how to make this work in a function call…or if I can figure out how to use metaprogramming to generate functions for all of the variables I’m interested in, full Nirvana will be achieved and I think I’ll just retire. :stuck_out_tongue_winking_eye:
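On the idea of generating one function per variable: a common pattern is a loop around `@eval` that splices both the function name and the column name into a definition. The sketch below is only an illustration under stated assumptions. It swaps the Query pipeline for a plain loop over NamedTuple rows so it runs without Queryverse, the `by_age`/`by_gender` names are hypothetical, and the sample rows are invented toy values:

```julia
# Toy rows (invented values) standing in for the real table.
rows = [
    (AGE = 14.0, GENDER = 1.0, PWTYCRD = 1.0),
    (AGE = 15.0, GENDER = 0.0, PWTYCRD = 2.0),
    (AGE = 14.0, GENDER = 0.0, PWTYCRD = 3.0),
]

# Define by_age(rows) and by_gender(rows) in one go.
for col in (:AGE, :GENDER)
    fname = Symbol("by_", lowercase(String(col)))  # e.g. :by_age
    @eval function $fname(rows)
        # Sum the weight column within each group of `col`.
        totals = Dict{Any,Float64}()
        for r in rows
            totals[r.$col] = get(totals, r.$col, 0.0) + r.PWTYCRD
        end
        # Return group => total pairs ordered by group key.
        return sort(collect(totals); by = first)
    end
end

println(by_age(rows))
println(by_gender(rows))
```

The same loop shape should carry over to a Query pipeline by replacing the function body, since the `$col` splice works the same way in both cases.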

This is a nice solution! I’m wondering if, instead of having to do eval(), I can write a macro @group_map that would do the same thing? So the final solution would be @group_map :AGE, or something like that.

So this appears to work:

function group_map(user_var::Symbol)
    @eval begin
        naws |>
            @groupby(_.$user_var) |>
            @map( { $user_var = key(_), Count = sum(_.PWTYCRD) } ) |>
            @orderby(_.$user_var) |>
            DataFrame
    end
end

group_map(:AGE)

# Output:

77×2 DataFrame
│ Row │ AGE      │ Count    │
│     │ Float64⍰ │ Float64  │
├─────┼──────────┼──────────┤
│ 1   │ 14.0     │ 234.899  │
│ 2   │ 15.0     │ 408.661  │
│ 3   │ 16.0     │ 882.28   │
│ 4   │ 17.0     │ 1399.91  │
│ 5   │ 18.0     │ 2210.2   │
│ 6   │ 19.0     │ 2410.58  │
│ 7   │ 20.0     │ 2475.35  │
│ 8   │ 21.0     │ 2553.29  │
│ 9   │ 22.0     │ 2291.98  │
│ 10  │ 23.0     │ 2426.32  │
⋮

I :heart: Julia

Just in case anyone else is interested in this, I’ve taken it a step further and it’s getting really cool (at least for my use case :slightly_smiling_face:):

function filter_group(filter_expr::Expr, group_var::Symbol)
    @eval begin
        naws |>
            @filter($filter_expr) |>
            @groupby(_.$group_var) |>
            @map( { $group_var = key(_), Estimate = sum(_.PWTYCRD) } ) |>
            @orderby(_.$group_var) |>
            DataFrame 
    end
end

# You can then call the function with whatever filtering criteria/grouping variable you need:

filter_group(:(_.FLC == 1.0), :WAGET1)

You most certainly can; it should be quite straightforward :wink:
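For anyone looking for a starting point, a rough sketch of such a macro might look like the following. `@group_map` is a hypothetical name, not something Queryverse provides, and the sketch assumes a table called `naws` with a `PWTYCRD` column as in the posts above. Defining the macro does not need Queryverse; calling it does, because the nested Query macros are resolved at the call site:

```julia
# Sketch only: `@group_map` is a hypothetical macro, not part of Queryverse.
macro group_map(var)
    # Accept both `@group_map AGE` (a Symbol) and `@group_map :AGE` (a QuoteNode).
    col = var isa QuoteNode ? var.value : var
    col isa Symbol || error("@group_map expects a column name, e.g. AGE or :AGE")
    # `esc` leaves `naws`, `DataFrame`, and the Query macros to be resolved
    # where the macro is called, so Queryverse must be loaded there.
    return esc(quote
        naws |>
            @groupby(_.$col) |>
            @map({$col = key(_), Count = sum(_.PWTYCRD)}) |>
            @orderby(_.$col) |>
            DataFrame
    end)
end

# Inspect one level of expansion without running the pipeline.
expanded = macroexpand(@__MODULE__, :(@group_map AGE); recursive = false)
println(expanded)

# Intended use, assuming `using Queryverse` and a table named `naws`:
# by_age = @group_map AGE
```

Compared with the `@eval` version, a macro does its splicing at expansion time, so it should also work with a local `naws` inside a function rather than only with globals.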
