Suggestion: move DataFrames, plotting into standard distribution

davidanthoff · February 20, 2018, 10:20pm

So at least for Query.jl I’m trying hard to not change the semantics of syntax that is valid outside the macro when it is used inside the macro. So that is why I don’t want to treat every name as a column name, and then make an exception for function calls inside these macros.

The reason I have _.foo right now is that of course Query.jl doesn’t even know what a table is, or a column, for that matter. Really the only syntax change that happens in the macro is that _ stands for the current element, and if that happens to be a named tuple, everything looks like a table story automatically. But _ can also be anything else, from a tuple to some custom type to a scalar, depending on what you query.
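
To make the "_ stands for the current element" idea concrete, here is a minimal sketch in plain Julia. It is not Query.jl's implementation: the macro name @each and the helper replace_underscore are hypothetical, made up for this illustration. The only thing the macro does is swap _ for the argument of an anonymous function; it knows nothing about tables or columns.

```julia
# Hypothetical helper (not part of Query.jl): walk an expression and
# replace every bare `_` symbol with `sym`, leaving everything else alone.
replace_underscore(ex, sym) = ex
replace_underscore(ex::Symbol, sym) = ex === :_ ? sym : ex
replace_underscore(ex::Expr, sym) =
    Expr(ex.head, map(a -> replace_underscore(a, sym), ex.args)...)

# Hypothetical macro: turn `@each(_.a + 1)` into `x -> x.a + 1`.
macro each(body)
    x = gensym("elem")
    esc(:($x -> $(replace_underscore(body, x))))
end

# When the elements are named tuples, `_.a` reads like a column access.
rows = [(a = 1, b = "x"), (a = 2, b = "y")]
plus_one = map(@each(_.a + 1), rows)

# The same macro works when the elements are plain scalars.
doubled = map(@each(_ * 2), [1, 2, 3])
```

Because ordinary names are left untouched, code that is valid outside the macro keeps its meaning inside it, which is the property described above.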

Having said that, the more I think about the $foo option as a shortcut for _.foo, the more I like it…

pdeffebach · February 20, 2018, 10:24pm

Weird question, but do you have a sound in mind for _?

I know that the tidyverse team had trouble with dplyr for a while because individuals hated writing %>% since they couldn’t say it in their head. They realized they had to teach people to read %>% as “and” or “then” to make typing easier.

Should I read _.a as this.a?

davidanthoff · February 20, 2018, 11:00pm

Ha, haven’t thought about that at all! this might be fine… Or current, or, in the case of a table, row? I’m not sure, but it is a good point…

bramtayl · February 21, 2018, 1:58am

I think if we are looking for anything even vaguely generic, we’ll have to prefix column names with something in these kinds of packages.

While it was working, LazyQuery solved this problem. It had to jump through hoops to do it. To use it correctly, you had to use LazyContext on regular code so that variables got assigned to and taken from a dict, instead of the global namespace. LazyContext ballooned until it became basically R built in Julia. R assigns variables to a dynamic nested environment that can be modified and accessed. Code can be evaluated dynamically in any scope you wish. This is patently impossible in vanilla Julia (and I still maintain that base Julia mismarkets itself as a fully featured replacement for R). There were a couple of problems:

Performance problems (avoidable with typical R vectorization/lazy evaluation tricks)
Maintenance. Supporting code that can transform all of Julia syntax to R semantics was really tricky.

So TLDR I agree, but it took me a while to get there. If people are interested I could rehabilitate LazyQuery once 1.0 stabilizes.

bramtayl · February 21, 2018, 2:01am

That’s basically how DataFramesMeta works now, with :foo instead. I’m sure it wouldn’t be too hard to get DataFramesMeta surface syntax to yield Query-style iterators.

davidanthoff · February 21, 2018, 3:31am

Yeah, LazyQuery.jl was awesome, but in the end my main takeaway was that doing this in Julia is just too involved. That was an important insight, though, and shaped my thinking about this a lot.
