A few days ago I became interested in Query.jl, and I found the idea of implementing something like it on my own incredibly exciting, so I finally did.
I’m here to present this article, Write You A Query Language, which is a tutorial on using MLStyle.jl to process complex macros gracefully, and which also presents a distinct prototype and some ideas for query languages in Julia.
I don’t mean to announce a new query language package. Although my prototype may be quite clean, efficient, and robust, with more functionality and better extensibility, the main aim here is to tell you about some very useful tools for AST manipulation.
The main advantages of this prototype (tentatively called MQuery) are:

- Support for field names that are not regular identifiers, e.g., @select _."middle school".
- No NamedTuple usage when executing computations, which might lead to better efficiency when the scale becomes extremely large.
- Instead of providing fixed syntactic support for functions like startswith, endswith, and occursin, as Query.jl does, MQuery allows arbitrary field-name filter functions. E.g., _.(predicate1(a, b, c), !predicate2(d, e)) means that a selected field, call it field, must satisfy predicate1(field, a, b, c) && !predicate2(field, d, e).
- Support for extending query commands such as @select, @where, and so on via registered_ops.
- Support for closure references in arbitrary query expressions.
- You can add custom query support for your own datatypes via 3 interface functions: get_fields, get_records, and build_result. The code below implements querying for DataFrames:

      get_fields(df :: DataFrame) = names(df)
      get_records(df :: DataFrame) = zip(DataFrames.columns(df)...)
      function build_result(::Type{DataFrame}, fields, typs, source :: Base.Generator)
          # one empty, typed column per output field
          res = Tuple(typ[] for typ in typs)
          # append each record's values to the matching columns
          for each in source
              push!.(res, each)
          end
          DataFrame(collect(res), fields)
      end

- The code base is pretty small, and MQuery depends only on MLStyle.jl (which depends on nothing) and DataStructures.jl, so MQuery is easy to integrate with.
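To make the three-function interface more concrete, here is a minimal sketch for a toy column-oriented table. It does not use MQuery itself: ColumnTable is a hypothetical type invented for illustration, and get_fields, get_records, and build_result are defined locally as stand-ins that mirror the DataFrame methods above, so the real package may differ in the details.

```julia
# Hypothetical toy table: column names plus one vector per column.
struct ColumnTable
    names::Vector{Symbol}
    columns::Vector{Any}
end

# Local stand-ins for the three interface functions (assumed shapes,
# modelled on the DataFrame methods shown in the post).
get_fields(t::ColumnTable) = t.names
get_records(t::ColumnTable) = zip(t.columns...)   # iterate row by row as tuples

function build_result(::Type{ColumnTable}, fields, typs, source)
    res = Tuple(typ[] for typ in typs)   # one empty typed column per field
    for each in source
        push!.(res, each)                # scatter the record into the columns
    end
    ColumnTable(collect(fields), Any[col for col in res])
end

t = ColumnTable([:name, :year],
                Any[["Julia", "Ruby", "Python"], [2012, 1995, 1990]])

# A hand-written "query": keep the records whose year is after 1990.
source = (rec for rec in get_records(t) if rec[2] > 1990)
out = build_result(ColumnTable, get_fields(t), (String, Int), source)
```

The point of the split is that the query engine only ever sees field names and a stream of records, and hands back typed columns for the datatype to reassemble.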
The reason MQuery can be this powerful is that MLStyle.jl helps with all sorts of AST manipulations and makes them extremely easy. As a result, I can focus on other aspects, such as functionality and extensibility.
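As a rough illustration of the kind of AST manipulation involved, the sketch below rewrites the _ placeholder by hand in plain Julia, without MLStyle.jl. The names rewrite_placeholder and @where_sketch are hypothetical and are not part of MQuery; records are assumed to be Dicts keyed by Symbol purely to keep the example self-contained.

```julia
# Replace every `_.field` or `_."field name"` in `ex` with `rec[:field]`.
function rewrite_placeholder(ex, rec::Symbol)
    ex isa Expr || return ex                    # literals and symbols stay as they are
    if ex.head === :. && length(ex.args) == 2 && ex.args[1] === :_
        field = ex.args[2]
        field isa QuoteNode && (field = field.value)   # unwrap the quoted field name
        if field isa Union{Symbol, AbstractString}
            return Expr(:ref, rec, QuoteNode(Symbol(field)))
        end
    end
    # Otherwise rebuild the node with rewritten children.
    Expr(ex.head, (rewrite_placeholder(arg, rec) for arg in ex.args)...)
end

# Hypothetical macro: turn a predicate over `_` into a function of one record.
macro where_sketch(ex)
    body = rewrite_placeholder(ex, :rec)
    esc(:(rec -> $body))
end

rows = [
    Dict(:name => "Julia",      :year => 2012),
    Dict(:name => "Java",       :year => 1995),
    Dict(:name => "JavaScript", :year => 1995),
    Dict(:name => "Python",     :year => 1990),
]

prefix = "Java"   # an ordinary variable from the enclosing scope
keep = @where_sketch !startswith(_.name, prefix) && _.year > 1990
kept_names = [row[:name] for row in filter(keep, rows)]
```

Because the generated function is spliced into the caller's scope, prefix is captured like in any other closure. Writing every such rewrite by hand gets verbose quickly, which is the part a pattern-matching library is meant to take over.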
include("MQuery/MQuery.jl")
using Base.Enums
@enum TypeChecking Dynamic Static
df = DataFrame(
    Symbol("Type checking") => [
        Dynamic, Static, Static, Dynamic, Static, Dynamic, Dynamic, Static
    ],
    :name => [
        "Julia", "C#", "F#", "Ruby", "Java", "JavaScript", "Python", "Haskell"
    ],
    :year => [
        2012, 2000, 2005, 1995, 1995, 1995, 1990, 1990
    ]
)
df |>
    @where !startswith(_.name, "Java"),
    @groupby _."Type checking" => TC,
    @having TC === Dynamic,
    @select join(_.name, " and ") => result
│ Row │ result │
│ │ String │
├─────┼───────────────────────────┤
│ 1 │ Julia and Ruby and Python │
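For readers who want to check the result by hand, the same computation can be written in plain Julia. This is only a sketch of what the query means, not of how MQuery executes it; the variable names are made up for the example.

```julia
@enum TypeChecking Dynamic Static

type_checking = [Dynamic, Static, Static, Dynamic, Static, Dynamic, Dynamic, Static]
lang_names = ["Julia", "C#", "F#", "Ruby", "Java", "JavaScript", "Python", "Haskell"]

# @where: drop every language whose name starts with "Java".
kept = [(tc, name) for (tc, name) in zip(type_checking, lang_names)
        if !startswith(name, "Java")]

# @groupby: collect the names under their type-checking discipline.
groups = Dict{TypeChecking, Vector{String}}()
for (tc, name) in kept
    push!(get!(groups, tc, String[]), name)
end

# @having + @select: keep the Dynamic group and join its names.
result = [join(group_names, " and ") for (tc, group_names) in groups if tc === Dynamic]
# result == ["Julia and Ruby and Python"]
```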