In addition to fast, user-defined functions, as @xiaodai mentioned, there are at least two other advantages to being able to write queries in Julia. I use Polars at work, so I will provide examples in Polars, but the same could be done with Julia queries.
Code reuse
Code reuse in SQL is difficult, but it is easy in Polars. Here’s an example:
def shift_over(col_name, n):
return (
col(col_name)
.shift(n)
.over(
partition_by = ["serial_number", "trip"],
order_by = "timestamp"
)
)
df2 = df.with_columns(
x_lag_1 = shift_over("a", 1),
x_lead_1 = shift_over("a", -1),
y_lag_1 = shift_over("b", 1),
y_lead_1 = shift_over("b", -1)
)
Programmatically generate columns
It’s easy to programmatically generate new columns in Polars. Not so easy in SQL. Here’s an example in Polars that has both code reuse and programmatically generated columns:
def forward_circular_shift(col_name, n):
return (
col(col_name)
.tail(n)
.append(
col(col_name).head(pl.len() - n)
)
.alias(col_name + f"_lag_{n}")
)
df2 = (
df
.with_columns(
[
forward_circular_shift(col_name, n)
for col_name in ["a", "b", "c", "d", "e"]
for n in [1, 2, 3, 4, 5]
]
)
)