Enabling StatsBase functions in SQLite (Sqlite3Stats)

Hi guys,

I am not sure if it is really needed but it is useful (at least for me) to have aggregative functions such as correlation, skewness, kurtosis, covariance, median absolute deviations, median, geometric and harmonic means, quartiles, inter-quartiles range, etc, to easy use in SQL select statements in SQLite and Julia. So I created a “register-functions-and-go” package for this purpose. The usage is simple and it shares the same way of calling of other aggregative functions such like SUM, COUNT, AVG, etc.

In its current form, I registered the basic functions, so an SQL statement like

select MEDIAN(x) from Table

works fine, thank to the StatsBase package. It may be expanded to cover more functions so any researcher is freed re-registering functions in their custom works. The main logic underlying this design is to register function like

SQLite.register(db, [], 
        (x,y) -> vcat(x, y), 
        x -> StatsBase.quantile(x, 0.50), 
        name = "MEDIAN")

for single variable functions and

SQLite.register(db, Array{Float64, 2}(undef, (0, 2)), 
        (x, a, b) -> vcat(x, [a, b]'), 
        x -> StatsBase.cov(x[:,1], x[:,2]), 
        name = "COV", nargs = 2)

for multiple-variable functions.

Please let me know what you think and any contributions are welcome.

The repo is: https://github.com/jbytecode/Sqlite3Stats

README.md file includes the basic usage.

All the best and have a nice new year!

2 Likes

Happy to inform that I’ve just registered the package and it is available now. You can install and use it in the usual way. The README.md file provides all of the necessary information to use the package.

Github page: Sqlite3Stats.jl
A short doc: README.md

Installation:

julia> using Pkg
julia> Pkg.add("Sqlite3Stats")

This package is not a big deal but I hope it may help to those who want to query

SELECT MEDIAN(x) from Table

like queries in SQLite.

Have a nice vacation!

7 Likes