Strategies for reducing the number of queries to a DB

This might not be a Julia question per se, but I wasn’t sure whether it belongs here or in Off-Topic.

Say I am writing a package for my company’s corporate reporting, and I have a large table in a database.
If I need to derive two different metrics from the same dataset and create a function to derive each metric, how could I avoid querying the database in each function call and thus having to wait for the query to return twice when compiling the report?

Right now I am doing something like this:

module foo
__precompile__(false)

function querythedata()
    # Wrap some sql
end

# Runs the query once, when the module is loaded
const data = querythedata()

function metric1()
    # Create a metric from data
end

function metric2()
    # Create another metric from data
end

end

This seems a bit clumsy to me but I can’t for the life of me think of a better way to do this.

I suspect the reason you can’t just do

function main()
  data = querythedata()
  metric1(data)
  metric2(data)
end

is that you want the data to be predefined in the module somehow?
The other way to do this would be to have the DB query return a DataFrame. Then you can easily pass the DataFrame around and do all sorts of interesting calculations on it in a “DB style”.
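A minimal sketch of that "query once, pass the result around" idea. The column names and values are made up for illustration, and a NamedTuple of vectors stands in for the DataFrame so the snippet runs without a database or any packages; the real `querythedata` would wrap the SQL instead.

```julia
# Hypothetical stand-in for the real SQL call. It returns a column table
# (a NamedTuple of vectors) with made-up columns `region` and `sales`.
function querythedata()
    return (region = ["north", "south", "north"], sales = [10.0, 20.0, 5.0])
end

# Each metric receives the already-fetched data instead of querying itself.
metric1(data) = sum(data.sales)
metric2(data) = sum(data.sales[data.region .== "north"])

function main()
    data = querythedata()   # the only round trip to the database
    return (total = metric1(data), north = metric2(data))
end
```

Calling `main()` hits the (fake) database once and feeds the same table to both metrics.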

Yeah, unfortunately the raw data needs to be exposed to the user. There is a big organizational attachment to R, so shipping data with internal packages has become common. Unfortunately, we didn’t anticipate the overhead that creates when the DataFrame constant gets large.
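One possible way around that load-time overhead, sketched here as an assumption rather than something proposed in the thread, is to fetch the data lazily and cache it: the query runs on first access instead of when the module is loaded. The module name `ReportData`, the accessor `getdata`, the counter `QUERY_COUNT`, and the sample columns are all hypothetical.

```julia
module ReportData

# Hypothetical stand-in for the real SQL call; the counter only exists
# to show how many times the "database" was hit.
const QUERY_COUNT = Ref(0)
function querythedata()
    QUERY_COUNT[] += 1
    return (region = ["north", "south"], sales = [10.0, 20.0])
end

# Cache slot: holds `nothing` until the data is first requested.
const CACHE = Ref{Any}(nothing)

# Return the raw data, querying only on first use or when `refresh = true`.
function getdata(; refresh::Bool = false)
    if refresh || CACHE[] === nothing
        CACHE[] = querythedata()
    end
    return CACHE[]
end

# Metrics default to the cached data but also accept data passed in.
metric1(data = getdata()) = sum(data.sales)
metric2(data = getdata()) = maximum(data.sales)

end # module
```

Users can still get at the raw data through `ReportData.getdata()`, and calling both metrics triggers a single query. The `Ref{Any}` cache is not type-stable and not thread-safe, so treat this as a starting point only.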