I have a very large database, A, and a much smaller one, B.
A is so large that I cannot load it into memory all at once. I keep A on disk as a gzip-compressed CSV file, and I want to load it lazily.
B, on the other hand, is small and fits comfortably in memory.
The goal is to perform a join between them on a certain column shared by both A and B. See this example: https://juliadata.github.io/DataFrames.jl/stable/man/joins/.
How can I do this without ever loading A fully into memory?
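
To make the setup concrete, here is a minimal sketch of the kind of streaming join I have in mind. It rests on several assumptions: the shared column is called id, an inner join is wanted, the keys in B are unique, and the CSV has no quoted fields, so a naive split on commas is enough. The toy file, the column names, and the lookup dictionary are all hypothetical. It uses CodecZlib.jl to decompress A line by line and a Dict built from B for the lookups.

```julia
using CodecZlib, DataFrames

# Hypothetical toy data: a tiny gzip-compressed CSV standing in for A.
path = joinpath(mktempdir(), "A.csv.gz")
open(GzipCompressorStream, path, "w") do io
    write(io, "id,value\n1,10\n2,20\n3,30\n")
end

# B is small, so it lives in memory; `id` is the assumed shared column.
B = DataFrame(id = [1, 3], label = ["x", "z"])

# Index B by the join key (assumes the keys in B are unique).
lookup = Dict(row.id => row.label for row in eachrow(B))

# Matching rows are collected here; only the join result is held in memory.
result = DataFrame(id = Int[], value = Int[], label = String[])

# Stream A one line at a time, so the full file is never materialized.
open(GzipDecompressorStream, path) do io
    readline(io)                      # skip the header line
    for line in eachline(io)
        fields = split(line, ',')     # naive parsing: no quoted fields
        id = parse(Int, fields[1])
        if haskey(lookup, id)         # inner join: keep matching rows only
            push!(result, (id, parse(Int, fields[2]), lookup[id]))
        end
    end
end
```

Only B, the lookup table, and the joined rows stay in memory here. If the join result itself could be too large, the matching rows would have to be written to disk instead of pushed into a DataFrame.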