I am currently processing a dataset with 31 columns and 100,000 rows. Loading the data, running describe on it, etc. is fine, but running a query to match some rows is very slow. 100K rows is sizeable but still not very big (I might be mistaken here).
At first I thought the latency came from Jupyter notebook and/or Jupyter lab, but then I ran a *.jl script and it was just as slow.
I then started Julia as julia -p 16 (on a server with 16 cores) and ran the script, but it is still very slow.
I am currently using Query.jl for my queries, but I assume it would be the same with other packages (I need to check).
I have not played with multiprocessing in Julia yet (I thought it would do a bit of magic for me automatically)…
Does anybody have a solution for how to use multiprocessing with tables/dataframes?
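For what it's worth, here is a sketch of what I understand the parallel route to look like. Note that `julia -p 16` starts separate worker processes that do not share the DataFrame's memory; threads (`julia -t 16`) keep it in shared memory, which seems simpler for this case. The column names and data below are made up to match my query:

```julia
using DataFrames

# Hypothetical data with the same column names as in my query.
df = DataFrame(X = rand(1:30, 100_000),
               Y = rand(1:30, 100_000),
               O = rand(1:30, 100_000))

# Start Julia with `julia -t 16` (threads, not `-p` processes) for this to
# actually run in parallel. Use Vector{Bool}, not a BitVector: concurrent
# writes to a BitVector's packed words are not thread-safe.
mask = Vector{Bool}(undef, nrow(df))

# Pull the columns out into local variables once, so the loop body is
# type-stable instead of repeatedly looking up untyped fields.
X, Y, O = df.X, df.Y, df.O
Threads.@threads for i in 1:nrow(df)
    mask[i] = ((X[i] < 10 && Y[i] > 15) || (X[i] > 15 && Y[i] < 10)) && O[i] > 10
end

result = df[mask, :]
```

I am not sure threading is even needed at 100K rows, but this is the pattern I had in mind.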
My query is as follows:
@from row in df begin
    @where ((row.X < 10 && row.Y > 15) || (row.X > 15 && row.Y < 10)) && (row.O > 10)
    @select row
    @collect DataFrame
end
and it does not seem like a complex query…
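For comparison, the same filter written with plain vectorized boolean indexing in DataFrames.jl, which avoids per-row iteration entirely (column names and data are hypothetical, matching my query above):

```julia
using DataFrames

# Hypothetical example data with the same column names as in my query.
df = DataFrame(X = rand(1:30, 100_000),
               Y = rand(1:30, 100_000),
               O = rand(1:30, 100_000))

# Each broadcast comparison produces a boolean vector over the whole column;
# comparisons must be parenthesized because `.&` binds tighter than `.<`.
mask = ((df.X .< 10) .& (df.Y .> 15) .| (df.X .> 15) .& (df.Y .< 10)) .& (df.O .> 10)

result = df[mask, :]
```

If this runs fast while the Query.jl version does not, that would suggest the slowdown is in how the query iterates rows rather than in the data size.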