So far, I don’t see any additional information provided, so perhaps I’ll clarify my point of view. As I understand it, the question was about “parallel feeding of a dataframe,” with the additional detail that the data is “inside CSV files.” I still maintain that a parallel filesystem is the closest solution that comes to mind to address this problem. As for the other suggestions, a short summary is below:
Apache Arrow is an in-memory columnar data format. Very good Julia support.
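For example, a common pattern is to convert the CSV files to Arrow once and then memory-map them on every subsequent load. A minimal sketch with Arrow.jl (the file names are placeholders, and CSV.jl / DataFrames.jl are assumed to be installed):

```julia
using Arrow, CSV, DataFrames

# One-off conversion: parse the CSV and store it in Arrow format
# ("data.csv" and "data.arrow" are placeholder names).
df = CSV.read("data.csv", DataFrame)
Arrow.write("data.arrow", df)

# Subsequent loads are cheap: Arrow.Table memory-maps the file and
# DataFrame materializes it (pass copycols=false to avoid copying
# the memory-mapped columns).
tbl = Arrow.Table("data.arrow")
df2 = DataFrame(tbl)
```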
Apache Parquet is a columnar storage file format. Very good Julia support.
HDF5 is a file format designed for storing hierarchical data structures. Very good Julia support.
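Note that HDF5 stores groups and arrays rather than data frames directly, so a typical use is dumping numeric columns or matrices. A minimal sketch with HDF5.jl (file and dataset names are placeholders):

```julia
using HDF5

A = rand(1_000, 10)                  # placeholder numeric data

# Write the matrix into a dataset inside the file...
h5write("data.h5", "mygroup/A", A)

# ...and read it back later.
B = h5read("data.h5", "mygroup/A")
```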
QuestDB is a high-performance, open-source time-series database optimized for time-stamped data. Julia support could be slightly better.
DuckDB is an embedded, in-process OLAP database designed for analytical workloads. Really great Julia support.
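In particular, DuckDB can scan a set of CSV files in parallel and hand the result straight to DataFrames.jl. A minimal sketch with DuckDB.jl (the glob pattern and thread count are placeholders):

```julia
using DuckDB, DataFrames

# In-memory DuckDB instance via the DBInterface API.
con = DBInterface.connect(DuckDB.DB, ":memory:")

# Let DuckDB parallelize the CSV scan over several threads.
DBInterface.execute(con, "SET threads TO 8")
df = DataFrame(DBInterface.execute(con,
    "SELECT * FROM read_csv_auto('data/*.csv')"))

DBInterface.close!(con)
```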
Umbra is a research-oriented, high-performance relational database derived from HyPer; it aims at in-memory-level performance on top of disk-based (SSD) storage, with advanced query compilation and execution. As far as I know, there is no Julia support; however, it is PostgreSQL-compatible.
ClickHouse is a distributed, columnar OLAP database designed for large-scale analytics. It is very fast and versatile; however, achieving its full performance can be quite time-consuming. As far as I know, there are two Julia packages for it.
Distributed is a Julia standard library providing tools for distributed parallel processing.
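Since the original question was about feeding a DataFrame from CSV files in parallel, here is a minimal Distributed-based sketch (the "data" directory is a placeholder, and CSV.jl / DataFrames.jl are assumed to be installed on all workers):

```julia
using Distributed
addprocs(4)                          # spawn 4 worker processes; adjust as needed

@everywhere using CSV, DataFrames

# Each worker parses a subset of the files.
files = filter(endswith(".csv"), readdir("data"; join=true))
parts = pmap(f -> CSV.read(f, DataFrame), files)

# Concatenate the per-file results on the main process.
df = reduce(vcat, parts)
```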
I’m very sorry, I’m afraid I don’t have much to add at this very moment; however, maybe an MWE (Minimal Working Example) from your side would help, or maybe my colleagues have some additional suggestions complementing the ones they provided above.
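If it helps as a starting point for such an MWE: CSV.jl itself can already parse a file, or a vector of files, with multiple concurrent tasks. A minimal sketch (assuming Julia was started with several threads, e.g. `julia -t 8`, and that the "data" directory is again a placeholder):

```julia
using CSV, DataFrames

files = filter(endswith(".csv"), readdir("data"; join=true))

# CSV.read accepts a vector of files and parses them with ntasks
# concurrent tasks, vertically concatenating the results.
df = CSV.read(files, DataFrame; ntasks=Threads.nthreads())
```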