Reading data with 0 row groups

phchavesmaia · March 24, 2025, 7:07pm

Hello everyone,

I am trying to work with big data in Julia. I just exported a .parquet file/folder using Python’s Dask library and want to load it in Julia. Here is a representative version of my code:

using DataFrames
using Parquet2: Dataset

ds = Dataset("path/to/my/file.parquet/")
df = DataFrame(ds; copycols=false)

Julia reports that the ds object refers to a dataset of 22146824 bytes, so it is definitely non-empty. I also cross-checked this in Python, where Dask reads the data properly.

However, df is an empty DataFrame, and when I run println("Number of row groups: ", length(ds.row_groups)) it reports 0 row groups, which is unexpected.

Does anyone have any tips or insights to share?

Bestest,
P.
