KeyError when writing Parquet files to S3

a.frist · April 8, 2022, 1:09pm

It turns out that the S3 write component cannot handle columns that contain only missing values. I used the following code to replace the missing values in the DataFrame with NaN before writing.

@rtransform!(my_df, :MY_COLUMN = ismissing(:MY_COLUMN) ? NaN : :MY_COLUMN)

I applied the above row transform macro to all columns that only contained missing values.
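
For anyone who would rather not name each column by hand, here is a minimal sketch of the same idea using only DataFrames.jl. It loops over the columns and replaces every all-missing column with a Float64 column of NaN. The example data and the column names `a` and `b` are made up for illustration, and I have not checked this against the S3 writer itself.

```julia
using DataFrames

# Hypothetical example data: column :b holds only missing values.
df = DataFrame(a = [1.0, 2.0, 3.0], b = [missing, missing, missing])

# Replace every column that contains only missing values
# with a Float64 column filled with NaN.
for name in names(df)
    if all(ismissing, df[!, name])
        df[!, name] = fill(NaN, nrow(df))
    end
end
```

Columns that contain at least one non-missing value are left untouched, so a column with a mix of values and missings would still need separate handling.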
