There seems to be some confusion between on-disk storage formats and Julia DataFrames, which live in memory. You can save an in-memory DataFrame in a number of formats, such as Parquet or CSV, but to display it you need to load it back into RAM, most likely as a DataFrame.
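A minimal sketch of that distinction, assuming only CSV.jl and DataFrames.jl are installed (the file name example.csv is made up):

using CSV, DataFrames
df = DataFrame(a = 1:3, b = ["x", "y", "z"])  # lives in RAM
CSV.write("example.csv", df)                  # persisted to disk as CSV
df2 = CSV.read("example.csv", DataFrame)      # loaded back into RAM to display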
Another approach is to go through CSV.jl and Arrow.jl, although note that Arrow.write produces an Arrow IPC file rather than a true Parquet file:
using CSV, DataFrames, Arrow
df = CSV.read("FILE_PATH", DataFrame)
Arrow.write("newfile.arrow", df)  # writes the Arrow IPC format, not Parquet
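If the goal is an actual Parquet file, one option is Parquet2.jl; a minimal sketch, assuming Parquet2.jl is installed (FILE_PATH and newfile.parquet are placeholders):

using CSV, DataFrames, Parquet2
df = CSV.read("FILE_PATH", DataFrame)      # load the CSV into memory
Parquet2.writefile("newfile.parquet", df)  # write it out as Parquet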
With TidierFiles.jl there are a couple of ways you can do it:
using TidierFiles
mtcars_path = "https://gist.githubusercontent.com/seankross/a412dfbd88b3db70b74b/raw/5f23f993cd87c283ce766e7ac6b329ee7cc2e1d1/mtcars.csv"
destination_path = "/path/to/output/test_car2.parquet"
# keeps an intermediate DataFrame in a variable
df = read_csv(mtcars_path)
write_parquet(df, destination_path)
# no intermediate variable
write_parquet(read_csv(mtcars_path), destination_path)
With Tidier.jl, @chain is re-exported, for example, allowing you to do the following as well, once again without saving a local intermediate:
using Tidier
@chain begin
    read_csv(mtcars_path)
    write_parquet(_, destination_path)
end
Test your results:
read_csv(destination_path)      # errors, since the file is not a CSV
read_parquet(destination_path)  # succeeds
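As a further sanity check, one could compare the shape and column names of the round-tripped table with the original; a minimal sketch, reusing mtcars_path and destination_path from above:

df_csv = read_csv(mtcars_path)           # original CSV, loaded into memory
df_pq  = read_parquet(destination_path)  # round-tripped Parquet file
size(df_csv) == size(df_pq)              # same number of rows and columns
names(df_csv) == names(df_pq)            # same column names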