Different behaviour between CSV.write() and DataFrame's writetable()

IljaK91 · February 27, 2018, 12:32pm

I am currently using writetable() to write CSVs from my DataFrames, and it works just fine. Since writetable() will be deprecated as soon as Pkg.update() gives me DataFrames 0.11, I want to use CSV.write() instead. However, I get the following error:

TypeError: streamto!: in typeassert, expected Float64, got DataArrays.NAtype
streamto!(::CSV.Sink, ::Type{DataStreams.Data.Field}, ::DataFrames.DataFrame, ::Type{Float64}, ::Type{Float64}, ::Int64, ::Int64, ::DataStreams.Data.Schema{true}, ::Base.#identity) at DataStreams.jl:173
stream!(::DataFrames.DataFrame, ::Type{DataStreams.Data.Field}, ::CSV.Sink, ::DataStreams.Data.Schema{true}, ::DataStreams.Data.Schema{true}, ::Array{Function,1}) at DataStreams.jl:187
#stream!#5(::Array{Any,1}, ::Function, ::DataFrames.DataFrame, ::Type{CSV.Sink}, ::Bool, ::Dict{Int64,Function}, ::String, ::Vararg{String,N} where N) at DataStreams.jl:151
#write#36(::Bool, ::Dict{Int64,Function}, ::Array{Any,1}, ::Function, ::String, ::DataFrames.DataFrame) at Sink.jl:150
write(::String, ::DataFrames.DataFrame) at Sink.jl:150
include_string(::String, ::String) at loading.jl:522
include_string(::String, ::String, ::Int64) at eval.jl:30
include_string(::Module, ::String, ::String, ::Int64, ::Vararg{Int64,N} where N) at eval.jl:34
(::Atom.##100#105{String,Int64,String})() at eval.jl:75
withpath(::Atom.##100#105{String,Int64,String}, ::String) at utils.jl:30
withpath(::Function, ::String) at eval.jl:38
didWriteToREPL(::Atom.##99#104{String,Int64,String}) at repl.jl:129
hideprompt(::Atom.##99#104{String,Int64,String}) at repl.jl:65
macro expansion at eval.jl:73 [inlined]
(::Atom.##98#103{Dict{String,Any}})() at task.jl:80

As I said before, the same DataFrame is written without problems by writetable(). As I understand it, the issue is that some of my columns contain NAs. What do I need to change to make it work?

EDIT: I already found the problem. The main issue is that I switch back and forth between NaNs and NAs in my DataFrames, since different packages treat them differently. For example, here I need to have NaNs rather than NAs in some columns for CSV.write() to accept them. At the same time, if I run a regression using GLM.jl, a row with NaNs is only skipped correctly if I replace the NaNs with NAs. I didn't look too deeply into this issue, since this solved the problem I had and I have stopped using Julia for regressions (mainly due to poor options for applying different kinds of standard errors and for automatically printing output tables to LaTeX files).
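
For anyone reading this later, the back-and-forth conversion can be sketched roughly as follows. This is only a minimal illustration, and it assumes a current Julia version in which `missing` has taken the place of `NA`; the vector `x` is a made-up stand-in for a DataFrame column, not my actual data. As far as I know, recent CSV.jl versions can also write `missing` directly, so the first step may no longer be necessary there.

```julia
# Hypothetical column with one gap, standing in for a DataFrame column.
x = [1.0, missing, 3.0]

# For a writer that insists on plain Float64 values:
# replace every missing entry with NaN.
x_nan = coalesce.(x, NaN)

# For code that skips missing values but not NaN (e.g. before a regression):
# turn the NaN entries back into missing.
x_missing = [isnan(v) ? missing : v for v in x_nan]

println(eltype(x_nan))
println(eltype(x_missing))
```

Doing the conversion explicitly right before each call keeps it clear which representation a given package receives.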
