Julia 1.0.3 segfaults when trying to read a csv DataFrame with non-quoted headers and strings

tanhevg · March 6, 2019, 8:08pm

Hello,

My code goes:

using Queryverse
df = load("my_file.csv") |> DataFrame

And the file contents are:

query_domain,template_domain,tm_score
d1qbaa2,d2a73b2,0.52434
d1qbaa2,d2wnxa1,0.51702
d1qbaa2,d1p35a_,0.50272
d1qbaa2,d3zuca1,0.50237
d1qbaa2,d4b9pa_,0.49787
d1qbaa2,d3u9wa1,0.49737
....

The Julia process randomly segfaults when trying to execute this code. By "randomly" I mean that I have lots of files with similar structure, and the segfault occurs reliably on some of them but not on others. So far I have not been able to figure out what is special about the files that cause the segfault.

Wrapping all string tokens in the file in double quotes gets rid of the segfault.
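For anyone who wants to try the same quoting workaround, here is a minimal sketch in plain Julia. The helper names `quote_field`, `quote_line` and `quote_csv` are made up for illustration, and the code assumes fields contain no embedded commas or double quotes. Anything that parses as a number is left unquoted.

```julia
# Hypothetical helpers, not part of Queryverse or CSVFiles.jl.
# Assumes simple comma-separated fields with no embedded commas or quotes.

# Quote a field unless it parses as a floating-point number.
quote_field(field) =
    tryparse(Float64, field) === nothing ? string('"', field, '"') : String(field)

# Quote every non-numeric field on one line.
quote_line(line) = join(quote_field.(split(line, ',')), ',')

# Write a quoted copy of the file at `src` to `dst`.
function quote_csv(src, dst)
    open(dst, "w") do out
        for line in eachline(src)
            println(out, quote_line(line))
        end
    end
end
```

Something like `quote_csv("my_file.csv", "my_file_quoted.csv")` would then produce a copy in which the header names and the domain identifiers are quoted and the scores are left alone.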

I am running Julia v1.0.3 on Linux with Queryverse v0.2.0. I wonder if this has been reported before?

cstjean · March 7, 2019, 1:30am

I’ve had several segfaults on Julia 1.0.x, but they’ve been progressively patched. I would recommend trying Julia 1.1.0 if possible. It fixed at least one scary production segfault for us.

uadjet · March 7, 2019, 6:02am

Not really a solution, but I had a similar issue a few months ago. I ended up using CSV.jl (as opposed to Queryverse's CSVFiles.jl) to import the files and then piping that to a DataFrame. Definitely a workaround. I'd first do as @cstjean suggested and move to 1.1.0 before trying to rewrite any code.
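A minimal sketch of that workaround, assuming the CSV.jl and DataFrames.jl packages are installed. The helper name `read_with_csvjl` is made up for illustration. Whether it avoids the segfault on the affected files is untested here.

```julia
using CSV, DataFrames

# Hypothetical helper: parse the file with CSV.jl instead of CSVFiles.jl,
# then materialize the result as a DataFrame.
read_with_csvjl(path) = CSV.File(path) |> DataFrame
```

With the file from the original post, this would be called as `df = read_with_csvjl("my_file.csv")`.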
