CSV.jl "corrupts" data when a field is very large

Here’s another possible option:

Also, have you tried reading the file with `ntasks=1`? Multi-threaded parsing in CSV.jl sometimes has issues (e.g. with a `\n` character inside a quoted string), and running single-threaded can avoid these.
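
A minimal sketch of what that looks like (the file name here is just a placeholder for your actual file):

```julia
using CSV, DataFrames

# ntasks=1 forces single-threaded parsing, so the file is not split into
# chunks for parallel parsing; that chunking is where rows with very large
# or newline-containing quoted fields can get mis-split.
df = CSV.read("large_fields.csv", DataFrame; ntasks=1)
```

If the data then comes through intact, that would point at the multi-threaded chunking rather than the field contents themselves.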