Newbie using CSV with categorical=true

How do I use the categorical=true feature of CSV.validate(…)?
I’ve tried a bunch of variations, and keep getting the error “ERROR: TypeError: in setfield!, expected Union{Missing, CategoricalString{UInt32}}, got String”.

I have two test files called “data/tmp1.csv” and “data/tmp2.csv”. The contents of tmp1 are the two lines:

The contents of tmp2 are the two lines:

I’ve defined the following variables for use in the CSV.validate(…) call.

I’ve tried all the following variations, plus others, and still get the error above. Can someone tell me how to do this correctly?

CSV.validate("data/tmp1.csv", header=colNames, types=colTypes, categorical=true, allowmissing=:none, strict=true)
CSV.validate("data/tmp2.csv", header=colNames, types=colTypes, categorical=true, allowmissing=:none, strict=true)
CSV.validate("data/tmp1.csv", header=colNames, types=colTypes2, categorical=true, allowmissing=:none, strict=true)
CSV.validate("data/tmp2.csv", header=colNames, types=colTypes2, categorical=true, allowmissing=:none, strict=true)
CSV.validate("data/tmp1.csv", header=colNames, types=colTypes, categorical=true, allowmissing=:none, strict=true)
CSV.validate("data/tmp1.csv", header=colNames, types=colTypes, typemap=tmpmap, categorical=true, allowmissing=:none, strict=true)

Thanks for your attention and help!

Please see the bottom half of this thread:

Ask for more help if that does not get your stuff working.

I tried “add CSV#master” from pkg mode as suggested near the end of that thread, and that step succeeded, but I still get the same error.

What version of Julia are you using, and is it on Linux, OSX, or Windows?


versioninfo() gives that and more – would you post what it reports?
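
In case it helps, here is a minimal sketch of capturing the versioninfo() report as a string so it can be pasted into a post. The variable name report is only illustrative, and outside the REPL the InteractiveUtils standard library has to be loaded first.

```julia
using InteractiveUtils   # provides versioninfo outside the REPL

# Write the report into a buffer instead of printing it directly.
io = IOBuffer()
versioninfo(io)
report = String(take!(io))

# Show the captured text; it can now be copied or saved.
print(report)
```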

I’m on a Macbook Pro, OSX 10.13.6 (17G65).

Julia Version 1.0.1
Commit 0d713926f8 (2018-09-29 19:05 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin14.5.0)
  CPU: Intel(R) Core(TM) i5-5257U CPU @ 2.70GHz
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.0 (ORCJIT, broadwell)

  JULIA_EDITOR = atom -a


It might need something like colTypes=Dict(:senID=>Union{String, Missing}), but you could also try giving the file a header line and using fewer parameters, i.e. CSV.validate("data/tmp2.csv", categorical=true).

If all else fails, you might be able to build categorical strings from readdlm output:

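using DelimitedFiles
# In-memory stand-in for a two-line, one-column file.
testfile = IOBuffer("0001000\n0001000");
# Read each field as a String so the leading zeros survive.
readdlm(testfile, ',', String, header=false)
# returns a 2×1 matrix in which both entries are "0001000"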
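
A rough sketch of the remaining step, turning that readdlm output into a categorical column. This assumes the CategoricalArrays package is installed and uses its categorical and levels functions; the name senID is borrowed from the column above, and raw is just an illustrative variable name. It sidesteps CSV entirely, so it is a fallback rather than a fix for the CSV.validate error.

```julia
using DelimitedFiles      # readdlm
using CategoricalArrays   # assumed to be installed: categorical, levels

# In-memory stand-in for a one-column file without a header line.
testfile = IOBuffer("0001000\n0001000")

# Read every field as a String so the leading zeros are preserved.
raw = readdlm(testfile, ',', String, header=false)

# Flatten the n×1 matrix to a vector and pool the repeated strings.
senID = categorical(vec(raw))

# The distinct values found in the column.
levels(senID)
```

For a real file, the IOBuffer would be replaced by the file path, e.g. "data/tmp1.csv".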

That’s just a bug; I’ve filed an issue.

As a workaround, note that categorical=true shouldn’t make any difference for validation, so you can just drop that argument. It’s mostly useful for, and there it works AFAICT.

