Thanks for the suggestions! For now, a comprehension seems like the best easy choice:
    for n in names(df)
        df[!, n] = [x for x in df[!, n]]
    end
Explicitly using the type of the first element, as in typeof(df.a[1]).(df.a) (note typeof rather than eltype as was suggested, so that it works for arrays as well), is definitely less general. For example, it doesn’t work for Union{..., Nothing} columns, which are pretty common, or for other small unions, which comprehensions handle well.
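To make the difference concrete, here is a minimal sketch using a plain vector in place of a DataFrame column (the variable names are only for illustration). It assumes the column starts out with element type Any and contains a `nothing`:

```julia
# A stand-in for an untyped column that contains a `nothing`.
col = Any[1, 2, nothing]

# The comprehension widens the element type only as far as it needs to,
# so the result is a small union rather than Any.
narrowed = [x for x in col]
println(eltype(narrowed))

# Converting everything to the type of the first element breaks as soon
# as an element of another type shows up: Int(nothing) has no method.
failed = try
    typeof(col[1]).(col)
    false
catch err
    err isa MethodError
end
println(failed)
```

The same comprehension also copes with columns that mix, say, strings and `nothing`, whereas the typeof approach only succeeds when every element can be converted to the first element's type.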
For larger datasets where performance matters, it would be better to have a helper function that skips columns which already have proper types. Unfortunately, I don’t think it’s possible to determine whether the type is correct without checking all the values anyway…
I wonder if this would be a nice feature to build into DataFrames: something like narrowtypes!(df), which in its simplest form does your loop but could be made more efficient by skipping any column that already has a concrete element type. Like this:
    for n in names(df)
        isconcretetype(eltype(df[!, n])) && continue
        df[!, n] = [x for x in df[!, n]]
    end
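Wrapped up as a function, the loop above might look like the sketch below. Note that `narrowtypes!` is only the name proposed in this thread, not an existing DataFrames.jl function, and the example data frame is made up for illustration:

```julia
using DataFrames

# Hypothetical helper (not part of DataFrames.jl): narrow the element type
# of every column whose current element type is not concrete.
function narrowtypes!(df::AbstractDataFrame)
    for n in names(df)
        col = df[!, n]
        # Columns such as Vector{Float64} cannot be narrowed any further.
        isconcretetype(eltype(col)) && continue
        # Re-collecting through a comprehension picks the narrowest
        # element type that fits the values actually present.
        df[!, n] = [x for x in col]
    end
    return df
end

# Made-up example data: two untyped columns and one that is already typed.
df = DataFrame(a = Any[1, 2, 3], b = Any["x", nothing, "z"], c = [1.0, 2.0, 3.0])
narrowtypes!(df)
println(eltype.(eachcol(df)))
```

One caveat with this check: a column that has already been narrowed to a small union such as Union{Nothing, String} is still not concrete, so it gets re-collected on every call. Skipping those as well would need a different test.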