As I understand it, you are doing something like
julia> using Arrow, CategoricalArrays, DataFrames
julia> df = DataFrame(a = 1:4, b = string.('a':'d'), c = categorical(["x", "x", "y", "y"]))
4×3 DataFrame
 Row │ a      b       c
     │ Int64  String  Cat…
─────┼─────────────────────
   1 │     1  a       x
   2 │     2  b       x
   3 │     3  c       y
   4 │     4  d       y
julia> afn = Arrow.write("./df.arrow", df)
"./df.arrow"
julia> df1 = DataFrame(Arrow.Table(afn))
4×3 DataFrame
 Row │ a      b       c
     │ Int64  String  String
─────┼───────────────────────
   1 │     1  a       x
   2 │     2  b       x
   3 │     3  c       y
   4 │     4  d       y
julia> typeof(df1.c)
Arrow.DictEncoded{String, Int8, Arrow.List{String, Int32, Vector{UInt8}}}
You can’t “round trip” DataFrame → Arrow → DataFrame and get the same column types back. Is there a reason that you need a CategoricalArray instead of the Arrow.DictEncoded result? The Arrow.DictEncoded result can in some circumstances take up less storage than the CategoricalArray, because it uses the smallest signed integer type available for the refarray (Int8 in this case).
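If you do need a CategoricalArray after reading, one option is to convert the column yourself. This is only a sketch: it assumes that categorical accepts the Arrow.DictEncoded column like any other AbstractVector, and it writes to a temporary file (path is a name made up for the example).

using Arrow, CategoricalArrays, DataFrames

# Small frame with one categorical column, as in the session above
df = DataFrame(a = 1:4, c = categorical(["x", "x", "y", "y"]))

# `path` is a hypothetical temporary location, not a path from the question
path = Arrow.write(tempname(), df)
df1 = DataFrame(Arrow.Table(path))

# Replace the dictionary-encoded column with a freshly built CategoricalArray
df1.c = categorical(df1.c)

typeof(df1.c)  # now a CategoricalArray type rather than Arrow.DictEncoded

The conversion copies the data out of the memory-mapped Arrow buffer, so any storage advantage of the Int8 refarray may be lost.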
Arrow.DictEncoded is more like a PooledArray than a CategoricalArray, but often the distinctions are not important. They can be important for ordered categorical arrays. I think it is still the case that the Arrow.Table function ignores whether DictEncoded arrays in the Arrow file have ordered categories.
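If the ordering matters, a workaround is to reapply the levels and the ordered flag after reading. Again a sketch under assumptions: the column name size, the level names, and the temporary path are invented for illustration, and you have to know the intended level order yourself, since I would not rely on it being recoverable from the file.

using Arrow, CategoricalArrays, DataFrames

# Hypothetical ordered levels, kept separately because they are not restored on read
lvls = ["low", "mid", "high"]
df = DataFrame(size = categorical(["low", "high", "mid", "low"]; levels = lvls, ordered = true))

path = Arrow.write(tempname(), df)   # hypothetical temporary file
df1 = DataFrame(Arrow.Table(path))

# Rebuild the column with the intended level order and mark it as ordered
restored = categorical(df1.size; levels = lvls, ordered = true)

isordered(restored)          # check that the ordered flag is set
restored[1] < restored[2]    # comparisons follow the level order, not alphabetical order

Passing levels explicitly avoids depending on whatever order the dictionary happens to have in the Arrow file.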