julia> @enum Fruit a b c
julia> Integer.([a,b,c])
3-element Vector{Int32}:
0
1
2
# when you read back from Arrow, re-make the enums
julia> Fruit.(Integer.([a,b,c]))
3-element Vector{Fruit}:
a::Fruit = 0
b::Fruit = 1
c::Fruit = 2
I understand that, but I don’t want to do that. I want to do Arrow.Table(io) and get my objects properly deserialized. Of course I can cast manually, but that’s not what I want.
How do you propose that Arrow.jl will know that the file contains integers that refer to a particular Enum and not just integers that refer to numbers?
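One way to picture an answer to that question is a type tag stored next to the integers, which the reader looks up when loading. The sketch below is plain Julia with no Arrow involved; `TaggedColumn`, `REGISTRY`, `encode`, and `decode` are hypothetical names made up for illustration, not part of Arrow.jl.

```julia
@enum Fruit a b c

# Hypothetical stand-in for a serialized column: plain integers plus a
# metadata tag naming the logical type they came from.
struct TaggedColumn
    typename::String
    values::Vector{Int32}
end

# Hypothetical registry the reader consults to map a tag back to a Julia type.
const REGISTRY = Dict{String,DataType}("Fruit" => Fruit)

# Writer side: store the enum's integer values and remember the type name.
encode(xs::Vector{T}) where {T<:Enum} =
    TaggedColumn(string(nameof(T)), Int32.(Integer.(xs)))

# Reader side: rebuild the enums if the tag is known, otherwise hand back
# the raw integers unchanged.
function decode(col::TaggedColumn)
    T = get(REGISTRY, col.typename, nothing)
    return T === nothing ? col.values : T.(col.values)
end

col = encode([a, b, c])
restored = decode(col)
```

Without the tag, the reader has no way to tell enum codes from ordinary numbers, which is the point of the question above.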
On the whole, the interface is not super intuitive. Among other things, the docs don’t say which types are valid return values for ArrowTypes.ArrowType. They mention "natively supported arrow types", but the linked Apache Arrow documentation doesn’t make it easy to find out what those are either.
They do say:
This stuff can definitely make your eyes glaze over if you stare at it long enough. As always, don’t hesitate to reach out for quick questions on the #data slack channel, or open a new issue detailing what you’re trying to do.
so it seems a good idea to actually open an issue to ask about the best way to serialize Julia Enums with Arrow (just in case the above is missing something), and to also ask for better clarification of this part of the documentation.
Thank you, this is exactly what I was looking for.
Yes, 100% agree. I found the docs quite dense. And it’s surprising that such a simple case as enums isn’t covered by an example, or even implemented by default.
Yeah, we could probably add default support for Enums, which would basically be the solution by @digital_carver, but for any Enum subtype. If someone is up for making a PR with a couple of tests, I’d appreciate it! Otherwise, if someone wants to open an issue, I can try to get to it soon.
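For anyone who wants this for a single enum in the meantime, here is a rough sketch of per-type overloads, assuming the `ArrowTypes` extension functions described in the Arrow.jl manual (`ArrowType`, `toarrow`, `arrowname`, `JuliaType`, `fromarrow`) and that the module is reachable as `Arrow.ArrowTypes`. The constant `FRUIT_NAME` and the tag `"JuliaLang.Fruit"` are my own choices, and this is not necessarily what a default implementation in the package would look like.

```julia
using Arrow
const ArrowTypes = Arrow.ArrowTypes  # assumption: reachable under this name

@enum Fruit a b c

# Store Fruit values as Int32, the default base type chosen by @enum.
ArrowTypes.ArrowType(::Type{Fruit}) = Int32
ArrowTypes.toarrow(x::Fruit) = Int32(x)

# Tag written into the column metadata so the reader can find the type again.
const FRUIT_NAME = Symbol("JuliaLang.Fruit")
ArrowTypes.arrowname(::Type{Fruit}) = FRUIT_NAME
ArrowTypes.JuliaType(::Val{FRUIT_NAME}) = Fruit

# Rebuild the enum from the stored integer on read.
ArrowTypes.fromarrow(::Type{Fruit}, x::Int32) = Fruit(x)

# Round trip through an in-memory buffer.
io = IOBuffer()
Arrow.write(io, (fruit = [a, b, c],))
seekstart(io)
tbl = Arrow.Table(io)
fruits = collect(tbl.fruit)
```

The overloads have to be defined in the reading session too, before `Arrow.Table(io)` is called. Otherwise there is nothing to map the tag back to `Fruit`, and I would expect the column to come back as plain integers.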