So I have a Python script that produces a vector of float values. I want to save this vector and then open it in Julia.
Is there a serialization format (for float vectors) that is compatible with both Julia and Python, so that I could do:
my_vector = [1.0, 2.0, 3.0]
my_vector = someserializer_deserialize("my_file.pjls")
Yes, at least the Arrow format, JSON and HDF5. JLD, the Julia serialization format, is based on HDF5, but I’m not sure that makes HDF5 ideal. JSON is very common (and Amazon Ion, a superset of it, seems nice), but it is text-only and thus slower. Ion can be binary (or text), but it hasn’t been implemented for Julia as far as I know.
Is there a better alternative than Arrow? It has excellent Julia support.
Using JSON seems to work, thanks a lot
Yes, JSON is enough for a vector of values, i.e. arrays from Python, which are only 1D.
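For the Python side, a minimal sketch of the JSON route using only the standard library could look like the following. The helper names `save_vector` and `load_vector` and the file name `my_vector.json` are made up for illustration; they are not from any library.

```python
import json
import os
import tempfile


def save_vector(values, path):
    """Write a flat sequence of numbers to `path` as a JSON array of floats."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump([float(v) for v in values], f)


def load_vector(path):
    """Read a JSON array back into a list of Python floats."""
    with open(path, "r", encoding="utf-8") as f:
        return [float(v) for v in json.load(f)]


my_vector = [1.0, 2.0, 3.0]

# Hypothetical file location, chosen only so the example runs anywhere.
path = os.path.join(tempfile.gettempdir(), "my_vector.json")

save_vector(my_vector, path)
restored = load_vector(path)
```

On the Julia side the same file should be readable with a JSON package; with JSON.jl that would presumably be something like `Float64.(JSON.parsefile("my_vector.json"))`, though I have not tested that here. One thing to watch: by default Python's `json` module writes `NaN` and `Infinity` literals, which are not valid JSON, so vectors containing those values may not parse elsewhere.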
If you need 2D (or higher-dimensional) arrays, i.e. a matrix (or a tensor), then native Python doesn’t support that, so possibly that’s not what you’re looking for.
What you then do in Python (and e.g. Java) to emulate that is use jagged arrays, i.e. arrays of arrays. You can probably serialize that (but it’s not the best format for matrices, and I’ve not looked into deserializing it in Julia). What you would more likely do is use Python’s NumPy and serialize the array from there.
I haven’t looked into whether that can be usefully deserialized in Julia either. HDF5 was made for matrices and scientific data, and Arrow supports them too. If you need sparse matrices, at least Arrow has that, and JSON might be a headache.
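To make the jagged-array (list of lists) route mentioned above concrete, here is a rough sketch using only Python's standard library. Storing an explicit `shape` next to the rows is my own convention, not a standard, and `matrix_to_json` / `matrix_from_json` are hypothetical helper names. The shape check is there because nothing in a list of lists guarantees that the rows have equal length.

```python
import json


def matrix_to_json(rows):
    """Serialize a rectangular list of lists, recording its shape explicitly."""
    n_cols = len(rows[0]) if rows else 0
    if any(len(r) != n_cols for r in rows):
        raise ValueError("rows have different lengths; not a matrix")
    return json.dumps({"shape": [len(rows), n_cols], "rows": rows})


def matrix_from_json(text):
    """Parse the document written by matrix_to_json and re-check its shape."""
    doc = json.loads(text)
    n_rows, n_cols = doc["shape"]
    rows = doc["rows"]
    if len(rows) != n_rows or any(len(r) != n_cols for r in rows):
        raise ValueError("stored shape does not match the data")
    return rows


matrix = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
text = matrix_to_json(matrix)
restored = matrix_from_json(text)
```

Whatever reads this in Julia would get a vector of row vectors and would still need to assemble them into a `Matrix`, which is part of why a format with real multidimensional support (HDF5, Arrow) is likely the better choice for anything beyond small data.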
If you simply want to get data across, I would suggest calling from Python to Julia (or vice versa) and simply sending the data over. You can do that with PythonCall.jl (both directions), or pyjulia/PyCall.jl (which support NumPy; if I recall correctly, the newer PythonCall does too).