Reading a Python data structure (from a text file) in Julia

Hi

I have a file with a nested data structure, coming from a Python tool, that I would like to read in Julia. It’s basically a list in which each item is a dict, which in turn contains key/value pairs of a variable name and a list of floating-point numbers (real and complex). Example:

[
{
"origin": "Jupiter",
"data_x": [1.0, 2.0, 3.0],
"data_phi": [2+3j, 3-1j, 1.2-3.4j]
}
,
{ similar thing}
,
{and final thing}
]

The lists are not all of the same length (some data vectors are a single value but still in list notation, e.g. [3.4]). Some items in the main list will have more or fewer key/value pairs in the dict.

First there is the different notation for complex numbers, but that isn’t too much of a problem. I can do some vim magic on the file (the conversion is a one-off thing, so some manual labour is not too much of an issue).

The major problem seems to be that JSON doesn’t know how to handle complex numbers.

What would be the easiest way to read in such a file and have the content available in a variable (the equivalent of mydata=eval(datastring) in Python)? And what do I need to modify in the text file to make it readable by Julia?

Thanks
Gert

Read the file as text. Use a regex to replace the complex numbers with an appropriate JSON object; you can do this in Julia, so there is no need to do it manually in a text editor. Parse the result with JSON3 into structs that match the structure of the data. You then need to convert those (arrays of) JSON objects to Complex instances. You can probably do this with a constructor for your struct that JSON3 can invoke.
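A minimal sketch of that regex idea, with two simplifications that are my own assumptions rather than part of the suggestion: it uses JSON.jl instead of JSON3 (so plain Dicts rather than structs), and it wraps each complex literal in quotes instead of turning it into a JSON object. The regex only covers literals of the form a+bj with digits before any decimal point, and it would also hit such a pattern inside a string value, so check it against the real file first.

```julia
using JSON  # assumes the JSON.jl package is installed

raw = """
[
  {
    "origin": "Jupiter",
    "data_x": [1.0, 2.0, 3.0],
    "data_phi": [2+3j, 3-1j, 1.2-3.4j]
  }
]
"""

# One real literal: digits, optional fraction, optional exponent.
const NUM = raw"\d+(?:\.\d*)?(?:[eE][-+]?\d+)?"
# A Python-style complex literal such as 2+3j or 1.2-3.4j.
const COMPLEX_RE = Regex("[-+]?$(NUM)[-+]$(NUM)j")

# Wrap every complex literal in double quotes so the text becomes valid JSON.
json_text = replace(raw, COMPLEX_RE => m -> "\"$m\"")

data = JSON.parse(json_text)

# Turn the quoted literals back into complex numbers.
# Base's parse(ComplexF64, ...) accepts the "j" suffix as well as "im".
to_complex(v) = ComplexF64[parse(ComplexF64, s) for s in v]

phi = to_complex(data[1]["data_phi"])
```

For the real file, replace the inline string with something like read("mydata.txt", String), where the file name is only a placeholder.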

Yes, unfortunately your Python tool is using a nonstandard extension here (the JSON standard doesn’t include a complex-number format).

One option, if you can modify the Python tool, would be to store the complex numbers in a more standard format, e.g. as strings (which can then be parsed in Julia), or as separate real/imaginary arrays.
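To illustrate how little is left to do on the Julia side in that case, here is a sketch that assumes the Python tool was changed to write separate real and imaginary arrays. The key names data_phi_re and data_phi_im are hypothetical, and JSON.jl is assumed to be installed.

```julia
using JSON  # assumes the JSON.jl package is installed

# Hypothetical output of a modified Python tool: plain, standard JSON.
json_text = """
[{"origin": "Jupiter",
  "data_x": [1.0, 2.0, 3.0],
  "data_phi_re": [2.0, 3.0, 1.2],
  "data_phi_im": [3.0, -1.0, -3.4]}]
"""

item = JSON.parse(json_text)[1]

# Combine the two real arrays elementwise into complex numbers.
phi = complex.(item["data_phi_re"], item["data_phi_im"])
```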

Another option is to use PyCall or PythonCall to call Python from Julia and use that to read the file, at which point it is easy to convert it to Julia-native data types as desired.
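A rough sketch of the PyCall route, assuming PyCall.jl is installed with a working Python behind it, and that the file uses plain ASCII quotes. It relies on Python's ast.literal_eval, which understands complex literals such as 2+3j, and on PyCall's automatic conversion of the result. I have not checked the exact container types PyCall returns here, so the last line converts explicitly.

```julia
using PyCall  # assumes PyCall.jl with a working Python installation

# Same example as in the question, written with plain ASCII quotes.
text = """
[{"origin": "Jupiter",
  "data_x": [1.0, 2.0, 3.0],
  "data_phi": [2+3j, 3-1j, 1.2-3.4j]}]
"""

ast = pyimport("ast")

# literal_eval only evaluates Python literals, which makes it a safer
# choice than Python's eval for text read from a file.
data = ast.literal_eval(text)

# Make the element type explicit on the Julia side.
phi = ComplexF64.(data[1]["data_phi"])
```

For the real file, text would come from read("mydata.txt", String), where the file name is a placeholder.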

A third option is to hack your own parser (e.g. using regexes as suggested above), but this gets complicated quickly if your datafile is complicated.

A fourth option is to patch one of the Julia JSON parsers to read complex numbers.

I suppose if your vim magic can find all those arrays with ill-formatted numbers and encode each one as a string, then you don’t have to mess around with the internals of a JSON parser.

julia> raw_string = """
       [
       {
       "origin": "Jupiter",
       "data_x": [1.0, 2.0, 3.0],
       "data_phi": "[2+3j, 3-1j, 1.2-3.4j]"
       }]
       """
"[\n{\n\"origin\": \"Jupiter\",\n\"data_x\": [1.0, 2.0, 3.0],\n\"data_phi\": \"[2+3j, 3-1j, 1.2-3.4j]\"\n}]\n"

julia> using JSON

julia> d = JSON.Parser.parse(raw_string)
1-element Vector{Any}:
 Dict{String, Any}("data_x" => Any[1.0, 2.0, 3.0], "data_phi" => "[2+3j, 3-1j, 1.2-3.4j]", "origin" => "Jupiter")

julia> const j = im
im

julia> first(d)["data_phi"] |> x->eval(Meta.parse(x))
3-element Vector{ComplexF64}:
 2.0 + 3.0im
 3.0 - 1.0im
 1.2 - 3.4im
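If you would rather not define a global j and eval text that came from a file, the same string can be converted with Base alone, since parse(ComplexF64, ...) accepts the j suffix. A small sketch, assuming each string has the form "[a+bj, c+dj, ...]" with at least one entry:

```julia
s = "[2+3j, 3-1j, 1.2-3.4j]"

# Drop the surrounding brackets, split on commas, parse each piece.
inner = strip(s, ['[', ']'])
phi = [parse(ComplexF64, strip(t)) for t in split(inner, ',')]
```

An empty list "[]" would need a separate check, because parse throws on an empty string.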