Here is a Lisp-y (or Scheme-y, if you prefer) way to deal with deeply nested dicts in Julia. Don’t ask why, but I have an application with deeply nested data structures. I serialize them to and from YAML, which, I’m told, isn’t a markup language. I like to use symbols as keys in Julia, but YAML doesn’t like map keys that begin with colons (or anything else for that matter).
The YAML package in Julia makes it easy to set the key type on loading, but struggles with writing deeply nested structures, especially when the value is a struct. So, I found myself having to convert the key type of dicts back and forth between strings and symbols for serializing/de-serializing. I was going slightly crazy writing loops for the various data structures, trying to use semantic naming for the keys and values instead of k,v. And when I changed something I had to delve into my mess. Each data structure had its own load and write function, which meant a lot of messy case-specific code that was horrible to maintain.
Well, this summer I went through much of Paul Graham’s ANSI Common Lisp book and Brian Harvey’s Simply Scheme book to get my head around the languages and decide whether I was a CL or a Scheme guy. I’m neither. There are good things to learn from both, and performance is surprisingly good, especially compared to Python. But neither can touch Julia for numerical work.
That exercise made me realize that I could solve my nested data structure problem without worrying about the exact structure of my messes: it doesn’t matter what the semantics of the keys are; it only matters whether a key’s value is another dict. And it doesn’t matter how deep or ragged the mess is. (I still have to deal with the structs and values that are “flow” vectors… but this doesn’t matter for switching keys between strings and symbols.)
So, here is the code:
function dict_key_to_symbol(d)
    Dict(Symbol(k) =>
             (!(typeof(v) <: AbstractDict) ? v : dict_key_to_symbol(v))
         for (k, v) in d)
end

function dict_key_to_symbol_v2(d)
    Dict(Symbol(k) =>
             if !(typeof(v) <: AbstractDict)
                 v
             else
                 dict_key_to_symbol_v2(v)
             end
         for (k, v) in d)
end

function dict_key_to_string_v2(d)
    Dict(string(k) =>
             if !(typeof(v) <: AbstractDict)
                 v
             else
                 dict_key_to_string_v2(v)
             end
         for (k, v) in d)
end
I don’t think we have car and cdr to walk through each dict’s keys, so we need some kind of looping rather than 2 levels of recursive calls (1 to go ‘across’ the keys at the top level of each dict; another to ‘go down’ the nested dicts). The dict comprehension version with the ternary operator is great for people who like short code: you can make it one line if you like. I think the version that uses if/else as the value expression is much easier to understand.
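Since the symbol and string versions differ only in the function applied to each key, the two can be folded into one. The following is a minimal sketch under that assumption; the name `convert_keys` is made up for illustration and is not part of Julia or the YAML package.

```julia
# Hypothetical helper (the name is illustrative, not from any package):
# apply `f` to every key, recursing into any value that is itself a dict.
convert_keys(f, d::AbstractDict) =
    Dict(f(k) => (v isa AbstractDict ? convert_keys(f, v) : v) for (k, v) in d)

# Small ragged example: one nested dict, one plain value.
nested = Dict("a" => Dict("b" => 1), "c" => 2.0)

symbolized = convert_keys(Symbol, nested)     # String keys -> Symbol keys
restored   = convert_keys(string, symbolized) # Symbol keys -> String keys
```

Passing `Symbol` or `string` as the first argument reproduces the behavior of the two hand-written functions, and `v isa AbstractDict` is just a shorter spelling of the `typeof(v) <: AbstractDict` test.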
Here is a test case:
# a somewhat deeply nested, ragged dict
d3lvl = Dict("l1_a" =>
                 Dict("l2_a" =>
                          Dict("l3_a" => 5)),
             "l1_b" => 4.0,
             "l1_c" =>
                 Dict("l2_a" =>
                          Dict("l3_a" => 5.0, "l3_b" => 7.0)))
Here is running it and round-tripping:
julia> dict_key_to_symbol_v2(d3lvl)
Dict{Symbol, Any} with 3 entries:
:l1_b => 4.0
:l1_a => Dict(:l2_a=>Dict(:l3_a=>5))
:l1_c => Dict(:l2_a=>Dict(:l3_b=>7.0, :l3_a=>5.0))
julia> dict_key_to_string_v2(dict_key_to_symbol_v2(d3lvl))
Dict{String, Any} with 3 entries:
"l1_a" => Dict("l2_a"=>Dict("l3_a"=>5))
"l1_c" => Dict("l2_a"=>Dict("l3_a"=>5.0, "l3_b"=>7.0))
"l1_b" => 4.0
This was fun and useful.
Anyone want to try replacing the loop across keys with recursion? I thought about it but didn’t really try. Somehow we need an iterator that provides the next key and doesn’t blow up at the end, instead returning something like the null that Lisp or Scheme would return. We could use get(dict, ???, nothing) and test for nothing, but how would you do “get-next”?
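One possible answer, offered as a sketch rather than the definitive one: Julia’s iteration protocol already behaves like this. `iterate(d)` returns a `(pair, state)` tuple, `iterate(d, state)` returns the next one, and both return `nothing` once the dict is exhausted, which can stand in for the empty list. The function name `symbolize_rec` and its extra arguments are invented here for illustration.

```julia
# Recursion "across" and "down" using Base.iterate instead of a loop.
# `next` is either `nothing` (no entries left) or ((key => value), state).
# `acc` collects the converted entries as the recursion proceeds.
function symbolize_rec(d::AbstractDict, next = iterate(d), acc = Dict{Symbol,Any}())
    next === nothing && return acc        # base case: ran off the end of the dict
    (k, v), state = next                  # roughly "car": the current key/value pair
    # go down: recurse into nested dicts, keep other values unchanged
    acc[Symbol(k)] = v isa AbstractDict ? symbolize_rec(v) : v
    # go across: roughly "cdr", continue from the iteration state
    return symbolize_rec(d, iterate(d, state), acc)
end

example = Dict("a" => Dict("b" => 1), "c" => 2.0)
result = symbolize_rec(example)
```

Julia does not guarantee tail-call elimination, so going across a very wide dict this way can exhaust the stack; the comprehension version is probably still the practical choice.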