How to retrieve the edge list from a Dict reprepresenting a tree data structure

I’m reading a tree data structure fron a json file, and I need to retrieve the vector of
edge tuples, (parent, child), to define a LightGraphs.SimpleDiGraph and get the node positions via NetworkLayout.Buchheim layout

An old Python function, that still works and I’d like to convert to a Julia one, is defined by:

import itertools
def edgelist(d, counter = next(itertools.count())):
    #d is the dictionary read from  the jsonfile
   
    d["name"] = counter() if d["name"] is None else d["name"]
   
    if "children" in d:
        for child in d["children"]:      
            if child:
                child["name"] = counter() if child["name"] is None else child["name"]
                yield (d["name"], child["name"])
                
        for child in d["children"]:
            if child:
                for node in edgelist(child, counter):
                    yield node 

Since there isn’t an equivalent for next to get the next element in Julia Iterators.countfrom(), only using for..., as well as no yield correspondent, I cannot figure out how could I get the tree edges in a Julian way.

LE. An example of tree_dict:

dtree = Dict("name"=> "A",
        "children"=>[Dict("name"=>"B",
                     "children"=> [Dict("name"=>"B1", 
                                   "children"=>[Dict("name"=> "B11",
                                                "children"=> [Dict("name"=> "B111")]),
                                               Dict("name"=>"B12"), 
                                               Dict("name"=>"B13")]),
                                  Dict("name"=> "B2",
                                   "children"=> [Dict("name"=>"B21"),
                                                Dict("name"=>"B22")])]
                    ),
                    Dict("name"=>"C",
                     "children"=> [Dict("name"=>"C1", 
                                   "children"=>[Dict("name"=> "C11"), 
                                               Dict("name"=>"C12", "children"=> [Dict("name"=> "C121")]), 
                                               Dict("name"=>"C13")]),
                                  Dict("name"=> "C2",
                                   "children"=> [Dict("name"=>"C21"),
                                                Dict("name"=>"C22")])]
                    )
                 ]);

As far as I understand, the Python version in essence defines the iteration over the edges.
If you’re OK with eager evaluation, just define edges = [] at the top of the function and replace yield x by push!(edges, x).

For a replacement of itertools.count, you may create a custom callable struct:

mutable struct Counter{T<:Integer}
    start::T
end

Counter() = Counter(0)

function (cnt::Counter)()
    current = cnt.start
    cnt.start += 1
    return current
end

An unreadable one-liner:

edgelist(d) = vcat(([(d["name"], child["name"]); edgelist(child)...] for child in get(d, "children", ()))...)

More readable:

function edgelist(d; list=NTuple{2,String}[])
    for child in get(d, "children", ())
        push!(list, (d["name"], child["name"]))
        edgelist(child; list)
    end
    return list
end

(you might want to add checks for edge cases as in the Python version)

3 Likes

@sudete Thank you very much!! I like this one :slight_smile:

edgelist(d) = vcat(([(d["name"], child["name"]); edgelist(child)...] for child in get(d, "children", ()))...)

Here are some nicer ones I think :slight_smile:

An iterator over the edges (call collect on the result to get a vector):

using .Iterators: flatten

edges(d) = flatten(flatten([((d["name"], c["name"]),), edges(c)]) for c in get(d, "children", ()))

Splitting the problem in independent parts:

children(n) = get(n, "children", Dict{String,Any}())

# An iterator over the nodes (each node is a Dict)
nodes(d) = flatten([(d,), flatten(nodes(c) for c in children(d))])

# Get the edges of each node and concatenate them
mapreduce(n->[(n["name"], c["name"]) for c in children(n)], vcat, nodes(dtree))

I really appreciated your first solution and smiled because I was excited it was a Julian one. So there was no need for a detailed version. But thank you for it.

Ah note that the detailed solution was not meant as an explanation, I wanted to show it because I think it’s the best solution: more readable, and more useful since the children and nodes functions can also be used to solve other tasks.

1 Like