Why does recursive dereferencing occur during printing of a tree?

Hi!
With my bidirectional tree (see below) I fell into the “circular reference” trap which makes me wonder why Julia (or something inside Julia?) shows this behavior in the first place.
Coming from C I would expect the visualization of “parent” and “children” fields to be something like “ref to TreeNode” (just like “#undef”) instead of the recursive dereferencing that actually occurs.
To me it looks like an attempt of the designers of the language to be nice to me and show me what kind of data I will find when I traverse all the way through to the bottom of my reference chain. But actually it prevents the printout from working for sufficiently large datasets like in my case (not the example below).
Can anyone explain what the rational is behind that?
And is there a way to influence this behavior?
I read about overloading the different print and output functions but to me this is kind of the wrong end to start at. Besides my short try to do that failed anyway because I don’t know enough about the internal mechanics of printing so far and don’t know what to do exactly. (And to be honest: I don’t want to be forced to…)

mutable struct TreeNode
    weight::Integer
    parent::TreeNode
    children::Vector{TreeNode}
    TreeNode(w) = new(w)
    TreeNode() = new()
end

tree = Dict{String,TreeNode}()

tree["test"] = TreeNode()
tree["test2"] = TreeNode(20)
tree["test3"] = TreeNode(30)

tree["test"].weight = 10
tree["test2"].parent = tree["test"]
tree["test"].children = [tree["test2"]]
tree["test3"].parent = tree["test2"]

display(tree)
julia> display(tree)
Dict{String, TreeNode} with 3 entries:
  "test2" => TreeNode(20, TreeNode(10, #undef, TreeNode[TreeNode(#= circular reference @-3 =#)]), #undef)
  "test"  => TreeNode(10, #undef, TreeNode[TreeNode(20, TreeNode(#= circular reference @-3 =#), #undef)])
  "test3" => TreeNode(30, TreeNode(20, TreeNode(10, #undef, TreeNode[TreeNode(#= circular reference @-3 =#)]), #undef),…

This seems pretty straightfoward to me, and I don’t really understand what you mean by “recursive dreferencing”. Could you write out an example of how this output would look as you expected?

OK, what I would like to have is something like:

"test"  => TreeNode(10, #undef, TreeNode[#refto(TreeNode)])
"test3" => TreeNode(30, #refto(TreeNode), #undef)

That should result in a representation of only the object I’m referring to without showing all the subsequently referenced objects.
I made something up to get my idea across.
In C this would be the address to which pointer “parent” points to.
an addional difficulty now becomes that one would like to know which #refto it is so some king oh UID for each #refto would be needed…
To be clear: As long as there are no circular references everything is fine as it is.
And the problem also only occurs with large amounts of data to be printed at once.

The reason the entire object and all its children is shown is that there is (semantically) no reference to stop at, and the default printing methods this all falls back on have no concept of stopping there.

Put another way - how the objects are stored in the children Vector has no bearing on whether or not they are shown by default. The default is just showing the elements of the vector, which then tries to show the children, which shows the TreeNode again and so on, until you reach a leaf in your tree that doesn’t have any children anymore.

Put yet another way - if you’d expect a pointer-like printing, you’d have to use a a MyRef{Vector{TreeNode}} that references that vector without printing it out when shown. Something like this:

julia> struct MyRef{T}
           v::T
       end

julia> Base.show(io::IO, ::MIME"text/plain", mr::MyRef{T}) where T = print(io, "#refto(", T, ")")

julia> MyRef(Int[])
#refto(Vector{Int64})

julia> MyRef(Int[1,2,3])
#refto(Vector{Int64})

In Julia, there is a very clear difference between an object and referencing that object - the concept of “dereferencing” doesn’t really exist like it does in C on the semantic level; you either have direct access to an object, or you don’t.

5 Likes

Thanks a lot!

Worth knowing that you can define custom pretty printing to get a short form for TreeNodes when they’re nested in a larger data structure. Something like this:

function Base.show(io::IO, tree::TreeNode)
    if get(io, :compact, false)
        print(io, "TreeNode(", tree.weight, ")")
    else
        invoke(Base.show, Tuple{IO,Any}, io, tree)
    end
end

Gives this:

julia> tree["test2"]
TreeNode(20, TreeNode(10, #undef, TreeNode[TreeNode(#= circular reference @-3 =#)]), #undef)

julia> tree
Dict{String, TreeNode} with 3 entries:
  "test2" => TreeNode(20)
  "test"  => TreeNode(10)
  "test3" => TreeNode(30)

This falls back to the default printer when the IOContext isn’t :compact, which is what you want, for reasons the docs explain. If you want special printing at the REPL for a full TreeNode, define the three argument form Base.show(io::IO, ::MIME"text/plain", tree::TreeNode).

Thanks!
That’s closer to what I imagined. My hope was to get around this glue code but I now understand that this is not how dynamic languages work in general.
I still have to get my mind around the differences between Julia being a dynamic language and static languages I was raised with decades ago. :wink:

1 Like

In this case it’s not exactly about dynamic vs static, rather languages starting to abstract pointers away. There are definitely pointers involved in the implementation, for any language (which is likely written in C at some point), and we’re referring to those when we talk of “dereferencing.” But on the language level, we’re just indexing A[...] or getting properties B.blah, and neither need to involve dereferencing at all.

2 Likes

This is not related to how dynamic or static a language is at all though :thinking: That’s just the way default printing is defined, nothing more. The printing is recursive by default.

I think the difference is that dynamic languages tend to have a default printing mechanism and a way to override it. Static languages often don’t have the technical capability of providing generic default printing for all types and if they do it’s often quite limited and unlikely to be recursive.

2 Likes

Sure, but that’s not inherent to static or dynamic typing. You can in principle define recursive printing through the use of reflection in static languages too, as long as you have a closed world of types.

You’re welcome! I’d encourage you to think of it as customization, rather than glue code. Julia’s type system lets it provide a sensible generic fallback for things like reprs/printing, while the multiple dispatch on methods makes it easy to add a method to show which will do whatever you’d like it to.

Julia’s dynamic nature is very much a part of why there’s usually no need for a hard distinction between a value and a pointer to that value. There is a Ref type for when you really need it, but generally, Julia’s compiler is free to decide when it puts structs on the heap (such that they’re passed by reference) or if they’re immutable, it can leave them on the stack and pass by value. It’s nice :slight_smile: .

1 Like

That presumes a lot of stuff that isn’t that common in mainstream static languages. You can’t define default printing in C or C++ for example. Even in the academic static languages, you can in some but not most. In OCaml you can’t define generic printing; in Haskell you can do it but it requires the relatively advanced feature of type classes.

2 Likes