Making sorting a tree represented by vector of any types faster

dgsob · December 1, 2023, 3:27pm

I have an algorithm that generates a tree by multiplying two functions. It works, but my goal is to make it faster. With profiling I’ve found that a substantial amount of time is spend on sorting.

I’m wondering if the sorting would be faster if trees were represented by a data structure different than vectors of any types, and if so, how to represent a tree as such structure (I tried but failed so far and don’t even know if it’s worth the hassle ).

A part of the code that appears to be taking the most time (especially for larger inputs) with necessary minimal working additions and example input is as follows:

f1 = Vector{Vector{Any}}[[[0, Any[0, Any[0, [0, 0]]]]], [[100800, Any[0, Any[0, [0, 0]]]]], [[201600, Any[0, Any[0, [0, 0]]]]], [[302400, Any[0, Any[0, [0, 0]]]]], [[403200, Any[0, Any[0, [0, 0]]]]], [[504000, Any[0, Any[0, [0, 0]]]]], [[604800, Any[0, Any[0, [0, 0]]]]], [[705600, Any[0, Any[0, [0, 0]]]]], [[806400, Any[0, Any[0, [0, 0]]]]], [[907200, Any[0, Any[0, [0, 0]]]]], [[1008000, Any[0, Any[0, [0, 0]]]]], [[1108800, Any[0, Any[0, [0, 0]]]]], [[1209600, Any[0, Any[0, [0, 0]]]]], [[1310400, Any[0, Any[0, [0, 0]]]]], [[0, Any[1400700, Any[0, [0, 0]]]]]]
f2 = [0, 1201100]
M = 1402700

function is_leaf(t)
    return length(t) == 1
end

function sum_tree(t)
    return is_leaf(t) ? sum(t) : t[1] + sum_tree(t[2])
end

function multiply_gf(f1, f2)
    r = collect((vcat(j,i) for i in f1 for j in f2 if sum_tree(vcat(j,i)) <= M))
    sort!(r, by = sum_tree)
    println(r)
end

multiply_gf(f1, f2)

As you can see, f1 is Vector{Vector{Any}} just like the result r.

greatpet · December 7, 2023, 11:04pm

A tree is a recursive data structure. The Any type can handle recursive data types since it can handle any type, by definition, but at a performance cost. A type-stable definition could be:

struct BinaryTree{T}
    value::T
    left::Union{BinaryTree{T}, Nothing}
    right::Union{BinaryTree{T}, Nothing}
end

The left branch and right branch are allowed to be nothing if you’re already at the leaf.

Topic		Replies	Views
Efficiently traverse binary tree represented as tuple of tuples Performance memory-allocation , benchmark	9	1921	August 7, 2019
How to unroll tree traversal efficiently? General Usage metaprogramming , tree	1	1040	May 8, 2019
Data type for a “stupid” binary search tree Performance tree	4	675	July 13, 2021
How to sort a vector by an expensive function in place efficiently? Performance sorting	14	1717	March 8, 2022
A tree sort algorithm in Julia New to Julia	6	783	May 17, 2018

Making sorting a tree represented by vector of any types faster

Related topics