Help with increasing performance in badly coded functions

Hi guys, I need a little help with optimising the performance of a few functions. If you need more detailed access to my project, here's a notebook (Largrids.jl/largrid.ipynb at main · adelloste/Largrids.jl · GitHub) that contains all the functions, with inputs ready to be benchmarked with the @btime macro. Here they are.

cart_opt computes the Cartesian product of a certain number of arguments:

function cart_opt(args)::Vector{Tuple}
    return sort(vcat(collect(Iterators.product(args...))...))
end

This is the output:

cart_opt([[1,2],["a"],[3,4]])
4-element Vector{Tuple}:
 (1, "a", 3)
 (1, "a", 4)
 (2, "a", 3)
 (2, "a", 4)

I'm not quite sure what this one (index2addr_opt) does, but if you have any suggestions I'll happily take them.

using LinearAlgebra # for dot

@inline function index2addr_opt( shape::Vector{Int64} )
    n = length(shape)
    theShape = push!(shape[2:end],1)
    weights = [prod(theShape[k:end]) for k in (1:n)]

    function index2addr0_opt( multiIndex::Vector{Int} )::Int
        return dot(collect(multiIndex), weights) + 1
    end

    return index2addr0_opt
end

Here’s the output:

index2addr_opt([2,7])([1,4])

12
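For what it's worth, index2addr_opt looks like a standard row-major multi-index to linear-address converter: it precomputes stride weights from the shape, then dots them with a zero-based multi-index and adds 1 for Julia's one-based indexing. A minimal standalone sketch (the name `index2addr` is made up here, not from the package):

```julia
using LinearAlgebra  # for dot

# Row-major addressing: for shape [2, 7] the stride weights are [7, 1],
# so a zero-based multi-index [i, j] maps to i*7 + j + 1.
index2addr(shape) = multiIndex -> begin
    weights = [prod(shape[k+1:end]) for k in 1:length(shape)]
    dot(multiIndex, weights) + 1
end

index2addr([2, 7])([1, 4])  # 1*7 + 4*1 + 1 == 12
```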

My project is about solid modeling. larCellProd_opt is a bit of a mess; it computes the product between two larCells:

@inline function larCellProd_opt(cellLists::Vector{Lar.Cells})::Lar.Cells
    shapes = [length(item) for item in cellLists]
    subscripts = cart_opt([collect(range(0, length=shape)) for shape in shapes])
    indices = [collect(tuple) .+ 1 for tuple in subscripts]
 
    jointCells = [cart_opt([cells[k] for (k,cells) in zip(index,cellLists)]) for index in indices]
    convertIt = index2addr_opt([ (length(cellLists[k][1]) > 1) ? shape .+ 1 : shape for (k,shape) in enumerate(shapes) ])
    [map(convertIt, map(collect,jointCells[j])) for j in 1:length(jointCells)]
end

Here’s the output:

c1 = [[0,1],[1,2],[2,3]]
c0 = [[0],[1],[2]]
@btime larCellProd_opt([c1, c1, c0])

27-element Vector{Vector{Int64}}:
 [1, 4, 13, 16]
 [2, 5, 14, 17]
 [3, 6, 15, 18]
 [4, 7, 16, 19]
 [5, 8, 17, 20]
 [6, 9, 18, 21]
 [7, 10, 19, 22]
 [8, 11, 20, 23]
 [9, 12, 21, 24]
 [13, 16, 25, 28]
 [14, 17, 26, 29]
 [15, 18, 27, 30]
 [16, 19, 28, 31]
 ⋮
 [19, 22, 31, 34]
 [20, 23, 32, 35]
 [21, 24, 33, 36]
 [25, 28, 37, 40]
 [26, 29, 38, 41]
 [27, 30, 39, 42]
 [28, 31, 40, 43]
 [29, 32, 41, 44]
 [30, 33, 42, 45]
 [31, 34, 43, 46]
 [32, 35, 44, 47]
 [33, 36, 45, 48]

The last function I need help with is createCells; I've been told that push! isn't great for performance.

@inline function createCells(V,cells1,W,cells2,vertices)
    cells = []
    for c1 in cells1
        for c2 in cells2
            cell = []
            @inbounds for vc in c1
                @inbounds @simd for wc in c2
                    push!(cell, vertices[[V[:,vc];W[:,wc]]] )
                end
            end
            push!(cells, cell)
        end
    end
    return cells
end

In general, it isn't worth parallelizing bad code. Adding threads will give you at most a 100x speedup (on a fairly large server), and it's usually harder than making bad code 100x faster single-threaded. Step 1 is removing unnecessary allocations.

Splatting is bad and collect is bad. But first of all, get rid of the output type annotation: you are forcing your vector to have an abstract eltype.

Annotating functions with output types is normally not a good idea.
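To make this concrete, here is one sketch of how cart_opt could drop the vcat splat and the return-type annotation (the name `cart2` is made up). It still splats `args` into Iterators.product, but no longer splats the full product array through vcat, and the result keeps its concrete tuple eltype:

```julia
# Sketch: `vec` reshapes the N-dimensional product array into a Vector
# without splatting it through vcat; dropping the ::Vector{Tuple}
# annotation lets the result keep a concrete eltype.
cart2(args) = sort!(vec(collect(Iterators.product(args...))))

cart2([[1, 2], ["a"], [3, 4]])
# Same elements and order as cart_opt, but with eltype Tuple{Int64, String, Int64}.
```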


A good next step is to get rid of every single instance of collect in all your functions. See if anything breaks. Check for ways to fix it without using collect.

Also, cell = [] is a red flag for performance. Try to use typed containers, preferably by letting the compiler figure out the types. But a plain [] forces the eltype to be Any, which hurts performance.

BTW, what is Lar.Cells? It looks like the code is using regular Arrays of Arrays.
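As an illustration of the typed-container advice, createCells could be written with comprehensions so the compiler infers a concrete element type instead of Vector{Any}. This is a sketch under the assumption that `vertices` maps a concatenated coordinate column to an integer index, as the original code suggests (the name `createCells2` is made up):

```julia
# Comprehensions preserve the original loop order (c1 outer, c2 inner;
# vc outer, wc inner) while letting Julia infer a concrete eltype.
function createCells2(V, cells1, W, cells2, vertices)
    return [[vertices[[V[:, vc]; W[:, wc]]] for vc in c1 for wc in c2]
            for c1 in cells1 for c2 in cells2]
end
```

Note that push! itself is fine once the container is concretely typed (e.g. `Int[]` instead of `[]`); the real cost in the original is the `Any` eltype plus the repeated small allocations.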


Would you have any suggestions on how to improve the code?

Lar.Cells is a type coming from the LinearAlgebraicRepresentation.jl package. It's used to define cells in solid modeling. By the way, you were very helpful, thanks a lot; I will try.