Performance issue: Axelrod model of dissemination of culture optimization

Hello! I’m implementing an Agent Based Model (axelrod model of dissemination of culture). Every node has an associated vector of “opinions” (data) on different subjects (number of features is feat). Each step two nodes are chosen random (I get a random node and then a random neighbor using the neighs adjacency list which is a vector of vector where neighs[j] is the vector of all neighbors of node j.

The nodes are compared and a vector of Bool which is true when two nodes disagree on a subject is called disagreements and is created with the compare! function. Along with that feats - count(disagreements) is returned which is the number of overlapping opinions.

Two nodes interact if they have at least 1 and not all opinions in common with probability equal to the number of overlapping opinions. The interaction consist of setting equal one of the opinions where they differ.

Every nodes steps all pair of nodes are tested and the number of active links (that is, the links that have at least 1 but not all opinions in common) is returned.

function simulate_system!(
    data::Vector{Vector{T}},
    feats::Int,
    maxtime::Int,
    storage::Vector{T},
    edges::Vector{Tuple{U,U}},
    neighs::Vector{Vector{U}},
    nodes::Int
) where {T<:Integer,U<:Integer}
    disagreements::BitVector = BitVector(undef, feats)
    for t in 1:maxtime
        if t % nodes == 0
            num_active::Int = get_active(data, feats, edges)
            setindex!(storage, num_active, div(t, nodes))
            if num_active == 0
                println("Absorbing state reached in $t")
                break
            end
        end
        node_1 = rand(1:nodes)
        node_2 = rand(neighs[node_1])
        op_1 = data[node_1]
        op_2 = data[node_2]
        overlaps::Int = compare!(disagreements, op_1, op_2, feats)
        if overlaps != feats && overlaps != 0
            if rand() < overlaps / feats
                interact!(disagreements, op_1, op_2)
            end
        end
    end
end
Here I have all the functions used in the code if you want to test it out
function get_overlaps(data_1::Vector{T}, data_2::Vector{T}) where {T<:Integer}
    overlaps::Int = 0
    for (el_1, el_2) in zip(data_1, data_2)
        if el_1 == el_2
            overlaps += 1
        end
    end
    return overlaps
end

function get_active(
    data::Vector{Vector{T}},
    feats::Int,
    edges::Vector{Tuple{U,U}}
) where {T<:Integer,U<:Integer}
    num_active::Int = 0
    for (node_1, node_2) in edges
        overlaps::Int = get_overlaps(data[node_1], data[node_2])
        if overlaps != 0 && overlaps != feats
            num_active += 1
        end
    end
    return num_active
end


function compare!(
    diffs::Union{BitVector,Vector{Bool}},
    data_1::Vector{T},
    data_2::Vector{T},
    feats::Int
) where {T<:Integer}
    @inbounds @simd for idx in 1:feats
        diffs[idx] = data_1[idx] != data_2[idx]
    end
    return feats - count(diffs)
end

function interact!(
    diffs::Union{BitVector,Vector{Bool}},
    data_1::Vector{T},
    data_2::Vector{T}
) where {T<:Integer}
    idx_diff::Vector{Int} = findall(diffs)
    to_agree::Int = rand(idx_diff)
    data_1[to_agree] = data_2[to_agree]
end

Data is initialized as random with a Poisson distribution and a 150x150 grid. These are the parameters I used

using Graphs
using Graphs.SimpleGraphs: adj
using Distributions: Poisson

side::UInt32 = 150
graph = grid((side, side), periodic=false)
n_nodes::Int = nv(graph)
edgelist = Vector{Tuple{Int,Int}}(Tuple.(edges(graph)))
adjlist = Vector{Vector{Int}}(adj(graph))

features = 10
traits = 100
time_steps = 1_000 * n_nodes

opinions = [rand(Poisson(traits), features) for _ = 1:n_nodes]
storage_edges::Vector{Int} = zeros(Int, time_steps ÷ n_nodes)

simulate_system!(
    opinions,
    features,
    time_steps,
    storage_edges,
    edgelist,
    adjlist,
    n_nodes
)

While trying to profile using VS cdoe profiler I get the following:

  1. 35% in: overlaps::Int = compare!(disagreements, op_1, op_2, feats)
  2. 18% in: num_active::Int = get_active(data, feats, edges)
  3. 16% in: node_2 = rand(neighs[node_1])
  4. 14% in: interact!(disagreements, op_1, op_2)
  5. 8% in: op_2 = data[node_2]

The first four seem fine and work as expected, most of the pairs will have none or all opinions in common so I expect the compare! and the choice of the nodes to be the most impactful functions.

I’m more interested in the 8% due to the second call data[node_2] this is weird also because the op_1 = data[node_1] cannot even be measured by the profiler. Why is calling a view of a vector the second time slower?

Also if you see anything that could be optimized (especially in the functions) I would be extremely happy because the computation times are very high. Thanks in advance!

This seems like it might be an artefact of sample-based profiling. Does this behavior persist if you re-run the profiler a second time? How about when you profile 10 runs of the function?

1 Like

I’ve tried again putting the simulation in a for loop this way

@profview for _ in 1:10 simulate_system!(
    opinions,
    features,
    time_steps,
    storage_edges,
    edgelist,
    adjlist,
    n_nodes
) end

But I still get the same result with a varying percentage from 8% to 10%.
Same result also when rerunning

Can you post a screenshot of the flame graph? My second hypothesis would be that node_1 is type-inferred correctly but not node_2, which would lead to runtime dispatch during indexing (a red tile).

1 Like


the fifth from the left is the getindex

Ok that’s not it, and I’m out of hypotheses without taking a closer look at your code ^^

1 Like

Thanks in advance for anybody checking out!

A copy and paste of my code
using Graphs
using Graphs.SimpleGraphs: adj
using Distributions: Poisson

side = 150
graph = grid((side, side), periodic=false)

n_nodes::Int = nv(graph)
edgelist = Vector{Tuple{Int,Int}}(Tuple.(edges(graph)))
adjlist = Vector{Vector{Int}}(adj(graph));

## imports and Graphs ##

function get_overlaps(data_1::Vector{T}, data_2::Vector{T}) where {T<:Integer}
    overlaps::Int = 0
    for (el_1, el_2) in zip(data_1, data_2)
        if el_1 == el_2
            overlaps += 1
        end
    end
    return overlaps
end

function get_active(
    data::Vector{Vector{T}},
    feats::Int,
    edges::Vector{Tuple{U,U}}
) where {T<:Integer,U<:Integer}
    num_active::Int = 0
    for (node_1, node_2) in edges
        overlaps::Int = get_overlaps(data[node_1], data[node_2])
        if overlaps != 0 && overlaps != feats
            num_active += 1
        end
    end
    return num_active
end

function compare!(
    diffs::Union{BitVector,Vector{Bool}},
    data_1::Vector{T},
    data_2::Vector{T},
    feats::Int
) where {T<:Integer}
    @inbounds @simd for idx in 1:feats
        diffs[idx] = data_1[idx] != data_2[idx]
    end
    return feats - count(diffs)
end

function interact!(
    diffs::Union{BitVector,Vector{Bool}},
    data_1::Vector{T},
    data_2::Vector{T}
) where {T<:Integer}
    idx_diff::Vector{Int} = findall(diffs)
    to_agree::Int = rand(idx_diff)
    data_1[to_agree] = data_2[to_agree]
end

function simulate_system!(
    data::Vector{Vector{T}},
    feats::Int,
    maxtime::Int,
    storage::Vector{T},
    edges::Vector{Tuple{U,U}},
    neighs::Vector{Vector{U}},
    nodes::Int
) where {T<:Integer,U<:Integer}
    disagreements::BitVector = BitVector(undef, feats)
    for t in 1:maxtime
        if t % nodes == 0
            num_active = get_active(data, feats, edges)
            storage[div(t, nodes)] = num_active
            if num_active == 0
                println("Absorbing state reached in $t")
                break
            end
        end
        node_1 = rand(1:nodes)
        node_2 = rand(neighs[node_1])
        op_1 = data[node_1]
        op_2 = data[node_2]
        overlaps = compare!(disagreements, op_1, op_2, feats)
        if overlaps != feats && overlaps != 0
            if rand() < overlaps / feats
                interact!(disagreements, op_1, op_2)
            end
        end
    end
end

## functions ##

features = 10
traits = 100
time_steps = 1_000 * n_nodes

opinions::Vector{Vector{Int}} =
    [rand(Poisson(traits), features) for _ = 1:n_nodes]
storage_edges::Vector{Int} = zeros(Int, time_steps ÷ n_nodes);

## initialization ##

@profview simulate_system!(
    opinions,
    features,
    time_steps,
    storage_edges,
    edgelist,
    adjlist,
    n_nodes
)