Hi,
The following heavily simplified extract from my code shows a section that allocates memory at every loop iteration.
I think this should not happen, and I’d like advice and suggestions on how to optimize/correct it.
cache_unique_edges = array_cache(conn_unique_edges) # allocation done here to prevent further allocations later.
for iel = 1:nelem
    for e = 1:E
        ...
        for g = 1:G
            ...
            ai = getindex!(cache_unique_edges, conn_unique_edges, g)
            if (ai[2] == ai[1])
                # THIS if STATEMENT CAUSES OVER-ALLOCATION but I want to avoid it!
            end
        end
    end
end
When the if statement is commented out, the allocation and timing are: [ Info: 17.088092 seconds (37.64 k allocations: 2.114 MiB)
However, when the code executes if (ai[2] == ai[1]), the allocation and timing are: [ Info: 35.950601 seconds (676.71 M allocations: 10.085 GiB, 1.46% gc time)
NOTE on getindex! and array_cache: these are functions that use my own definition of tables. That said, exactly the same behavior is observed if I use Julia’s native Arrays.
Without knowing what ai ultimately is, it’s quite impossible to diagnose from afar. Do you have a minimal, self contained example people could run to debug on their machine?
Hi @Sukera, thanks for replying. I am extracting a working example for you to test. Because it is part of a larger code base that I am developing, you will need to run it from within its own --project=. and add a couple of libraries.
I hope that is OK. I’ll post a GitHub link shortly.
It is not. That said, reducing the code to a minimal working example for this forum seems to have helped find the culprit. More soon. Still assessing this statement.
I don’t know where array_cache comes from, but my guess is that the function allocates a new array internally?
This creates two new arrays per iteration; together with the surrounding loops, that’s a total of NLOCAL * NEL * NGLOBAL * 2 allocations. Either use a tuple (so (ai[1], ai[2]) etc.) or write the comparison explicitly.
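To make the difference concrete, here is a rough sketch. I'm assuming the comparison builds small temporary arrays along the lines of [a[1], a[2]]. The helper names (compare_with_arrays, compare_with_tuples, compare_explicit, bytes_allocated) are made up for illustration and are not from your code. It measures the bytes allocated by a single call of each variant.

```julia
# Hypothetical helpers, for illustration only (not from the original code).

# Builds two temporary Vectors per call, so every call allocates on the heap.
compare_with_arrays(a, b) = [a[1], a[2]] == [b[1], b[2]]

# Tuples of plain integers need no temporary arrays.
compare_with_tuples(a, b) = (a[1], a[2]) == (b[1], b[2])

# Explicit element-by-element comparison, no temporaries at all.
compare_explicit(a, b) = a[1] == b[1] && a[2] == b[2]

# Measure the bytes allocated by one call, after a warm-up call so that
# compilation is not counted.
function bytes_allocated(f, a, b)
    f(a, b)
    return @allocated f(a, b)
end

a = [3, 7]
b = [3, 7]

for f in (compare_with_arrays, compare_with_tuples, compare_explicit)
    println(rpad(string(f), 22), " result = ", f(a, b),
            ", bytes allocated = ", bytes_allocated(f, a, b))
end
```

On my understanding, the array variant should report a nonzero byte count, while the tuple and explicit variants should not need any temporary arrays. Multiply the per-call cost by the number of loop iterations and you get to the gigabytes you are seeing.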
I do not have a local MPI setup, so I can’t really run your code, sorry. I also don’t know where getindex! is coming from. GitHub code search does not show any hits in your repository, so I don’t know what its type would be and thus can’t really figure out which getindex method on ai would be called. However, since you report the same behavior with standard arrays, I’m assuming the getindex itself does not allocate.