I had a program that was running extremely slowly, so I ran a memory benchmark on it. Apparently one particular line of code accounted for a whopping 27 GB of memory allocations:
(lgamma(theta(pyp)) - lgamma(theta(pyp) + pyp.crp.totalcustomers) +
lgamma(theta(pyp) / d(pyp) + pyp.crp.ntablegroups) -
lgamma(theta(pyp) / d(pyp)) +
pyp.crp.ntablegroups * (log(d(pyp)) - lgamma(1 - d(pyp))) +
sum(map(c -> lgamma(c - d(pyp)), Iterators.flatten(values(pyp.crp.tablegroups))))
)
I suspected it was the part
sum(map(c -> lgamma(c - d(pyp)), Iterators.flatten(values(pyp.crp.tablegroups))))
that caused the problem. So I rewrote it as a simple loop:
temp::Float64 = 0.0
for tablegroup in values(pyp.crp.tablegroups)
    for table_customer_count in tablegroup
        temp += lgamma(table_customer_count - d(pyp))
    end
end
(lgamma(theta(pyp)) - lgamma(theta(pyp) + pyp.crp.totalcustomers) +
lgamma(theta(pyp) / d(pyp) + pyp.crp.ntablegroups) -
lgamma(theta(pyp) / d(pyp)) +
pyp.crp.ntablegroups * (log(d(pyp)) - lgamma(1 - d(pyp))) +
temp
)
and in a test run on simple input data, the memory allocated by that part dropped roughly 10-fold, from nearly 1 GB (far more than any other part of the program) to only about 100 MB.
One huge advantage of Julia is how easy it is to use functional programming patterns in it. However, I didn't expect such a huge performance hit. Maybe I didn't write the code in an optimal way? Could the memory usage be improved in some other way, e.g. with more type annotations?
Or maybe there are other problems in my program that this change of code somehow worked around?
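One variant that might keep the functional style without the temporary array is to pass the function to `sum` directly instead of calling `map` first, since `map` over an iterator collects its results into a new `Vector` before `sum` ever sees them. The sketch below is only an illustration under assumptions: `tablegroups`, `disc`, and `f` are hypothetical stand-ins for `pyp.crp.tablegroups`, `d(pyp)`, and `lgamma`, chosen so that it runs without extra packages.

```julia
# Hypothetical stand-ins for the real data: `tablegroups` mimics
# pyp.crp.tablegroups and `disc` mimics the value returned by d(pyp).
tablegroups = Dict(1 => [3, 1, 2], 2 => [5], 3 => [2, 2])
disc = 0.5

# Stand-in for lgamma so the sketch needs no packages (assumption).
f(x) = log(x)

# Original style: map materializes a temporary Vector, which sum then reduces.
with_map(tg, d) = sum(map(c -> f(c - d), Iterators.flatten(values(tg))))

# sum(f, itr) applies f element by element, so no temporary Vector is built.
with_sum(tg, d) = sum(c -> f(c - d), Iterators.flatten(values(tg)))

# Generator form: also lazy, written as a nested comprehension without brackets.
with_gen(tg, d) = sum(f(c - d) for group in values(tg) for c in group)

# All three forms compute the same total.
@assert with_map(tablegroups, disc) ≈ with_sum(tablegroups, disc)
@assert with_map(tablegroups, disc) ≈ with_gen(tablegroups, disc)
```

Comparing `@allocated with_map(tablegroups, disc)` with `@allocated with_sum(tablegroups, disc)` after one warm-up call of each should indicate whether the temporary array is the main cost. If the lazy versions still allocate heavily, the element types of the container (for example an abstractly typed field) would be the next thing to check.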