Memory allocation problem


#1

I profiled my program by tracking user memory allocation. The program seems to incur memory allocations at places I don’t expect.

My program is pretty long, so I have uploaded it to GitHub: https://github.com/jinliangwei/julia_mem/blob/master/serial_lda.jl

The repository also contains the memory-tracking output and a sample dataset.

The function I am trying to optimize is sample_one_word, which is currently the bottleneck of my program.

Mainly, I don’t understand why memory allocations happen when the program reads or writes a single element of a Vector; see, for example, lines 178, 272, and 278.

This program is about 2-3x slower than a C++ implementation, and I suspect memory allocation is one of the main bottlenecks.
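
For what it’s worth, here is the behaviour I expected for plain single-element reads and writes (a minimal, self-contained check, not the actual code):

```julia
# Single-element reads/writes on a concretely typed Vector should not
# allocate once the function is compiled.
function touch!(v::Vector{Float64})
    x = v[1]          # single-element read
    v[2] = x + 1.0    # single-element write
    return v
end

v = zeros(3)
touch!(v)                # warm up (compile)
@allocated touch!(v)     # expect 0 bytes
```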


#2

At first glance this looks like a type-stability issue. Can you update your code to generate random test data first, so that we can copy-paste it into the REPL?
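
To illustrate what I mean: when the element type of a container is not concrete, even a simple single-element read inside a loop becomes type-unstable and allocates (a toy example, not your code):

```julia
# Toy example: summing over a Vector{Any} is type-unstable, so every
# intermediate result gets boxed and allocates; the Float64 version doesn't.
function total(v)
    s = 0.0
    for i in eachindex(v)
        s += v[i]
    end
    return s
end

v_any = Any[rand() for _ in 1:1000]   # eltype Any     -> unstable
v_f64 = rand(1000)                    # eltype Float64 -> stable

total(v_any); total(v_f64)            # warm up
@allocated total(v_any)               # nonzero: one box per iteration
@allocated total(v_f64)               # 0
```

Running @code_warntype on your functions should show whether something similar is going on.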


#3

Have you precompiled your code first? That line 178 looks like compilation allocation. Note that the printed allocation is sometimes off by a line…
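
The usual workflow with --track-allocation is to run everything once, clear the counters, and run it again, so that the .mem files only reflect the second (already compiled) run. Roughly like this, where run_lda and small_dataset are placeholders for your own entry point and input:

```julia
# Start Julia as: julia --track-allocation=user
using Profile

include("serial_lda.jl")        # load the code
run_lda(small_dataset)          # placeholder call: forces compilation
Profile.clear_malloc_data()     # discard everything recorded so far
run_lda(small_dataset)          # only this run ends up in the .mem files
```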


#4

Thanks! I’ve added a program that generates random input: https://github.com/jinliangwei/julia_mem/blob/master/serial_lda_random.jl


#5

Thanks for your answer!

No, I didn’t precompile my code. The sample_one_word function is executed hundreds of thousands of times, but it’s JIT-compiled just once. It would be pretty shocking if compiling one function allocated 480 MB of memory, right?
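
(Just to be sure, here is the kind of check I have in mind, using a toy function instead of mine: the first timed call includes compilation, the rest show only per-call allocation.)

```julia
# Toy illustration: compilation allocates only on the first call;
# per-call allocation repeats on every call afterwards.
g(v) = [x * 2 for x in v]      # allocates a new result array on every call

v = rand(100)
@time g(v)                     # compilation + one result array
@time g(v)                     # one result array only
@allocated g(v)                # per-call bytes; multiply by the call count
```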


#6

Looking at @code_warntype, I see that new_topic in sample_all_words is not inferred and sample_one_word has an Int32/Int64 instability.
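
In case it helps, here is a minimal reproduction of that kind of Int32/Int64 instability (made-up function, not the actual code):

```julia
# Made-up example: `topic` starts as Int32 but can be reassigned the
# Int64 loop index, so inference gives Union{Int32, Int64}.
function find_topic(counts::Vector{Int32}, threshold)
    topic = Int32(0)
    for k in eachindex(counts)
        if counts[k] > threshold
            topic = k            # k::Int64 -> topic::Union{Int32, Int64}
            break
        end
    end
    return topic
end

@code_warntype find_topic(Int32[1, 5, 3], 2)   # return type Union{Int32, Int64}
# Fix: keep a single integer type, e.g. `topic = Int32(k)` or start with `topic = 0`.
```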


#7

Thank you! This is very helpful!