Is constructing an array on the fly on this line costing me a lot of memory?

SZJX · November 7, 2018, 8:18pm

One of my functions takes an array of String as an argument. According to my benchmarking results it apparently incurs very high memory usage, sometimes even the highest throughout my whole program under certain conditions:

8026258929 trigram_prob = prob(npylm, [string_rep_potential_context1, string_rep_potential_context2], string_rep_potential_word)

I wonder what this line exactly means. Does it suggest that the operations regarding this line alone (i.e. excluding any operations that happened within the prob method) cost 8GB of memory? Or does this number include operations further down the call stack?

Could the memory usage be caused by how I construct the array with [string_rep_potential_context1, string_rep_potential_context2], and is there a better way to do it? Or is this syntax OK, so the problem probably lies elsewhere?

kristoffer.carlsson · November 7, 2018, 8:24pm

Yes, creating an Array over and over might be a performance problem due to the allocations needed. So fix that and benchmark again.
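One possible way to act on this advice, sketched under assumptions: `score` below is a hypothetical stand-in for the thread's `prob` (its real body is not shown), and `words` is an illustrative input. The sketch compares building a fresh two-element array on every iteration with refilling a single preallocated buffer.

```julia
# Hypothetical stand-in for `prob`; it only reads the context array.
score(ctx::Vector{String}, word::String) = sum(ncodeunits, ctx) + ncodeunits(word)

# Builds a new Vector{String} on every iteration.
function fresh_array_each_time(words::Vector{String})
    total = 0
    for i in 3:length(words)
        total += score([words[i-2], words[i-1]], words[i])
    end
    return total
end

# Allocates one buffer up front and overwrites its two slots on every iteration.
function reused_buffer(words::Vector{String})
    total = 0
    ctx = Vector{String}(undef, 2)
    for i in 3:length(words)
        ctx[1] = words[i-2]
        ctx[2] = words[i-1]
        total += score(ctx, words[i])
    end
    return total
end

words = [string("w", i) for i in 1:1000]   # illustrative data only

# Warm up so that compilation is not counted in the measurement.
fresh_array_each_time(words); reused_buffer(words)

fresh_bytes  = @allocated fresh_array_each_time(words)
reused_bytes = @allocated reused_buffer(words)
println("fresh arrays: ", fresh_bytes, " bytes; reused buffer: ", reused_bytes, " bytes")
```

Reusing a buffer is only safe if the callee does not keep a reference to the array after it returns, since the next iteration overwrites its contents.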

SZJX · November 7, 2018, 8:31pm

Thanks. So no matter how I construct this array, there will always be a lot of memory overhead? Then I might need to think of a way to refactor my code so that I avoid constructing arrays on the fly as much as possible.

Just to confirm, each number on a line refers to the memory cost of that line alone, right? So when I see 0 in a lot of places, that really means the line incurred zero memory allocation, and if I see a very large number on a line, that means the line alone (regardless of what happens inside the function invoked on that line) resulted in a huge amount of memory allocation?

I just wonder why on some lines there is a 0 while on some other lines there’s nothing prepended at all.

StefanKarpinski · November 7, 2018, 8:42pm

If it’s always a two-element array, you could try using a tuple. The compiler can eliminate allocations for tuples much more easily than arrays.
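A minimal sketch of the tuple suggestion, assuming the callee can be given a method that accepts a tuple. The functions `context_length_vec` and `context_length_tup` are hypothetical stand-ins for `prob`, not part of the original code.

```julia
# Hypothetical stand-ins for `prob`: one method takes a Vector, the other a 2-tuple.
context_length_vec(ctx::Vector{String}) = sum(ncodeunits, ctx)
context_length_tup(ctx::NTuple{2,String}) = ncodeunits(ctx[1]) + ncodeunits(ctx[2])

# Constructs a heap-allocated Vector{String} on every call.
call_with_vector(a::String, b::String) = context_length_vec([a, b])

# Constructs a tuple holding two references; the compiler can often avoid
# a heap allocation here, although that is not guaranteed.
call_with_tuple(a::String, b::String) = context_length_tup((a, b))

a, b = "foo", "barbaz"

# Warm up so that compilation is not counted in the measurement.
call_with_vector(a, b); call_with_tuple(a, b)

vec_bytes = @allocated call_with_vector(a, b)
tup_bytes = @allocated call_with_tuple(a, b)
println("vector: ", vec_bytes, " bytes; tuple: ", tup_bytes, " bytes")
```

The exact byte counts depend on the Julia version and on where the measurement is taken, so it is worth running this inside the real program rather than relying on the numbers from a toy case.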

SZJX · November 7, 2018, 9:15pm

Thanks. That sounds sensible.

y4lu · November 8, 2018, 3:36am

You might be able to get a bit more info by putting it on its own line

string_reps = [string_rep_potential_context1, string_rep_potential_context2]
trigram_prob = prob(npylm, string_reps, string_rep_potential_word)

I think Ref() and [] to dereference might help, since immutables (like strings?) aren’t passed by reference?
But that doesn’t seem to be the case:

sizeof(test) - 44627
typeof(test) - String
@time [test, test] - 5 alloc, 256 bytes
@time [Ref(test), Ref(test)] - 7 alloc, 288 bytes

stevengj · November 8, 2018, 12:59pm

No, that’s not how argument-passing works.

y4lu · November 8, 2018, 1:22pm

So it is just scalars that are copied; everything else is passed by reference.
I still don’t quite grasp why @time [test, test] didn’t allocate 89000 or so bytes, though.

spoiler: [test * "A", test * "B"]

stevengj · November 8, 2018, 1:52pm

Whether an immutable is internally copied in memory (or whether it exists in main memory at all, as opposed to being stuffed in a register etc.) is basically up to the compiler. Generally, large immutable objects will not be copied.

foobar_lv2 · November 8, 2018, 1:54pm

Things that are isbitstype are passed by-value, everything else is a pointer. [test test] stores two identical pointers for non-bitstypes and otherwise two mostly identical copies of sizeof(typeof(test)) many bytes. I say “mostly identical” because padding bytes may differ; they have formally undefined contents and practically contain register vomit or leftover heap contents.

But “passed-by-value” does not really mean “passed-by-value” either; depending on types, context, inlining, optimization and Julia’s calling convention this may end up getting passed as a pointer, on the stack or in a register. And depending on your CPU, even stack copies might never hit main memory or even L1 anyway (but your system will retcon a state where it hit main memory if you look for it).
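The pointer-versus-copy distinction can be checked from the REPL. This is a small sketch, assuming `test` is a String of the size quoted earlier in the thread (the content here is made up); it also shows why the `[test * "A", test * "B"]` variant behaves differently.

```julia
test = "x"^44627               # made-up content with the size quoted above

# String is not a bits type, so array elements are references to it.
string_is_bits = isbitstype(String)
int_is_bits    = isbitstype(Int)

v = [test, test]
same_object = v[1] === v[2]    # both slots refer to the same String object
slot_bytes  = sizeof(v)        # size of the two element slots only

# Concatenation creates two new String objects, each with its own copy of the data.
w = [test * "A", test * "B"]

# Base.summarysize follows references and counts each distinct object once.
v_total = Base.summarysize(v)
w_total = Base.summarysize(w)

println((string_is_bits, int_is_bits, same_object, slot_bytes, v_total, w_total))
```

Here `v` holds only two references to one existing string, which is why constructing it allocates a few hundred bytes rather than roughly twice the string's size, whereas `w` has to allocate both new strings.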
