Why is heap memory "bad"?

I’ve been trying to understand some details of how certain programming languages work, and certain paradigms such as the immutability of variables. In particular, I am trying to understand what stack/heap memory is and how it affects performance. Quite often on this forum I see creative solutions to problems in which the solutions don’t “allocate”, i.e. the @time macro reports 0 allocations.

Don’t all variables take up memory anyway (I mean, the data has to exist somewhere…), so why does @time not report all the memory used?

6 Likes

Think of the heap as a large room where objects are stored by placing them on the floor (no stacking).

When you bring in a new object (e.g. a sofa), the manager of this facility has to find a spot large enough for it — this is a computation that takes some time (it is very fast, but still too expensive when done repeatedly).

When you need your sofa back, there will be a sofa-shaped free space on the floor. The next item placed there may not be the exact same size, so floor space is wasted. Objects have to be reshuffled occasionally to reclaim this wasted space. Also, when you merely announce that you no longer need your sofa, it is not thrown away immediately; that happens later, through an intermittent process called garbage collection.

In contrast, the stack is simpler but less flexible. Think of a stack of boxes, of equal base area but different height. You can add to the top, and remove from the top. You cannot directly remove a box in the middle. It is very easy (=cheap) to keep track of free space, which is always contiguous.

These are of course oversimplified metaphors; there are many more details, especially to garbage collection, but this is pretty much the gist.
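To connect this back to Julia, here is a minimal sketch (the function names are mine): a scalar result can live in a register or on the stack, while an Array has to be placed on the heap “floor”, so only the latter shows up in @time.

on_stack() = 1.0 + 2.0    # scalar math: registers/stack only
on_heap() = zeros(1000)   # an Array is placed on the heap

on_stack(); on_heap()     # warm up, so compilation is not measured
@time on_stack()          # expect 0 allocations
@time on_heap()           # expect an ~8 KB allocation to be reported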

52 Likes

To add to @Tamas_Papp’s excellent answer, I’ll further say that the “allocations” summarized by @time report only heap-allocated memory. There is a good reason for this: you can think of the stack in the same way that you might think of a Julia variable like

stack = sizehint!(Vector{UInt8}(undef, 0), 10^4)

Even though length(stack) == 0, it can grow to 10^4 without new allocations because space has already been reserved. (push!(stack, 0x01) and pop!(stack) just require updating your notion of the end of the vector, as long as that 10^4 limit doesn’t get exceeded.) The stack works the same way.
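A quick way to check this (a sketch; grow! is just a helper name I made up):

grow!(s) = push!(s, 0x01)
grow!(stack)            # warm up, so compilation is not measured
@allocated grow!(stack) # expect 0 while under the 10^4 reservation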

Don’t all variables take up memory anyways (I mean the data has to exist somewhere…)

The CPU also has registers, and small numeric variables often end up in them transiently. So yes, this is a form of storage, but again it’s so different from more generic forms of memory that it’s treated specially.
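If you are curious, you can peek at this yourself with @code_native from InteractiveUtils (the add function is mine; the output is machine-specific):

using InteractiveUtils
add(a, b) = a + b
@code_native add(1.0, 2.0)   # on x86-64 this is essentially a single vaddsd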

10 Likes

I don’t think heap allocation is “bad”; in most cases it is wanted and needed. First of all, the stack is usually limited to a few megabytes, and stack space is tightly assigned to a specific thread.
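You can see that limit directly; a small sketch (blow is a made-up name):

blow(n) = blow(n + 1)   # one stack frame per call, never reclaimed
try
    blow(0)
catch e
    println(typeof(e))  # StackOverflowError: the stack ran out in moments
end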

Now consider this example:

input1 = rand(1000000)
input2 = rand(1000000)

We want to calculate the sum of squared element-wise differences:

function with_heap_allocations(input1, input2)
    # the fused broadcast materializes an intermediate array before sum
    return sum((input1 .- input2).^2)
end

function no_heap_allocation(input1, input2)
    result2 = 0.0
    for i in eachindex(input1)   # plain loop: no intermediate array
        result2 += (input1[i] - input2[i])^2
    end
    return result2
end

Timing results:

julia> @time with_heap_allocations(input1, input2)
  0.002924 seconds (13 allocations: 7.630 MiB)
166623.91955443815

julia> @time no_heap_allocation(input1, input2)
  0.001143 seconds (5 allocations: 176 bytes)
166623.91955444182

In the first example, the broadcast expression (input1 .- input2).^2 materializes an intermediate array of 10^6 Float64 values (about 7.6 MiB) before passing it to sum, which is why you see a few megabytes allocated.

The second one just iterates over data that is already in memory, so no significant allocations are seen.

Anyway, I don’t think the second implementation is better. It is much longer and harder to reason about. In most cases you should not care about those allocations too much; these days memory is cheap and garbage collectors are very sophisticated.

Also, Julia is a bit different from general-purpose languages with functional features (C#, Haskell, Java). For instance, in Julia the map function materializes its result in memory, whereas in other languages you might expect a “lazy” collection that can be processed further, with the actual code executed only when you inspect the final data, and no allocations for intermediate results. In Julia, I have observed that iterating over the result of zip has this laziness property. Example:

sum((pair -> (pair[1] - pair[2])^2), zip(input1, input2))
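Wrapping that one-liner in a function to time it (lazy_sum is a name I made up; expect a handful of bytes, not megabytes):

lazy_sum(a, b) = sum(pair -> (pair[1] - pair[2])^2, zip(a, b))
lazy_sum(input1, input2)         # warm up
@time lazy_sum(input1, input2)   # no input-sized intermediate array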
2 Likes

@Tamas_Papp that’s great.

So when the compiler knows the sizes of a function’s variables ahead of time, it can place them on the stack, because it can “efficiently pack the room”. On the other hand, if you are constructing a new array in a function whose size is only known at run time, the compiler wouldn’t know how much stack space to reserve for it, so memory is requested from the allocator (ultimately the OS) at the time of construction. The allocator has to find the memory and gives Julia a “pointer” to where it is stored.
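To check my understanding with a sketch (make_new and fill_buf! are names I made up):

using Random: rand!

make_new(n) = rand(n)        # size known only at run time, so heap allocation
fill_buf!(buf) = rand!(buf)  # reuses storage the caller already reserved

buf = Vector{Float64}(undef, 10^3)
make_new(10^3); fill_buf!(buf)   # warm up
@time make_new(10^3)             # reports a heap allocation (~8 KB)
@time fill_buf!(buf)             # expect 0 allocations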

Similarly for mutable/immutable variables. When I have x = "string", the compiler can use stack memory for this. When I redefine x = "newstring", it doesn’t mean x is mutable; it simply changes the binding of x. However, if I have something like x = Array{MyType}(undef, n), it will likely use heap memory for this.
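And a small sketch of the rebinding point (v is just an example variable):

x = "string"       # binds x to an immutable String
x = "newstring"    # rebinds the name; the old value becomes garbage

v = Vector{Int}(undef, 3)   # a heap-allocated, mutable Array
v[1] = 10                   # mutation in place; no new allocation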

1 Like

Possibly, but I am not sure I want to take this metaphor too far.

I realized this when I started thinking about furniture referencing other furniture.

6 Likes

Right, heap memory isn’t bad, and stack memory definitely isn’t “good”. People quite readily (if unintentionally) spread and believe these myths, but the reality is that they have very little to do with good programming. A good programmer should be able to recognize that a hammer is good at driving nails and a screwdriver at turning screws, and not reject either for being bad at the other task.

Those same folks will often also try to sell you an irrational fear of having a GC available in the language by default, and instead insist you must implement your own hand-rolled ref-counting GC while calling it a shared_ptr. Never mind that you may actually end up with worse performance than the same code in another language.

The same could be said of many other tools too: runtime multiple dispatch, type parameterization, JIT compilation, lightweight tasks. Many languages support these in some form, but some go out of their way to make them impractical to use, while Julia makes them all first class.

Anyways, good question. I hope my tangents were also informative.

14 Likes