I wanted to check/improve my understanding of allocations caused by type-instabilities.
Consider the following:
function f()
    x = g()
    y = h(x)
    z = b(x)
end
g() = Any[1][1]
h(::Int) = 1
h(::Any) = Any[2][1]
b(::Int) = 3
b(::Any) = 4
Now for my questions regarding x, y, and z in f:
Does x get heap allocated because it is inferred as Any? Or do objects that would normally be stack allocated only get moved to the heap for runtime dispatches (noting that the call g() itself is not a runtime dispatch)?
h(x) is a runtime dispatch, so as I understand it y must be heap allocated because the compiler doesn’t know what type y will be when compiling f. Is this understanding correct?
b(x) is also a runtime dispatch, but both methods of b return an Int. So is z heap allocated or stack allocated?
Are there any tools I can use to inspect allocations myself? For example, would @code_warntype or @code_llvm or even @time answer these questions for me?
Thanks in advance for any input. Also, I want to know technicalities, so even if one of my statements above might be considered “basically correct”, I would appreciate insights into any technicalities or nuances there might be. Better examples are also welcome.
In most cases your f() call wouldn’t allocate anything, because the only input, the function f itself, is typically a global constant known at compile time, and the method is small and simple enough for all the work to be done then. @code_llvm f() is deceptively complicated because it’s ignorant of the call environment. If you call f from another method, none of the work is done at runtime:
julia> foo() = f()
foo (generic function with 1 method)
julia> @code_llvm foo()
; @ REPL[92]:1 within `foo`
; Function Attrs: uwtable
define i64 @julia_foo_1100() #0 {
top:
ret i64 3
}
julia> @btime foo()
1.000 ns (0 allocations: 0 bytes)
3
A variable can only be represented by stack-allocated data when the type, size, and lifetime are known at compile-time. If the compiler doesn’t know or cannot know all of those, then something goes on the heap. If the inferred x::Any needed to be worked on at runtime, then yes it would need a heap allocation. No, runtime dispatches aren’t the only reason for deciding whether something is allocated on the stack or heap.
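To see this for yourself, here is a sketch using the definitions from the question (@code_warntype and @allocated are standard tools; warm the call up first so compilation isn’t measured):

```julia
g() = Any[1][1]
h(::Int) = 1
h(::Any) = Any[2][1]
b(::Int) = 3
b(::Any) = 4

function f()
    x = g()   # inferred x::Any: the element type of Any[1] is unknown
    y = h(x)
    z = b(x)
end

f()               # warm up so compilation isn't measured
@allocated f()    # nonzero: at minimum the temporary Any[...] arrays
# @code_warntype f()   # in a REPL, prints x and y highlighted as Any
```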
@code_warntype also shows the y::Any inference. If it needs to be worked on at runtime, yes for the same reasons as x::Any.
It could be stack allocated. The compiler has an optimization for functions with very few methods that can be inferred to return the same type. The limit is low because inferring more methods takes more work.
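One way to check this inference yourself is the reflection utility Base.return_types (a sketch; the exact element order may vary by Julia version):

```julia
b(::Int) = 3
b(::Any) = 4

# With an argument only known as Any, both matching methods are still
# inferred to return Int, so the compiler knows z's type at the call site.
Base.return_types(b, (Any,))   # typically [Int64, Int64]
```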
Thanks for your input, @Benny. So, as I understand it, it’s not so much dynamic dispatch that causes allocations, but rather whether the compiler has enough information about an object to be able to put it on the stack.
In most cases your f() call wouldn’t allocate anything, because the only input, the function f itself, is typically a global constant known at compile time, and the method is small and simple enough for all the work to be done then
Yeah, I was worried this example might be too simple. Thanks for pointing this out while still providing more explanation.
A variable can only be represented by stack-allocated data when the type, size, and lifetime are known at compile-time
Does this mean even small Unions have to be heap-allocated? Or would union-splitting allow for stack allocation?
Can you give an example of where type and size are known but lifetime is not?
Given appropriate types in the union, they don’t have to be, in principle: one can imagine separate stack slots for each type, or a single slot of the maximum size that is read differently depending on the active type. However, the @code_llvm output does show heap allocations occurring. Note that runtime benchmarking doesn’t catch heap allocations that happened at compile time, so their absence at runtime doesn’t imply stack allocation; I also sometimes see extra allocations at runtime.
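As a concrete illustration of union splitting (a sketch; flip and total are hypothetical names), a call site with a small, inferrable Union can often stay allocation-free after compilation:

```julia
# A small, inferrable Union: Union{Int64, Float64}
flip(c::Bool) = c ? 1 : 2.0

function total(v::Vector{Bool})
    s = 0.0
    for c in v
        s += flip(c)   # union-split: branches on the tag, no boxing needed
    end
    s
end

v = [true, false, true]
total(v)               # warm up
@allocated total(v)    # typically 0 after compilation
```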
Mutables. Immutables can just be copied on their way out, so the stack frame being freed when the call finishes isn’t really an issue. But a mutable object needs to stay in one spot so that all its references can find it across mutations. For that spot to be on the stack, every reference has to stay inside the method and never escape, neither returned nor stored externally. Even when that holds, it can be infeasibly difficult or time-consuming for the compiler to prove, so onto the heap it goes.
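A sketch of that contrast (Counter is a hypothetical type; whether the non-escaping allocation is actually elided depends on the Julia version and optimizer):

```julia
mutable struct Counter
    n::Int
end

# The Counter never escapes bump, so the compiler may keep it off the
# heap entirely (an optimization, not a guarantee).
function bump()
    c = Counter(0)
    c.n += 1
    c.n
end

# Here the Counter is returned, so it must outlive the stack frame
# and has to be heap allocated.
function bump_escaping()
    c = Counter(0)
    c.n += 1
    c
end

bump(); bump_escaping()      # warm up
@allocated bump()            # often 0
@allocated bump_escaping()   # nonzero: the Counter itself
```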