JuliaLang:master
← JuliaLang:ob/ptrfree
opened 03:38PM - 22 Sep 16 UTC
The codegen/gc part of this is basically working.
I'm now wondering about semantics and I'd like us to discuss the following issues a bit before I clean up the code and we start the review (there are a bunch of duplicate paths in codegen that can be merged together/simplified, and some things are plain wrong and/or inefficient).
This patch allows us to unbox most immutables. By unbox I mean: allocate/store them on the stack, inline them in other objects, and inline them in arrays.
Why most? There are (for now) two problems: cycles and #undef.
Cycles are a fundamental problem: if A has a field of type B and B has a field of type A, we obviously can't inline them into each other. The cycle needs to be broken, and the annoying part is that it should be done in a predictable way. For now, on this PR, it's done in DFS order, which means that, for example, the layout of B will differ depending on whether an A was instantiated before it. Not good. The proposal I remember about that (Jameson @ JuliaCon, iirc) was to make types boxed iff they are part of any field cycle.
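To make the order dependence concrete, here is a small Python model. It is not the codegen from this PR: the type names, the dict-based field graph and both helper functions are hypothetical. `dfs_layout` breaks a cycle at whichever back edge it hits first, so the result depends on which type gets laid out first. `boxed_by_cycle_rule` boxes every type that sits on a field cycle, so it does not depend on order at all.

```python
# Toy field graph (hypothetical): type name -> list of field type names.
# A and B refer to each other; C only holds a B.
FIELDS = {"A": ["B"], "B": ["A"], "C": ["B"]}


def dfs_layout(order):
    """Lay types out in the given order, breaking cycles at the first back edge.

    Returns {type: {field_type: "inline" | "boxed"}}.
    """
    layout = {}
    in_progress = set()

    def visit(t):
        if t in layout:
            return
        in_progress.add(t)
        layout_t = {}
        for f in FIELDS[t]:
            if f in in_progress:
                # Back edge: f is still being laid out, so it cannot be inlined.
                layout_t[f] = "boxed"
            else:
                visit(f)
                layout_t[f] = "inline"
        in_progress.discard(t)
        layout[t] = layout_t

    for t in order:
        visit(t)
    return layout


def on_cycle(t):
    """True if t can reach itself by following field types."""
    stack, seen = list(FIELDS[t]), set()
    while stack:
        f = stack.pop()
        if f == t:
            return True
        if f not in seen:
            seen.add(f)
            stack.extend(FIELDS[f])
    return False


def boxed_by_cycle_rule():
    """Types that would be boxed under the 'part of any field cycle' rule."""
    return {t for t in FIELDS if on_cycle(t)}
```

In this model, `dfs_layout(["A", "B", "C"])` stores the A field of B boxed, while `dfs_layout(["B", "A", "C"])` stores it inline. `boxed_by_cycle_rule()` returns A and B in both cases and leaves C alone.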
`#undef` is annoying because it makes a difference at the julia level between `isbits` types and other immutables. To minimize breakage I've gone the route of preserving the current behavior.
So if A has a pointer field and we make, e.g., an uninitialized array of A, this branch uses the nullity of the field of A as a marker that the corresponding slot in the array is #undef. This only works if the field of a valid instance of A can never be null, i.e., if `A.ninitialized >= field_index_of_the_ptr_field`.
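A minimal sketch of that check, written in Python rather than the C of the runtime. The function name and the list-of-booleans encoding are made up for illustration. Only `ninitialized` and the 1-based field index come from the description above, and `ninitialized` is assumed here to count the leading fields that are guaranteed to be set in any valid instance.

```python
from typing import Optional, Sequence


def undef_marker_field(is_pointer: Sequence[bool], ninitialized: int) -> Optional[int]:
    """Return the 1-based index of a pointer field usable as an #undef marker.

    is_pointer[i] says whether field i+1 is stored as a pointer.
    A pointer field at index <= ninitialized is assumed never to be null in a
    valid instance, so a null there can only mean "this slot is #undef".
    Returns None when no such field exists.
    """
    for index, ptr in enumerate(is_pointer, start=1):
        if ptr and index <= ninitialized:
            return index
    return None


# Second field is a pointer and always initialized: usable as a marker.
print(undef_marker_field([False, True], ninitialized=2))
# Same layout, but the pointer field may be left unset: no marker available.
print(undef_marker_field([False, True], ninitialized=1))
```

A type with no pointer fields at all also yields `None` here. That case needs no marker and is handled separately by the rules below.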
This makes most code (at least the whole test suite :-)) work, but I think the following rules are really weird:
A type `T` will be inlined into fields/arrays and stack allocated if all of the following hold:
- it is immutable
- it cannot reach itself through a sequence of field accesses
- it has at least one never-#undef pointer field or no pointer fields at all
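Put together, the three rules amount to a predicate like the one below. This is a toy Python model under stated assumptions, not the branch's implementation: `TypeDesc`, `reaches_itself`, `is_inlined` and the example types are all hypothetical, and whether a field is "stored as a pointer" is given as an input rather than derived.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class TypeDesc:
    immutable: bool
    fields: List[Tuple[str, bool]]  # (field type name, stored as pointer?)
    ninitialized: int               # leading fields assumed always initialized


def reaches_itself(name: str, types: Dict[str, TypeDesc]) -> bool:
    """True if `name` is reachable from itself by following field types."""
    stack, seen = [f for f, _ in types[name].fields], set()
    while stack:
        f = stack.pop()
        if f == name:
            return True
        # Field types missing from `types` (e.g. "Int") are treated as leaves.
        if f in types and f not in seen:
            seen.add(f)
            stack.extend(g for g, _ in types[f].fields)
    return False


def is_inlined(name: str, types: Dict[str, TypeDesc]) -> bool:
    t = types[name]
    pointer_idx = [i for i, (_, ptr) in enumerate(t.fields, start=1) if ptr]
    # Rule 3: no pointer fields, or at least one that can never be #undef.
    undef_ok = not pointer_idx or any(i <= t.ninitialized for i in pointer_idx)
    return t.immutable and not reaches_itself(name, types) and undef_ok


TYPES = {
    "Point": TypeDesc(True, [("Float64", False), ("Float64", False)], 2),
    "Named": TypeDesc(True, [("Int", False), ("String", True)], 2),
    "Lazy": TypeDesc(True, [("String", True)], 0),   # pointer may be #undef
    "Buf": TypeDesc(False, [("Int", False)], 1),     # mutable
    "Node": TypeDesc(True, [("Node", True)], 1),     # reaches itself
}
```

With these descriptions, `Point` and `Named` are inlined, while `Lazy`, `Buf` and `Node` each fail exactly one rule and stay boxed.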
The only difference between a boxed and an unboxed type is memory layout, but I'd assume we want that to be easily predictable since, for example, people routinely interface with C.
An alternative proposed by Yichao was to make it entirely opt-in and error out if inlining was not possible. I'm worried this will lead to yet-another-annotation that people will sprinkle everywhere.
For performance, specially crafted tests (like summing rows of a very skinny matrix using subarrays) show some improvements by avoiding GC allocation. Not super satisfying for now, and casual inspection of the generated asm shows a lot of stack movement. We can work on that though, probably by improving LLVM's visibility of our rooting mechanism and/or just using statepoints.
(to sweeten the deal I've thrown in improved undef ref errors)