In CxxWrap, C++ objects are wrapped in a struct like this:
struct CxxPtr{T} <: CxxBaseRef{T}
cpp_object::Ptr{T}
end
Completely new objects are stored in a mutable version of the above struct, with exactly the same layout. On the C++ side, I use jl_new_struct to allocate the new struct, with as field argument the boxed pointer to store. However, this requires two allocations, so I tried to work around that using jl_new_struct_uninit, as seen here:
Unfortunately, this appears to trigger a segfault. So my question is: what is the correct way to allocate a struct like the above one, using only a single allocation (if possible), for both the mutable and immutable version (immutable is for boxing when needed)?
Do you have a MWE of the segfault? Many times I saw unexplainable segfault was because I was forgetting to protect a variable from the garbage collector.
Update: it seems that using memcpy to the directly dereferenced the pointer to the Julia object fixes it. Not sure what the difference is or if it is correct now, so having someone who is familiar with the Julia C code look at this would reassure me:
However, testing what avoiding the extra allocation actually brings in speed gain is making me question the usefulness of the change: the original version takes 0.0066 s to allocate 100000 C++ objects (2 Julia allocations per C++ object), while the new version takes … 0.0058 s, at indeed only one Julia allocation per C++ object.
@yuyichao Do you have any thoughts on this, please? Is the new version safe, or is tinkering at this low level simply not worth it?
I think previous crashes were caused by some kind of alignment-related UB, which I don’t understand. In Julia they seem to cast to the integer types because of that, or at least the git comments led me to this:
uint64_t val = *(uint64_t*)(&cpp_ptr);
memcpy(dest, &val, 8);
instead of:
memcpy(dest, &cpp_ptr, sizeof(T*));
That change was inspired by the implementation of jl_store_unaligned_i64, which is called by set_nth_field in jl_new_struct, and which were modified in the commit that fixed the issue I cited.
What troubles me is that I don’t understand the difference between the crashing and the working implementation, so I have no idea why this fixes it (or even if it fixes it properly and always).
Again, the only think I can think of is that you are triggering some UB (or compiler bug) and the best way to debug it would still be having a debugger attached to it one way or another.
And FYI, your “fixed” version made MORE assumptions about both alignment and aliasing, which is exactly the opposite of the julia commit you linked.
Possibly, part of the problem is that I don’t understand what those assumptions are and why they are needed, though. But the same casting to uint happens in Julia, why is that done?
When I had similar crashes locally, valgrind and the debugger didn’t point to anything special, except that accessing the created Julia object results in a crash, and creating it using the higher level jl_new_struct fixed it (or at least the crash went away), but introduced an extra allocation for boxing the pointer.
Because it is making the assumption that the alignment and type are correct. Also, there’s no casting to uint anywhere, the cast is on the pointer. These are just sligntly more optimized case of the default branch. They are there for the sole purpose of giving the compiler some hint to produce better code for the most likely usecase. It will be completely correct and should not crash without any branchs other than the default one.
Well, I’m not sure what do you mean by “debugger didn’t point to anything special”. I assume the debugger will point to where it crashes which will tell you what instruction and point it is and should allow you to inspect the condition it happens, the normal thing you’ll use a debugger to do.
Thanks for the clarification, and indeed after switching to the default branch implementation and also exactly the same implementation that crashed before, it doesn’t crash anymore. Which is actually bad news, because it means there is probably some kind of UB elsewhere.
What I mean with “nothng special” is that when I did manage to reproduce the crash in a debugger, it always segfaulted on this seemingly innocent line, and valgrind didn’t show anything invalid up to that point:
OK, I have inspected the value returned by both implementations, and it is always the same, I haven’t been able to reproduce the crash on Travis either anymore. My best guess at this point is that our discussion properly aligned the planets so it works now
I’ve merged the change, since this part of the code was probably never the cause of the crashes in the first place. Thanks for the help!