Given that you are having trouble with both of these packages, I think you would be justified in raising an issue directly at JuliaLang/julia on GitHub to see if you can get help there. It is worth mentioning that both packages wrap C/C++ code.
Julia 1.8.5 should be released any minute now.
Given that it works on 1.8.3, I would wait for the new release and see if that fixes it, and revert to 1.8.3 in the meantime.
Something in 1.8.4, the 1.9 betas, and the current nightly build is causing heap corruption (exit code: 3221226356) on Windows when using ccall. I have not been able to pinpoint the change that causes it. I hope that 1.8.5 fixes it.
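As a side note on that exit code: it is easier to recognise in hexadecimal, where it is the Windows NTSTATUS value for heap corruption (STATUS_HEAP_CORRUPTION). A minimal sketch of the conversion; the variable names are only illustrative:

```julia
# Exit code reported when the Julia process dies on Windows.
exit_code = 3221226356

# NTSTATUS values are 32-bit, so convert to UInt32 before printing in base 16.
hex = string(UInt32(exit_code), base = 16)

println("0x", hex)  # prints 0xc0000374
```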
Here is a short code snippet to reproduce the error:
using XGBoost
x = rand(4,5)
o = Ref{XGBoost.DMatrixHandle}()
sz = reverse(size(x))
xp = convert(Matrix{Cfloat}, x)
missing_value=NaN32
XGBoost.xgbcall(XGBoost.XGDMatrixCreateFromMat, xp, sz[1], sz[2], missing_value, o)
Here is the function that is being called in XGBoost.jl:
function XGDMatrixCreateFromMat(data, nrow, ncol, missing, out)
    @ccall libxgboost.XGDMatrixCreateFromMat(data::Ptr{Cfloat}, nrow::bst_ulong, ncol::bst_ulong, missing::Cfloat, out::Ptr{DMatrixHandle})::Cint
end
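For anyone not familiar with `@ccall`, here is a small sketch of the same calling pattern that does not need XGBoost at all. It passes a `Matrix{Cfloat}` where the C function expects a `Ptr{Cfloat}`, just as the wrapper above does with `data`. It uses the C runtime's `memcpy` purely as a stand-in, and assumes that symbol can be resolved in the running process, which should normally be the case:

```julia
# Build a Float32 matrix the same way the reproducer does.
src = convert(Matrix{Cfloat}, rand(4, 5))
dst = similar(src)

# Julia converts the arrays to pointers to their first element because the
# arguments are annotated as Ptr{Cfloat}; sizeof(src) is the size in bytes.
@ccall memcpy(dst::Ptr{Cfloat}, src::Ptr{Cfloat}, sizeof(src)::Csize_t)::Ptr{Cvoid}

println(dst == src)
```

Nothing in this sketch touches the XGBoost allocator, so it is not expected to reproduce the crash; it only illustrates how the arguments cross the Julia/C boundary.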
Yes, I just tried 1.8.5 and observed the same problem: Julia crashes. And as you said, @tylerjthomas9, the problem seems to persist in the 1.9 betas.
So far I have stayed on 1.8.3 for daily use so that I can keep using XGBoost and LIBSVM, and I have not gotten a solution from https://github.com/JuliaLang/julia/issues/48187. I assume that there are many users of XGBoost.jl and LIBSVM.jl on Windows; I don’t know whether any of them have found another strategy.
Running julia under gdb from MSYS2, I ran the following Julia code:
using XGBoost
# training set of 100 datapoints of 4 features
(X, y) = (randn(100,4), randn(100))
# create and train a gradient boosted tree model of 5 trees
bst = xgboost((X, y), num_round=5, max_depth=6, objective="reg:squarederror")
I then got the following trace.
warning: Critical error detected c0000374
Thread 1 received signal SIGTRAP, Trace/breakpoint trap.
0x00007ffbbf64f633 in ntdll!RtlIsZeroMemory () from C:\windows\SYSTEM32\ntdll.dl
(gdb) bt
#0 0x00007ffbbf64f633 in ntdll!RtlIsZeroMemory () from C:\windows\SYSTEM32\ntdll.dll
#1 0x00007ffbbf6583f2 in ntdll!RtlpNtSetValueKey () from C:\windows\SYSTEM32\ntdll.dll
#2 0x00007ffbbf6586da in ntdll!RtlpNtSetValueKey () from C:\windows\SYSTEM32\ntdll.dll
#3 0x00007ffbbf65e361 in ntdll!RtlpNtSetValueKey () from C:\windows\SYSTEM32\ntdll.dll
#4 0x00007ffbbf575bf0 in ntdll!RtlGetCurrentServiceSessionId ()
from C:\windows\SYSTEM32\ntdll.dll
#5 0x00007ffbbf5747b1 in ntdll!RtlFreeHeap () from C:\windows\SYSTEM32\ntdll.dll
#6 0x00007ffbbd5b9c9c in msvcrt!free () from C:\windows\System32\msvcrt.dll
#7 0x0000000002d6a0ef in unsigned long long xgboost::SparsePage::Push<xgboost::data::DenseAdapterBatch>(xgboost::data::DenseAdapterBatch const&, float, int) ()
from C:\Users\mkitti\.julia\artifacts\a1540ff6121e48fd4712006a269d6bf6bf8216e1\bin\xgboost.dll
#8 0x0000000002dbac97 in xgboost::data::SimpleDMatrix::SimpleDMatrix<xgboost::data::DenseAdapter>(xgboost::data::DenseAdapter*, float, int) ()
from C:\Users\mkitti\.julia\artifacts\a1540ff6121e48fd4712006a269d6bf6bf8216e1\bin\xgboost.dll
#9 0x0000000002e70883 in xgboost::DMatrix* xgboost::DMatrix::Create<xgboost::data::DenseAdapter>(xgboost::data::DenseAdapter*, float, int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) ()
from C:\Users\mkitti\.julia\artifacts\a1540ff6121e48fd4712006a269d6bf6bf8216e1\bin\xgboost.dll
#10 0x0000000002b555fb in XGDMatrixCreateFromMat ()
from C:\Users\mkitti\.julia\artifacts\a1540ff6121e48fd4712006a269d6bf6bf8216e1\bin\xgboost.dll
#11 0x000002248136cd09 in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
(gdb) b unsigned long long xgboost::SparsePage::Push<xgboost::data::DenseAdapterBatch>(xgboost::data::DenseAdapterBatch const&, float, int)
(gdb) s
Single stepping until exit from function _ZN7xgboost10SparsePage4PushINS_4data17DenseAdapterBatchEEEyRKT_fi,
which has no line number information.
operator new (sz=8) at /workspace/srcdir/gcc-12.1.0/libstdc++-v3/libsupc++/new_op.cc:47
47 /workspace/srcdir/gcc-12.1.0/libstdc++-v3/libsupc++/new_op.cc: No such file or directory.
(gdb) s
50 in /workspace/srcdir/gcc-12.1.0/libstdc++-v3/libsupc++/new_op.cc
(gdb) s
58 in /workspace/srcdir/gcc-12.1.0/libstdc++-v3/libsupc++/new_op.cc
(gdb) s
0x00000000029a9f9d in unsigned long long xgboost::SparsePage::Push<xgboost::data::DenseAdapterBatch>(xgboost::data::DenseAdapterBatch const&, float, int) () from C:\Users\kittisopikulm\.julia\artifacts\a1540ff6121e48fd4712006a269d6bf6bf8216e1\bin\xgboost.dll
(gdb) s
Single stepping until exit from function _ZN7xgboost10SparsePage4PushINS_4data17DenseAdapterBatchEEEyRKT_fi,
which has no line number information.
operator delete (ptr=0x18d2233a620) at /workspace/srcdir/gcc-12.1.0/libstdc++-v3/libsupc++/del_op.cc:49
49 /workspace/srcdir/gcc-12.1.0/libstdc++-v3/libsupc++/del_op.cc: No such file or directory.
(gdb) s
0x00007ffb58ae68e0 in free () from C:\Users\mkitti\.julia\juliaup\julia-1.8.5+0.x64.w64.mingw32\bin\libstdc++-6.dll
(gdb) s
Single stepping until exit from function free,
which has no line number information.
0x00007ffbbd5b9c80 in msvcrt!free () from C:\windows\System32\msvcrt.dll
0x00007ffbbf64f633 in ntdll!RtlIsZeroMemory () from C:\windows\SYSTEM32\ntdll.dll
(gdb) bt
#0 0x00007ffbbf64f633 in ntdll!RtlIsZeroMemory () from C:\windows\SYSTEM32\ntdll.dll
#1 0x00007ffbbf6583f2 in ntdll!RtlpNtSetValueKey () from C:\windows\SYSTEM32\ntdll.dll
#2 0x00007ffbbf6586da in ntdll!RtlpNtSetValueKey () from C:\windows\SYSTEM32\ntdll.dll
#3 0x00007ffbbf65e361 in ntdll!RtlpNtSetValueKey () from C:\windows\SYSTEM32\ntdll.dll
#4 0x00007ffbbf575bf0 in ntdll!RtlGetCurrentServiceSessionId ()
from C:\windows\SYSTEM32\ntdll.dll
#5 0x00007ffbbf5747b1 in ntdll!RtlFreeHeap () from C:\windows\SYSTEM32\ntdll.dll
#6 0x00007ffbbd5b9c9c in msvcrt!free () from C:\windows\System32\msvcrt.dll
#7 0x000000000264d1f6 in xgboost::SparsePage::Push<xgboost::data::DenseAdapterBatch> (
this=this@entry=0x20d9e121ee0, batch=..., missing=<optimized out>,
missing@entry=nan(0x400000), nthread=<optimized out>)
at /workspace/srcdir/xgboost/src/data/data.cc:1074
#8 0x000000000269de8a in xgboost::data::SimpleDMatrix::SimpleDMatrix<xgboost::data::DenseAdapter>
(this=0x20d8a872100, adapter=0xb6a47fc460, missing=nan(0x400000), nthread=<optimized out>)
at /workspace/srcdir/xgboost/src/data/simple_dmatrix.cc:144
#9 0x0000000002751633 in xgboost::DMatrix::Create<xgboost::data::DenseAdapter> (
adapter=adapter@entry=0xb6a47fc460, missing=missing@entry=nan(0x400000),
nthread=nthread@entry=1) at /workspace/srcdir/xgboost/src/data/data.cc:917
#10 0x0000000002454a9d in XGDMatrixCreateFromMat (data=<optimized out>, nrow=<optimized out>,
ncol=<optimized out>, missing=nan(0x400000), out=0x20d909185d0)
at /workspace/srcdir/xgboost/src/c_api/c_api.cc:457
I’ve debugged this to the degree that I know how.
I produced a version of XGBoost_jll.jl with debugging symbols here:
Thanks @mkitti. Just so I understand: does the problem ultimately come from XGBoost.jl? I ask because it was suggested here and here that the problem was on the Julia side (and the problem is also observed with LIBSVM.jl).
Let me try to describe what I’m seeing above in plain English.
XGBoost is a C++ project that uses OpenMP for parallelization. When XGDMatrixCreateFromMat is invoked, the method unsigned long long xgboost::SparsePage::Push<xgboost::data::DenseAdapterBatch>(xgboost::data::DenseAdapterBatch const&, float, int) is eventually called. This method contains a for loop that OpenMP may run in parallel.
This parallel for loop is set up by interacting with libstdc++ and libgomp. Julia provides both of these libraries as part of its CompilerSupportLibraries_jll:
Some update to these libraries seems to be causing XGBoost to free memory that should not be freed. Recent updates bumped the underlying gcc to version 12.
Another change is the disabling of thread local storage on Windows builds:
The issue is thus not directly with Julia itself but rather with libraries that it provides.
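If you want to check which copies of these runtime libraries your own session has loaded, something like the following may help. It is only a sketch: `Libdl.dllist()` is part of Julia's standard library, but the name pattern is my own guess at what is relevant here, and the output will differ between machines and Julia versions.

```julia
using Libdl

# GCC runtime libraries discussed above; the pattern is an illustrative choice.
runtime_pattern = r"libstdc\+\+|libgomp|libgcc"i

# Libdl.dllist() returns the paths of all shared libraries loaded in this process.
loaded = filter(path -> occursin(runtime_pattern, basename(path)), Libdl.dllist())

foreach(println, loaded)
```

Running this once on 1.8.3 and once on an affected version, after `using XGBoost`, would show whether the library paths differ between the two.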