When using BSON, I am getting a seg fault when trying to load a saved model. Details below.
The model is originally saved with:
bson(savedir * "policy.bson", policy = final_policy)
Then, in a different module, I am trying to load the model with:
# I had to use @__MODULE__ here to get it to work properly,
# I guess because it comes from a different module
dict = BSON.load(deepCorrectionResults, @__MODULE__)
policy = dict[:policy]
This actually works. In fact, I can view the contents of the policy and even print the variables. However, a bit later, I get a seg fault. Eventually, I tracked it down to the above code. If I remove the above code, I do not get a seg fault.
One more interesting detail: there is a very large array inside the policy. It is approximately 360,000 x 6 Float64s, which is about 18 MB. If I remove this array, the seg fault goes away. Unfortunately, I need the array, so removing it is not a solution.
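For reference, the size estimate can be checked with a quick calculation. This is only a sketch using the nominal dimensions quoted above; the variable names are made up for illustration, and the real array may be slightly larger or smaller.

```julia
# Nominal dimensions from the description above; the real array is only
# "approximately" this size.
rows, cols = 360_000, 6

# Each Float64 occupies 8 bytes.
nbytes = rows * cols * sizeof(Float64)

println("bytes: ", nbytes)
println("MB (10^6 bytes): ", nbytes / 10^6)
println("MiB (2^20 bytes): ", nbytes / 2^20)
```

With exactly 360,000 rows this comes to 17,280,000 bytes, so the rough figure above is in the right range.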
Seg fault:
signal (11): Segmentation fault
in expression starting at none:0
unknown function (ip: 0x7fe7ec90dd82)
ios_write at /buildworker/worker/package_linux64/build/src/support/ios.c:432
jl_serialize_value_ at /buildworker/worker/package_linux64/build/src/dump.c:729
jl_serialize_value_ at /buildworker/worker/package_linux64/build/src/serialize.h:101 [inlined]
jl_serialize_value_ at /buildworker/worker/package_linux64/build/src/dump.c:798
jl_serialize_value_ at /buildworker/worker/package_linux64/build/src/serialize.h:101 [inlined]
jl_serialize_value_ at /buildworker/worker/package_linux64/build/src/dump.c:798
jl_serialize_value_ at /buildworker/worker/package_linux64/build/src/serialize.h:101 [inlined]
jl_serialize_value_ at /buildworker/worker/package_linux64/build/src/dump.c:798
jl_serialize_value_ at /buildworker/worker/package_linux64/build/src/dump.c:507
jl_serialize_value_ at /buildworker/worker/package_linux64/build/src/serialize.h:101 [inlined]
jl_serialize_value_ at /buildworker/worker/package_linux64/build/src/dump.c:798
jl_serialize_value_ at /buildworker/worker/package_linux64/build/src/dump.c:337 [inlined]
jl_serialize_module at /buildworker/worker/package_linux64/build/src/dump.c:361 [inlined]
jl_serialize_value_ at /buildworker/worker/package_linux64/build/src/dump.c:675
jl_serialize_value_ at /buildworker/worker/package_linux64/build/src/dump.c:507
jl_serialize_value_ at /buildworker/worker/package_linux64/build/src/dump.c:386 [inlined]
jl_save_incremental at /buildworker/worker/package_linux64/build/src/dump.c:2183
jl_write_compiler_output at /buildworker/worker/package_linux64/build/src/precompile.c:65
jl_atexit_hook at /buildworker/worker/package_linux64/build/src/init.c:211
repl_entrypoint at /buildworker/worker/package_linux64/build/src/jlapi.c:703
main at /buildworker/worker/package_linux64/build/cli/loader_exe.c:51
__libc_start_main at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
_start at /home/tyleri/julia-1.6.2/bin/julia (unknown line)
Allocations: 50990915 (Pool: 50977170; Big: 13745); GC: 32
ERROR: LoadError: Failed to precompile <module> [top-level] to /home/tyleri/.julia/compiled/v1.6/jl_iVtzBO.
Stacktrace:
[1] error(s::String)
@ Base ./error.jl:33
[2] compilecache(pkg::Base.PkgId, path::String, internal_stderr::Base.TTY, internal_stdout::Base.TTY, ignore_loaded_modules::Bool)
@ Base ./loading.jl:1385
[3] compilecache(pkg::Base.PkgId, path::String)
@ Base ./loading.jl:1329
[4] _require(pkg::Base.PkgId)
@ Base ./loading.jl:1043
[5] require(uuidkey::Base.PkgId)
@ Base ./loading.jl:936
[6] require(into::Module, mod::Symbol)
@ Base ./loading.jl:923
[7] include
@ ./Base.jl:386 [inlined]
[8] include_package_for_output(pkg::Base.PkgId, input::String, depot_path::Vector{String}, dl_load_path::Vector{String}, load_path::Vector{String}, concrete_deps::Vector{Pair{Base.PkgId, UInt64}}, source::String)
@ Base ./loading.jl:1235
[9] top-level scope
@ none:1
[10] eval
@ ./boot.jl:360 [inlined]
[11] eval(x::Expr)
@ Base.MainInclude ./client.jl:446
[12] top-level scope
@ none:1
So, my question is: has anyone experienced this before, or does anyone know the solution? The issue seems to be tied to the BSON load, but the seg fault is delayed for some time. Is there a data limit for BSON? The code I am working with is not the cleanest, so it's possible the same code is run multiple times, which could mean there are 2 copies of this array in memory at once, but there should be no more than 2, since it is overwritten each time.
I will work on creating a minimal working example over the next few days, but hopefully someone has ideas to try based on the current information.
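A starting point for that example might look like the sketch below. It assumes the BSON package is installed; `ToyPolicy` and the temporary path are hypothetical stand-ins for the real policy type and `savedir`, and the random matrix only mimics the size of the real array.

```julia
using BSON

# ToyPolicy is a hypothetical stand-in for the real policy type.
struct ToyPolicy
    table::Matrix{Float64}
end

# Save step, mirroring the bson(...) call above. A temporary directory is
# used here instead of the real savedir.
path = joinpath(mktempdir(), "policy.bson")
bson(path, policy = ToyPolicy(rand(360_000, 6)))

# Load step, mirroring the BSON.load(..., @__MODULE__) call above.
dict = BSON.load(path, @__MODULE__)
policy = dict[:policy]

println(size(policy.table))
```

Run as a plain script, this only exercises the save/load round trip. Since the trace passes through jl_write_compiler_output and jl_save_incremental, my guess is that the crash needs the load to happen while a package module is being precompiled, so the next step would be to move the load into such a module and see whether the seg fault reappears.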