Dear all,
this problem might be a bit too general, but I hope you can point me into the right direction to learn what’s going on behind the scenes and how to get out the best performance.
I am still a Julia newbie and did almost everything science related with Python/Cython/Numpy/Numba in the past years. I decided to switch, for obvious reasons.
So, I have a large dataset stored in an HDF5 file and another one which holds the information how the data needs to be split.
The main data is basically a stream of hits in a neutrino detector and the second dataset stores the indices and lengths of the events (aka time slices). To read an event or a bunch of events, I need to look up the hit indices and pull out the corresponding hits, while all hits of an event are stored continuously.
I usually have to deal with something like 900mio hits and about 25000 events per file, roughly.
The final goal is to create a high level access to the data, so that analysis and reconstructions are doable event-by-event with a vector of Hits
.
My very first baby step was defining a struct for the hits:
immutable Hit
channel_id
time
tot
end
And with kind help from the gitter channel (thanks again!), I got this function to be able to unpack the values of the matrices within a list comprehension:
rows(x) = (x[i, :] for i in indices(x,1))
And this is my attempt to create an array of Hit-vectors from a file:
function read_timeslices(filename, start, stop)
f = h5open(filename, "r")
ts_info = f["timeslice_info"][:, start:stop]' # index and length of the time slices
first = ts_info[1, 1] + 1 # index of first hit
last = ts_info[end, 1] + ts_info[end, 2] # index of last hit
data = f["timeslice_hits"][1:3, first:last]' # raw hit data as matrix
# this line creates 18mio allocations for 70000 hits
hits = [Hit(r...) for r in rows(data)] # create Hit vector
# slice hits into events,
[hits[x+1-first+1:x+y - first+1] for (x, y) in rows(ts_info)]
end
The function is of course totally slow. It takes a few seconds to read out a handful events.
The crucial line is the one with the Hit-vector creation:
hits = [Hit(r...) for r in rows(data)] # create Hit vector
Without that line and the following one, the readout is super fast (100seconds for 25000 events, which is the same I get in C++).
Now I am of course aware of the fact that the lines does a copy by value and need to allocate each hit. But why are there 18mio allocations?
I tried to dig further and looked at the @code_warntype
output, where I saw that many things are Any
and I don’t understand why Julia does not infer the types correctly. See the output at the end.
When I read in the datasets manually, I get Array{Int32, 1}
etc. but inside the function it seems to do no inference. So my guess is also that the hit-array is of course not an efficient Array{Hit,1}
.
I would really appreciate if anyone could give a hint where to start to solve this. Thanks in advance!
Variables:
#self#::#read_timeslices
filename::String
start::Int64
stop::Int64
f::HDF5.HDF5File
ts_info::Any
first::Any
last::Any
data::Any
hits::Any
#83::##83#84{first,hits}
#temp#@_12::LambdaInfo
#temp#@_13::Any
#temp#@_14::LambdaInfo
#1::##1#2{x}
Body:
begin
f::HDF5.HDF5File = $(Expr(:invoke, LambdaInfo for h5open(::String, ::String), :(Main.h5open), :(filename), "r")) # line 3:
SSAValue(0) = $(Expr(:invoke, LambdaInfo for o_open(::HDF5.HDF5File, ::String), :(HDF5.o_open), :(f), "timeslice_info"))
unless (Core.isa)(SSAValue(0),HDF5.HDF5Dataset)::Any goto 7
#temp#@_12::LambdaInfo = LambdaInfo for getindex(::HDF5.HDF5Dataset, ::Colon, ::UnitRange{Int64})
goto 12
7:
goto 9
9:
#temp#@_13::Any = (Main.getindex)(SSAValue(0),Main.:,$(Expr(:new, UnitRange{Int64}, :(start), :((Base.select_value)((Base.sle_int)(start,stop)::Bool,stop,(Base.box)(Int64,(Base.sub_int)(start,1)))::Int64))))::Any
goto 14
12:
#temp#@_13::Any = $(Expr(:invoke, :(#temp#@_12), :(Main.getindex), SSAValue(0), :(Main.:), :($(Expr(:new, UnitRange{Int64}, :(start), :((Base.select_value)((Base.sle_int)(start,stop)::Bool,stop,(Base.box)(Int64,(Base.sub_int)(start,1)))::Int64))))))
14:
SSAValue(1) = #temp#@_13::Any
ts_info::Any = (Main.ctranspose)(SSAValue(1))::Any # line 5:
first::Any = ((Main.getindex)(ts_info::Any,1,1)::Any + 1)::Any # line 7:
last::Any = ((Main.getindex)(ts_info::Any,(Base.size)(ts_info::Any,1)::Any,1)::Any + (Main.getindex)(ts_info::Any,(Base.size)(ts_info::Any,1)::Any,2)::Any)::Any # line 9:
SSAValue(2) = $(Expr(:invoke, LambdaInfo for o_open(::HDF5.HDF5File, ::String), :(HDF5.o_open), :(f), "timeslice_hits"))
SSAValue(3) = (Main.getindex)(SSAValue(2),$(Expr(:new, UnitRange{Int64}, 1, :((Base.select_value)((Base.sle_int)(1,3)::Bool,3,(Base.box)(Int64,(Base.sub_int)(1,1)))::Int64))),(Main.colon)(first::Any,last::Any)::Any)::Any
data::Any = (Main.ctranspose)(SSAValue(3))::Any # line 10:
hits::Any = (Main.parse_hits)(data::Any)::Any # line 11:
#83::##83#84{first,hits} = $(Expr(:new, :((Core.apply_type)(Main.##83#84,(Core.typeof)(first)::DataType,(Core.typeof)(hits)::DataType)::Type{_<:##83#84}), :(first), :(hits)))
SSAValue(4) = #83::##83#84{first,hits}
# meta: location In[4] rows 6
#1::##1#2{x} = $(Expr(:new, :((Core.apply_type)(Main.##1#2,(Core.typeof)(ts_info)::DataType)::Type{_<:##1#2}), :(ts_info)))
SSAValue(7) = #1::##1#2{x}
SSAValue(8) = (Main.indices)(ts_info::Any,1)::Any
# meta: pop location
SSAValue(5) = (Base.Generator)(SSAValue(7),SSAValue(8))::Base.Generator{I,F}
SSAValue(6) = (Base.Generator)(SSAValue(4),SSAValue(5))::Base.Generator{I,F}
return (Base.collect)(SSAValue(6))::Any
end::Any