Tables.jl, UpROOT.jl and the speed

EDITED: the post below shows the problem more clearly.

it takes 0.16 s to read 4 branches (columns)

and 2400 s when 16 branches (columns) are read
image

Can anyone guess the origin of this behavior?
I should be able to provide the ROOT file for testing.

The file has 300_000 rows and a size of 10 MB.

Some more details are below.

Particle is a simple mutable struct; the process function creates a named tuple of four particles.

using UpROOT
using Parameters
# Four-momentum of a particle; @with_kw (from Parameters) adds a keyword constructor
@with_kw mutable struct Particle
    E::Float64
    px::Float64
    py::Float64
    pz::Float64
end
# Build a named tuple of four Particles (beam, recoil, π, η) from one row of the tree
process(row) = 
   (pb=Particle(
        px = row.pBeamX,
        py = row.pBeamY,
        pz = row.pBeamZ,
         E = row.pBeamE),
    pr = Particle(
        px = row.pRecoilX,
        py = row.pRecoilY,
        pz = row.pRecoilZ,
         E = row.pRecoilE),
    pπ = Particle(
        px = row.pPimX,
        py = row.pPimY,
        pz = row.pPimZ,
         E = row.pPimE),
    pη = Particle(
        px = row.pEtaX,
        py = row.pEtaY,
        pz = row.pEtaZ,
         E = row.pEtaE))

Ping @oschulz. Any ideas?
Thanks

Just found a typo in the second execution:
it should be process.(tr) instead of process.(t). Then it takes 1 s.

What remains unclear is why processing the tree directly is so much slower. It contains
only the branches that I use.

image

Well, here is the shortest representation of the problem:
image
A few orders of magnitude difference. Interesting.

Conversion of the TTree (on disk) to a Table (in memory) is done when getindex is called.
It seems that the conversion happens for every index during broadcast.
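
To illustrate the suspected effect without the ROOT file, here is a minimal toy sketch. It does not use UpROOT at all: LazyTree, load and the toy process are hypothetical stand-ins, and the assumption (not verified against UpROOT internals) is that every single-index getindex triggers a full conversion, while a range index converts once. The counter shows how many times the "file" would be read in each pattern.

```julia
# Hypothetical stand-in for an on-disk tree (NOT an UpROOT type):
# every getindex call pays the full cost of converting to memory.
mutable struct LazyTree
    nrows::Int
    nloads::Int   # counts how often the "file" is read
end
LazyTree(nrows) = LazyTree(nrows, 0)

# Simulated full read of the file into in-memory columns.
function load(t::LazyTree)
    t.nloads += 1
    x = collect(1.0:t.nrows)
    return (x = x, y = 2 .* x)
end

Base.length(t::LazyTree) = t.nrows

# Single-row access: re-reads everything, then picks one row.
function Base.getindex(t::LazyTree, i::Int)
    cols = load(t)
    return (x = cols.x[i], y = cols.y[i])
end

# Range access: reads once and returns an in-memory vector of rows.
function Base.getindex(t::LazyTree, r::AbstractUnitRange)
    cols = load(t)
    return [(x = cols.x[i], y = cols.y[i]) for i in r]
end

# Toy per-row function, standing in for the real process(row).
process(row) = row.x + row.y

t = LazyTree(1_000)

# Slow pattern: one full load per row.
slow = [process(t[i]) for i in 1:length(t)]
loads_slow = t.nloads        # 1000

# Fast pattern: materialize once, then broadcast over memory.
t.nloads = 0
tr = t[1:length(t)]
fast = process.(tr)
loads_fast = t.nloads        # 1
```

If the real tree behaves like this toy, the cost of working on it directly grows with rows times the cost of one conversion, which would be consistent with the difference of a few orders of magnitude seen above.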

Probably best if we discuss this on your GitHub issue (https://github.com/JuliaHEP/UpROOT.jl/issues/12), in case we need to make changes to UpROOT.jl
