Problem reading ROOT branch containing Strings with UnROOT.jl

Hello,
I’m using Geant4 to simulate some decays, and I’m saving the results in a ROOT file which I then read with UnROOT.jl. I have the following code:

using DrWatson
quickactivate(@__DIR__)

using UnROOT

f = UnROOT.ROOTFile(datadir("sims/root/ss", "ss_U238_p=10bar_Rmin=292mm_Rmax=300mm.root"))

The file has the following structure:

[Image: tree]

I can successfully access all branches except those containing strings, such as the one called “Particle”, shown in the picture above. When I run:

UnROOT.LazyBranch(f, "G4Sim/Particle")

I get the following error:

 Failed to show value:

TaskFailedException

nested task error: DimensionMismatch: expected input array of length 16, got length 12

Stacktrace:

[1] dimension_mismatch_fail(::Type{StaticArraysCore.SVector{16, UInt8}}, a::SubArray{UInt8, 1, Vector{UInt8}, Tuple{UnitRange{Int64}}, true})

@ StaticArrays ~/.julia/packages/StaticArrays/85pEu/src/convert.jl:196

[2] convert

@ ~/.julia/packages/StaticArrays/85pEu/src/convert.jl:201 [inlined]

[3] StaticArray

@ ~/.julia/packages/StaticArrays/85pEu/src/convert.jl:174 [inlined]

[4] reinterpret(::Type{UnROOT.FixLenVector{16, UInt8}}, v::SubArray{UInt8, 1, Vector{UInt8}, Tuple{UnitRange{Int64}}, true})

@ UnROOT ~/.julia/packages/UnROOT/xaBnN/src/custom.jl:86

[5] #235

@ ./none:0 [inlined]

[6] iterate

@ ./generator.jl:47 [inlined]

[7] collect_to!(dest::Vector{UnROOT.FixLenVector{16, UInt8}}, itr::Base.Generator{Base.Iterators.PartitionIterator{Vector{UInt8}}, UnROOT.var"#235#236"{UnROOT.FixLenVector{16, UInt8}}}, offs::Int64, st::Int64)

@ Base ./array.jl:892

[8] collect_to_with_first!

@ ./array.jl:870 [inlined]

[9] collect(itr::Base.Generator{Base.Iterators.PartitionIterator{Vector{UInt8}}, UnROOT.var"#235#236"{UnROOT.FixLenVector{16, UInt8}}})

@ Base ./array.jl:844

[10] interped_data(rawdata::Vector{UInt8}, rawoffsets::Vector{Int32}, ::Type{UnROOT.FixLenVector{16, UInt8}}, ::Type{UnROOT.Nojagg})

@ UnROOT ~/.julia/packages/UnROOT/xaBnN/src/custom.jl:90

[11] basketarray(f::UnROOT.ROOTFile, branch::UnROOT.TBranch_13, ithbasket::Int64)

@ UnROOT ~/.julia/packages/UnROOT/xaBnN/src/iteration.jl:64

[12] basketarray(lb::UnROOT.LazyBranch{UnROOT.FixLenVector{16, UInt8}, UnROOT.Nojagg, Vector{UnROOT.FixLenVector{16, UInt8}}}, ithbasket::Int64)

@ UnROOT ~/.julia/packages/UnROOT/xaBnN/src/iteration.jl:135

[13] #214

@ ~/.julia/packages/UnROOT/xaBnN/src/iteration.jl:486 [inlined]

[14] iterate

@ ./generator.jl:47 [inlined]

[15] _collect(c::UnitRange{Int64}, itr::Base.Generator{UnitRange{Int64}, UnROOT.var"#214#217"{UnROOT.LazyBranch{UnROOT.FixLenVector{16, UInt8}, UnROOT.Nojagg, Vector{UnROOT.FixLenVector{16, UInt8}}}}}, ::Base.EltypeUnknown, isz::Base.HasShape{1})

@ Base ./array.jl:854

[16] collect_similar

@ ./array.jl:763 [inlined]

[17] map

@ ./abstractarray.jl:3285 [inlined]

[18] getindex(ba::UnROOT.LazyBranch{UnROOT.FixLenVector{16, UInt8}, UnROOT.Nojagg, Vector{UnROOT.FixLenVector{16, UInt8}}}, range::UnitRange{Int64})

@ UnROOT ~/.julia/packages/UnROOT/xaBnN/src/iteration.jl:486

[19] (::UnROOT.var"#244#246"{UnROOT.LazyBranch{UnROOT.FixLenVector{16, UInt8}, UnROOT.Nojagg, Vector{UnROOT.FixLenVector{16, UInt8}}}})()

@ UnROOT ~/.julia/packages/UnROOT/xaBnN/src/displays.jl:95

Stack trace

Here is what happened, the most recent locations are first:

1. wait @ task.jl:352

2. show(io::IOContext{IOBuffer}, ::MIME{Symbol("text/plain")}, br::UnROOT.LazyBranch{UnROOT.FixLenVector{16, UInt8}, UnROOT.Nojagg, Vector{UnROOT.FixLenVector{16, UInt8}}}) @ displays.jl:97

I’m on Julia 1.10.2, working in a Pluto notebook, and I’m using UnROOT v0.10.31.

Can you upload the file somewhere?

Sure! I uploaded it here, since it’s a .root file and I couldn’t attach it to this post directly (apologies if I’m missing something obvious).
I also kept digging to understand more. Here is some additional information in case it is useful:

With Python’s uproot, after reading the tree and calling

tree.typenames()

I got

{'fEvent': 'int32_t', 'ParentID': 'int32_t', 'TrackID': 'int32_t', 
'Particle': 'char*', 'MeanLife': 'double', 'Charge': 'double', 
'Process': 'char*', 'Edep': 'double', 'preTime': 'double', 'postTime': 'double', 'localTime': 'double', 'preKE': 'double', 'postKE': 'double', 
'preX': 'double', 'preY': 'double', 'preZ': 'double', 
'postX': 'double', 'postY': 'double', 'postZ': 'double', 
'preVolume': 'char*', 'postVolume': 'char*'}

Inspecting the branch in Julia, I got:

br = f["G4Sim/Process"]
println(br.fLeaves)

UnROOT.TObjArray("", 0, Any[UnROOT.TLeafC
  fName: String "Process"
  fTitle: String "Process"
  fLen: Int32 16
  fLenType: Int32 1
  fOffset: Int32 0
  fIsRange: Bool false
  fIsUnsigned: Bool false
  fLeafCount: UInt32 0x00000000
  fMinimum: Int32 0
  fMaximum: Int32 16
])

I hope this is useful. Please let me know if more information is needed.

I also tried to convert my TTree into an RNTuple using PyROOT, but as I don’t normally use PyROOT, I failed miserably.

Edit: I’m also including the Geant4 code I wrote to save the data:

    G4AnalysisManager* analysisManager2 = G4AnalysisManager::Instance();

    analysisManager2->FillNtupleIColumn(0, eventNumber);
    analysisManager2->FillNtupleIColumn(1, parentID);
    analysisManager2->FillNtupleIColumn(2, trackID);
    analysisManager2->FillNtupleSColumn(3, name);
    analysisManager2->FillNtupleDColumn(4, meanLife);
    analysisManager2->FillNtupleDColumn(5, fCharge);
    analysisManager2->FillNtupleSColumn(6, proc);
    analysisManager2->FillNtupleDColumn(7, edepStep);
    analysisManager2->FillNtupleDColumn(8, preTime);
    analysisManager2->FillNtupleDColumn(9, postTime);
    analysisManager2->FillNtupleDColumn(10, localTime);
    analysisManager2->FillNtupleDColumn(11, preKE);
    analysisManager2->FillNtupleDColumn(12, postKE);
    analysisManager2->FillNtupleDColumn(13, prePoint_x);
    analysisManager2->FillNtupleDColumn(14, prePoint_y);
    analysisManager2->FillNtupleDColumn(15, prePoint_z);
    analysisManager2->FillNtupleDColumn(16, postPoint_x);
    analysisManager2->FillNtupleDColumn(17, postPoint_y);
    analysisManager2->FillNtupleDColumn(18, postPoint_z);
    analysisManager2->FillNtupleSColumn(19, preVolume);
    analysisManager2->FillNtupleSColumn(20, postVolume);
    analysisManager2->AddNtupleRow(0);


If I manually get the rawdata and rawoffsets out, I see:

julia> String.(VectorOfVectors(debug[].rawdata, debug[].rawoffsets .+ 1))
846-element Vector{String}:
 "\x0fRadioactivation"
 "\aionIoni"
 "\aionIoni"
 "\x0fRadioactivation"
 "\x05eIoni"
 "\x0eTransportation"
 "\x0eTransportation"
 "\x0fRadioactivation"
 "\x05eIoni"
 "\x0fRadioactivation"
 "\x05eIoni"
 "\aionIoni"
 "\x0fRadioactivation"
 "\x05eIoni"
 "\x05eIoni"
 "\x05eIoni"
 "\x05eIoni"
 "\x05eIoni"
 "\x05eIoni"

This seems mostly reasonable: each entry is a readable string preceded by a single byte.
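In the output above, the leading byte of each entry matches the length of the text that follows it (`\x0f` = 15 for "Radioactivation", `\a` = 7 for "ionIoni", `\x05` = 5 for "eIoni"). Below is a minimal sketch of decoding such a buffer in plain Julia. The helper `decode_prefixed_strings` and the toy buffer are hypothetical and not part of UnROOT; the sketch assumes a one-byte length prefix and 0-based offsets, as the printed data suggests.

```julia
# Hypothetical helper (not UnROOT API): split a byte buffer into strings using
# 0-based entry offsets, dropping the assumed one-byte length prefix.
function decode_prefixed_strings(rawdata::Vector{UInt8}, offsets::Vector{<:Integer})
    strings = String[]
    for i in 1:length(offsets)-1
        start = offsets[i] + 1      # 0-based offset -> 1-based index of the prefix byte
        stop = offsets[i+1]         # last byte of this entry (1-based, inclusive)
        n = Int(rawdata[start])     # prefix byte: number of characters that follow
        @assert n == stop - start "length prefix disagrees with offsets"
        push!(strings, String(rawdata[start+1:stop]))
    end
    return strings
end

# Toy buffer mimicking a few of the entries printed above.
entries = ["Radioactivation", "ionIoni", "eIoni"]
rawdata = UInt8[]
offsets = Int32[0]
for s in entries
    push!(rawdata, UInt8(ncodeunits(s)))   # one-byte length prefix
    append!(rawdata, codeunits(s))         # string bytes
    push!(offsets, Int32(length(rawdata))) # 0-based start of the next entry
end

decoded = decode_prefixed_strings(rawdata, offsets)
```

Here `decoded` contains the three strings without their prefix bytes. The entries occupy 16, 8, and 6 bytes respectively, so they are not all the same size.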

The bug seems to be that we’re interpreting this branch as FixLenVector instead of the normal offset jagged vector.

This is the root cause of the bug: in this case, even though leafLen is > 1, somehow the data is not really fixed length?

According to ROOT, this should indeed be fixed length.
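The stack trace shows a `PartitionIterator` feeding 16-byte chunks into `FixLenVector{16, UInt8}`. The sketch below uses plain Julia on a toy buffer (it is not UnROOT's actual code) to show what happens when variable-length, length-prefixed entries are cut into fixed 16-byte chunks: the last chunk comes up short, which is consistent with the "expected input array of length 16, got length 12" error.

```julia
# Toy buffer: three length-prefixed entries of 16, 8, and 6 bytes (30 bytes total).
buf = UInt8[]
for s in ["Radioactivation", "ionIoni", "eIoni"]
    push!(buf, UInt8(ncodeunits(s)))   # assumed one-byte length prefix
    append!(buf, codeunits(s))         # string bytes
end

# Fixed-length reading: cut the buffer into 16-byte chunks regardless of entry boundaries.
chunks = collect(Iterators.partition(buf, 16))
chunk_lengths = length.(chunks)        # [16, 14]: the trailing chunk is not 16 bytes

# Only full chunks could be converted to a 16-element static vector.
all_full = all(==(16), chunk_lengths)  # false
```

Apart from the short trailing chunk, the full chunks would also straddle entry boundaries whenever an entry is shorter than 16 bytes, so a fixed-length reading would return wrong strings even when it does not throw.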

I have now opened a PR that fixes this. Fundamentally, we treat this as Vector{Char}, since that’s what TLeafC seems to be. Let’s continue the discussion in the PR.

Thank you!

That’s indeed a bummer. Thanks Jerry for the quick PR, looks good so far :slight_smile: