Reading tfrecord files

I want to play a bit with the YouTube-8M dataset. Has anybody already worked on reading tfrecord files?

I tried

julia> using TensorFlow
julia> it = TensorFlow.io.TFRecord.RecordIterator("train0111.tfrecord")
julia> first(it)
351069-element Array{UInt8,1}:
 0x0a
 0x23
 0x0a
 0x0e
 0x0a
 0x02
 ...

which I guess I still have to parse with the right protobuf definition. Are there already some tools or generic approaches available for this?
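For orientation, here is a minimal sketch of how the leading bytes in the dump above read as protobuf wire format. It assumes the record is a serialized protobuf message (TFRecord files normally hold tf.train.Example or tf.train.SequenceExample messages). `decode_tag` is a hypothetical helper written for this illustration and is not part of TensorFlow.jl.

```julia
# Split a protobuf tag byte into (field number, wire type).
# Hypothetical helper; only valid for single-byte tags, i.e. field numbers 1 to 15.
decode_tag(b::UInt8) = (field = Int(b >> 3), wire_type = Int(b & 0x07))

# The first six bytes of the record printed above.
bytes = UInt8[0x0a, 0x23, 0x0a, 0x0e, 0x0a, 0x02]

# 0x0a decodes to field 1 with wire type 2 (length-delimited).
tag = decode_tag(bytes[1])

# For a length-delimited field the next varint is the payload length.
# 0x23 is below 0x80, so it is a complete single-byte varint.
len = Int(bytes[2])

println(tag, " payload length = ", len)
```

The same tag/length pattern repeats for the nested bytes (0x0a 0x0e, then 0x0a 0x02), which is what you would expect from nested messages. Decoding the values properly still needs the matching message definition.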

I made a package for this: https://github.com/JuliaReinforcementLearning/TFRecord.jl
It turned out to be much easier than I thought. I really wish I had made it a year ago! :sweat_smile:
