First of all, thanks to everyone who replied to my initial post. The first version of JSONLines.jl is now in the official repository. Since registering it, I have been working on a lazy reader for, e.g., larger-than-memory files (check out the dev docs), which will be the main addition in the 1.1.0 release.
Let me know what you think!
Cheers,
Daniel
I have just released JSONLines 1.2.0 and it’s all about StructTypes. `readlazy` can now return an iterator over rows that are not parsed during iteration (set `returnparsed = false`). This is useful a) if you want to parse rows with a different JSON parser and b) with the new function `select(jsonlines, vars...)`, which reads only the specified variables and returns a vector of `NamedTuple`s. This is done using `StructType`s in the background, which specify the variables to be parsed. Similarly, `readfile` accepts a `StructType` via the `structtype` kwarg. For convenience, you can generate a Mutable `StructType` using the `@MStructType StructName var1 var2 ...` macro and then call `readfile("file.jsonl", structtype=StructName)` to read `var1`, `var2`, ... from the file.
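To illustrate what iterating over unparsed rows means, here is a minimal sketch in Base Julia only. It is not how JSONLines.jl is implemented and it does not use the package; the file contents and the `rawrows` helper are made up for this example. Each row is handed over as a raw string, so the caller is free to pass it to whichever JSON parser they prefer.

```julia
# Sketch in Base Julia only; this is NOT the JSONLines.jl implementation.

# Write a small, made-up JSON Lines file to a temporary location.
path = tempname() * ".jsonl"
write(path, """
{"name": "Ada", "age": 36}
{"name": "Grace", "age": 45}
""")

# Hypothetical helper: lazily yield each non-empty line as a raw String.
# Nothing is parsed here; lines are read one at a time as they are requested.
rawrows(path) = Iterators.filter(!isempty, eachline(path))

# Each `row` is an unparsed String that could be given to any JSON parser.
for row in rawrows(path)
    println(row)
end

# Counting rows walks the file once without keeping the rows in memory.
nrows = count(_ -> true, rawrows(path))
```

Because `eachline` reads one line at a time, this pattern also works for files that do not fit in memory, which is the same motivation as for the lazy reader.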
Preview of 1.3.0:
In the dev docs you can find a new function, `readarrays(file)`, which reads JSONLines files that have JSON arrays as rows. I plan to merge this into the `readfile` function with an argument specifying the row type, but I need to clean up the code a bit first.