Hello
I’m trying to parse a custom text file. CSV.File itself works well, but because the file contains a lot of extraneous text, I first have to read it into memory, figure out which rows contain the data I need, and then pass the file to CSV.File to import. That works, but it means reading the file twice: once to find the custom tags that mark where the data is, and once more through CSV.File.
I’m wondering whether it’s possible to pass the strings from that first read straight into CSV.File so it can process the data the rest of the way.
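Something like this is what I’m hoping is possible, assuming CSV.File can take an in-memory buffer in place of a file name (the file name, tag strings, and keyword arguments below are just guesses based on the sample further down):

```julia
using CSV, DataFrames

# First pass: read everything into memory and find the block I care about
lines = readlines("data.txt")

i_name  = findfirst(==("TABLEX"), lines)              # the table I want
i_start = findnext(==("TABLESTART:"), lines, i_name)  # start tag after it
i_end   = findnext(==("TABLEEND:"), lines, i_start)   # matching end tag

# Join only the relevant lines and hand them to CSV.File as an in-memory
# source instead of re-reading the file from disk
block = join(lines[i_start+1:i_end-1], '\n')
df = CSV.File(IOBuffer(block); delim = '\t', header = 1, skipto = 3) |> DataFrame
```

My worry is that the `join` plus `IOBuffer` step still copies the whole block; that may not matter next to the parsing time, but I’m not sure.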
By the way, this table has many hundreds of millions of rows. I tried to write a custom parser, but I can’t get anywhere close to the performance of CSV.File, which is why I’m going this route; if there is another way to process this, suggestions would be appreciated.
I only want the data from TABLEX in this case, not TABLEY.
The data looks something like this:
```
STARTHEADER:
x
t
g
d
s
TABLEX
TABLESTART:
a\tb\tc
integer\tinteger\tinteger\tstring
1\t2\t3\ta
4\t5\t6\tx
TABLEEND:
x
c
v
f
g
s
TABLEY
TABLESTART:
z\ty\tx
string\tstring\integer
a\tb\t1
x\ty\t2
TABLEEND_X:
```
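Given the size, I’d also like to avoid materialising every line as a String, so the other idea I had was to memory-map the file and hand CSV.File just the byte range between the tags (again, the file name and exact tag strings are placeholders, and I’m assuming CSV.File accepts a byte vector or a view into one):

```julia
using CSV, DataFrames, Mmap

# Map the file into memory: scanning for the tags then touches the bytes once,
# without building a String per line
bytes = Mmap.mmap("data.txt")                          # Vector{UInt8} backed by the file

# Locate the TABLEX block by searching for the raw tag bytes
r_name  = findfirst(codeunits("TABLEX"), bytes)
r_start = findnext(codeunits("TABLESTART:"), bytes, last(r_name))
r_end   = findnext(codeunits("TABLEEND"), bytes, last(r_start))

# Parse only the bytes between the tags, in place
block = view(bytes, last(r_start)+2:first(r_end)-1)    # +2 skips the newline after the tag
df = CSV.File(block; delim = '\t', header = 1, skipto = 3) |> DataFrame
```

Is something along those lines supported, or is there a better way to tell CSV.File to parse only a slice of a larger input?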