First of all, I hope package requests are appropriate here. I will try my best to make this request as constructive as can be. If requests are not appropriate, I apologize, and I won’t take it personally if the discussion gets closed.
I think it would be very beneficial to have a Julia package that reads and writes files in the STAR format. One application of this format is storing image metadata when processing cryogenic electron microscopy data, where it’s common to have STAR files with hundreds of thousands of lines, sometimes millions. In this context, these files contain a lot of information, only a small subset of which is exposed to users of the software that generates them; many interesting things can be done with programmatic access to these files (statistics, visualizations, etc.). Since these files store tabular data, it would make sense for the parser to produce output that can be readily turned into a DataFrame (from DataFrames.jl), as CSV.jl does.
There is a Python package that reads STAR files into pandas dataframes and writes such dataframes back out to STAR files: starfile. So, I anticipate people will recommend using this package through PyCall.jl. This is probably possible (I have not tried), but it would be far from ideal, since it entails managing a Python installation. The good part, though, is that this package’s license (3-clause BSD) allows drawing inspiration from it as much as needed.
I would try to build a Julia package to read/write STAR files myself, but I am much too ignorant about too many things for this to be a tractable project: I don’t know enough computer science to know how to implement a parser, enough Python to understand how the starfile package works, nor enough Julia to implement all this. So, if anybody would like to take on this project, I will be very happy to help: I can help design an API; I can provide STAR files for testing purposes, including large ones (a couple hundred MB) to test for performance; maybe I can even do some coding if you can walk me through the logic of the implementation like I’m 5 and give me pointers (happy to read documentation any time, if it helps accomplish this).
Thank you in advance!