I’m a Master’s student in Statistics. I’ve recently been researching ‘safe screening rules’ for accelerating Lasso solvers. These can improve computation times by orders of magnitude when the number of features is large.
I’d like to write a Julia package implementing these methods, giving a faster Lasso solver than the standard co-ordinate descent algorithm. The reason for my post is that, while I’m a fairly experienced programmer and I (think I) know how to write good, maintainable code, I have never contributed to an open-source project before. I’m aware that writing code for use by others comes with a lot of baggage that writing code for oneself does not, so I’m looking for advice, be it general tips on open-source development or Julia-specific advice.
Some specific questions I have in mind (apologies if any of these are overly silly!):
There’s a Lasso.jl package already, implementing the standard co-ordinate descent solver. Would it be best for me to implement my algorithm within this package - modelling the code style etc on this package - and submit a pull request? Or develop my own package, ‘FastLasso.jl’ or something?
In the former case, how do I convince the maintainers of the package that my code and the mathematics are sound? I could definitely write up a mathematical report of sorts - would that be the expected approach? And I’m guessing they’d want to see my solver and the existing solver applied to the same regression problem, so that the two solutions can be verified to be the same (and computation times compared)?
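To make that last point concrete, here is a rough sketch of the kind of check I have in mind. It is in Python rather than Julia, purely for illustration, and it does not use Lasso.jl's API. It assumes the objective 0.5*||y - X*b||^2 + lam*||b||_1, uses the basic SAFE rule of El Ghaoui et al. as a stand-in for whichever screening rule I end up implementing, and all function names (`lasso_cd`, `safe_screen`, etc.) are hypothetical.

```python
import math
import random


def dot(a, b):
    return sum(u * v for u, v in zip(a, b))


def soft_threshold(z, t):
    # Proximal operator of t * |.|
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_cd(cols, y, lam, active=None, sweeps=300):
    """Plain cyclic coordinate descent for 0.5*||y - X b||^2 + lam*||b||_1.

    `cols` is X stored as a list of columns. Only the indices in `active`
    are updated; all other coefficients stay at zero.
    """
    p = len(cols)
    active = list(range(p)) if active is None else list(active)
    beta = [0.0] * p
    resid = list(y)  # residual y - X beta, with beta = 0 initially
    norms2 = [dot(c, c) for c in cols]
    for _ in range(sweeps):
        for j in active:
            if norms2[j] == 0.0:
                continue
            # Correlation of column j with the partial residual
            rho = dot(cols[j], resid) + norms2[j] * beta[j]
            new = soft_threshold(rho, lam) / norms2[j]
            delta = new - beta[j]
            if delta != 0.0:
                resid = [r - delta * x for r, x in zip(resid, cols[j])]
                beta[j] = new
    return beta


def safe_screen(cols, y, lam):
    """Basic SAFE rule (assumed here as the example screening rule).

    Feature j is discarded when
        |x_j' y| < lam - ||x_j|| * ||y|| * (lam_max - lam) / lam_max,
    where lam_max = max_j |x_j' y|. Returns the indices that are kept.
    """
    corr = [abs(dot(c, y)) for c in cols]
    lam_max = max(corr)
    slack = math.sqrt(dot(y, y)) * (lam_max - lam) / lam_max
    return [j for j, c in enumerate(cols)
            if corr[j] >= lam - math.sqrt(dot(c, c)) * slack]


# Small synthetic problem: two true signals plus noise (made-up data).
rng = random.Random(0)
n, p = 30, 40
cols = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(p)]
y = [2.0 * cols[0][i] - 1.5 * cols[1][i] + 0.1 * rng.gauss(0, 1)
     for i in range(n)]
lam = 0.8 * max(abs(dot(c, y)) for c in cols)

keep = safe_screen(cols, y, lam)
beta_full = lasso_cd(cols, y, lam)                   # reference solver
beta_screened = lasso_cd(cols, y, lam, active=keep)  # solver after screening
max_diff = max(abs(a - b) for a, b in zip(beta_full, beta_screened))
print(len(keep), "of", p, "features kept; max coefficient difference:", max_diff)
```

Presumably the real version would run both solvers over a grid of lam values and a few datasets, compare coefficients up to a tolerance, and report timings alongside.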