For the past (I honestly don’t know how many) months, I’ve been writing and rewriting my fork of PopGen.jl (found here). The vast majority of the work went into establishing a data structure and a large set of internal functions for the common operations that most kinds of analyses need (e.g. allele frequency calculations).
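
To illustrate the kind of helper meant here, below is a minimal sketch of an allele frequency calculation for a single locus. It is not PopGen.jl's actual code: the tuple-of-integers genotype encoding, the `Genotype` alias, and the function name `allele_freq` are all assumptions made for the example.

```julia
# Hypothetical encoding: a diploid genotype is a tuple of two integer alleles,
# and an ungenotyped sample is `missing`. PopGen.jl may store genotypes differently.
const Genotype = NTuple{2,Int}

# Count every allele copy at one locus, skipping missing genotypes,
# and return a Dict mapping allele => frequency.
function allele_freq(genos::AbstractVector{<:Union{Missing,Genotype}})
    counts = Dict{Int,Int}()
    for g in skipmissing(genos)
        for a in g
            counts[a] = get(counts, a, 0) + 1
        end
    end
    total = sum(values(counts))
    return Dict{Int,Float64}(a => c / total for (a, c) in counts)
end

# Five samples at one locus, one of them missing.
locus = [(1, 1), (1, 2), missing, (2, 2), (1, 2)]
freqs = allele_freq(locus)
# Alleles 1 and 2 each appear in 4 of the 8 observed copies, so both are 0.5.
```

Dividing by the number of observed allele copies, rather than by twice the number of samples, keeps missing genotypes from deflating the frequencies.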
Working on this package started as a passion project to improve my Julia chops along with my understanding of the mathematical principles of population genetics, which I admit I’m not great at, so I’ve recruited Jason Selwyn to help (great at math, not great at Julia). I recently finished porting basic.stats from the hierfstat package in R, and have been writing a series of helper functions to perform permutation tests for these F-statistics.
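
For readers unfamiliar with the idea, a label-shuffling permutation test can be sketched roughly as follows. This is a generic illustration, not the helper functions in the fork: `permutation_pvalue` and `mean_diff` are hypothetical names, and a simple difference in group means stands in for an F-statistic.

```julia
using Random
using Statistics

# Generic permutation test: shuffle the group labels, recompute the statistic,
# and report the share of permutations at least as extreme as the observed value.
# `stat` is any function of (values, labels) that returns a number.
function permutation_pvalue(stat, values, labels; n_perm = 999, rng = Random.default_rng())
    observed = stat(values, labels)
    n_extreme = 0
    for _ in 1:n_perm
        n_extreme += stat(values, shuffle(rng, labels)) >= observed
    end
    # The +1 counts the observed labelling as one of the permutations.
    return (n_extreme + 1) / (n_perm + 1)
end

# Stand-in statistic (an assumption for this example): absolute difference in group means.
mean_diff(x, g) = abs(mean(x[g .== "a"]) - mean(x[g .== "b"]))

x = [1.0, 1.2, 0.9, 3.0, 3.1, 2.9]
g = ["a", "a", "a", "b", "b", "b"]
p = permutation_pvalue(mean_diff, x, g; n_perm = 999, rng = MersenneTwister(42))
```

For F-statistics the same pattern would apply with population labels shuffled among individuals and the statistic recomputed on each shuffle, though the details of what gets permuted depend on which statistic is being tested.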
I hope to one day merge my fork of PopGen.jl into the BioJulia one, but it’s much too early for something like that. In the basic benchmarks that I’ve run with what there currently is, PopGen.jl blows adegenet out of the water (see the docs), which is kind of reassuring, although that only covers the basics. It would be great to have some extra eyes look at what’s been done so far and suggest changes/improvements, especially for anything fundamental like the PopData type itself or how genotypes are encoded. The
dev branches are dated, because fstat is the one that I’m actively working on. My process is to make a branch for a particular task, merge it into dev when it’s done, repeat, then make sure everything works in
dev before merging with