SequenceVariation: Brainstorming for new package

This seems good to me. A few thoughts, in no particular order:

  1. Does a Deletion have a length (e.g., if I’m missing 3 bases, is that one Deletion or 3)? This matters for things like cost models in an alignment, but it sounds like that’s not the point of this package, so keeping all edits at length 1 makes sense to me (it just means the biological interpretation isn’t stored in the data structure).
  2. Do you intend to have a low-cost way of switching which sequence is the reference? Imagine I start with seqX = "ATTGCT" as the reference. Then I add seqY = "ATTCTT", which is 2 substitutions from seqX. Then I add seqZ = "ATTATT", which is 2 substitutions from seqX but only 1 from seqY. It would be most parsimonious to switch to seqY as the reference, but I’m not good enough at computer science to know how costly an operation like this would be. It seems like, with 2 sequences, it should be essentially free to swap which is the reference, but as the number of sequences increases, it could get complicated. Still, I would think it should be doable without recalculating everything.
  3. Some of this sounds like it will depend on BioAlignments.jl as well - is that already a dependency of GeneticVariation.jl?
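
To make point 2 a bit more concrete, here is a rough sketch of what I have in mind, written in Python only because it is quick to read; it is not a proposal for the package's API. Everything here is my own assumption: the names `diff`, `apply_edits`, and `rebase` are hypothetical, edits are stored as a plain `{position: base}` mapping, and only substitutions between equal-length sequences are handled. The idea is that when the reference changes, each sequence only needs to be re-examined at positions where it or the new reference already differed from the old reference.

```python
def diff(ref, seq):
    """Substitutions that turn ref into seq, as {0-based position: base}."""
    if len(ref) != len(seq):
        raise ValueError("this sketch only handles equal-length sequences")
    return {i: b for i, (a, b) in enumerate(zip(ref, seq)) if a != b}


def apply_edits(ref, edits):
    """Materialize the sequence described by ref plus a set of substitutions."""
    bases = list(ref)
    for pos, base in edits.items():
        bases[pos] = base
    return "".join(bases)


def rebase(ref, ref_name, variants, new_ref_name):
    """Make variants[new_ref_name] the reference.

    Only positions edited in the new reference or in the sequence being
    re-expressed are inspected, so the work per sequence scales with the
    number of edits rather than with the sequence length.
    """
    pivot = variants[new_ref_name]
    new_ref = apply_edits(ref, pivot)  # one copy of the reference
    rebased = {}
    for name, edits in variants.items():
        if name == new_ref_name:
            continue
        out = {}
        for pos in edits.keys() | pivot.keys():
            base = edits.get(pos, ref[pos])  # base this sequence has at pos
            if base != new_ref[pos]:
                out[pos] = base
        rebased[name] = out
    # The old reference becomes an ordinary sequence: it differs from the
    # new reference exactly where the new reference differed from it.
    rebased[ref_name] = {pos: ref[pos] for pos in pivot}
    return new_ref, rebased


# The example from point 2: seqX is the reference, then switch to seqY.
seq_x = "ATTGCT"
variants = {"seqY": diff(seq_x, "ATTCTT"), "seqZ": diff(seq_x, "ATTATT")}
new_ref, rebased = rebase(seq_x, "seqX", variants, "seqY")
print(new_ref)   # ATTCTT
print(rebased)   # seqZ now carries 1 substitution, seqX carries 2
```

With this layout, seqZ goes from 2 stored substitutions to 1 after the switch, and nothing is re-aligned. Insertions and deletions would shift coordinates, so they would need more care than this sketch gives them.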