[BioSequences] data structure to keep mutations (delta) of a sequence

Hi

We are looking for a data structure to keep track of the “evolution” of a sequence. What suggestions do you have in this respect? LongSubSeq views look useful, but they always refer to the “main” sequence. What we need is a “list” that accumulates differences from a “reference” sequence. Maybe GenomeGraphs.jl?

Thanks

All the best

Marco

You might be interested in SequenceVariation.jl. There is an ongoing discussion about changing its API, which may also be worth keeping in mind.

Hi @Marco_Antoniotti, the current maintainer of SequenceVariation.jl here.

SequenceVariation.jl works exactly that way: the Haplotype type stores a reference sequence together with a vector of Edits that modify it. The caveat that might concern you is that the vector of Edits is always sorted by position, so it cannot record the order in which mutations “accumulated” to simulate evolution. You could work around that by translating the more mutated sequences to the less mutated sequences, but there isn’t a very clean way to do that yet (in fact, that very idea is what sparked our discussion on how to restructure the API).
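To make the "reference plus position-sorted edits" idea concrete, here is a minimal, language-neutral sketch in Python. It is not the SequenceVariation.jl API: the names `Edit`, `ReferenceWithEdits` and `reconstruct` are hypothetical, positions are assumed to be 0-based, and every edit is assumed to be a simple (position, replaced bases, replacement bases) triple.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True, order=True)
class Edit:
    """Hypothetical edit record; ordering compares pos first."""
    pos: int   # 0-based position in the reference (assumption)
    ref: str   # reference bases being replaced ("" for a pure insertion)
    alt: str   # replacement bases ("" for a pure deletion)


@dataclass
class ReferenceWithEdits:
    """A reference sequence plus edits, always kept sorted by position."""
    reference: str
    edits: List[Edit] = field(default_factory=list)

    def __post_init__(self):
        # Sorting discards the order in which the edits were supplied,
        # which is exactly why no mutation history can be recovered.
        self.edits = sorted(self.edits)
        cursor = 0
        for e in self.edits:
            if e.pos < cursor:
                raise ValueError(f"edit at {e.pos} overlaps an earlier edit")
            if self.reference[e.pos:e.pos + len(e.ref)] != e.ref:
                raise ValueError(f"edit at {e.pos} does not match the reference")
            cursor = e.pos + len(e.ref)

    def reconstruct(self) -> str:
        """Apply all edits in one left-to-right pass over the reference."""
        out, cursor = [], 0
        for e in self.edits:
            out.append(self.reference[cursor:e.pos])  # untouched stretch
            out.append(e.alt)                         # edited stretch
            cursor = e.pos + len(e.ref)
        out.append(self.reference[cursor:])
        return "".join(out)
```

With this layout, a substitution, a deletion and an insertion supplied in any order produce the same stored list, and two edits that touch the same reference position are rejected, so a site that mutates twice cannot be represented.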

If you have any questions about or suggestions for SequenceVariation for your use case, hit me up, I’d be glad to help.

Thanks. I would not mind joining in the discussion.

Let me catch up…

MA

Hi @MillironX

I read the discussion about the new interface you propose (Idea for reworking the type structure (adding abstract types) · BioJulia/SequenceVariation.jl · Discussion #43 · GitHub)

Let me say that I agree with you. First you get it right (and clean), and then you get it fast (e.g., using something like Match.jl if you must; no need to invoke rusting concepts, since after all Julia has decent macros).

I think the data structures may work; the issue is whether you can have an Edit of an Edit. If you could, we’d be home free.
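To illustrate what I mean by an "Edit of an Edit", here is a rough Python sketch, purely as an assumption about how such a structure could look and not a proposal for the actual SequenceVariation.jl types. Each generation stores only its own edits, expressed in the coordinates of its parent, and the parent is either the root reference string or an earlier generation. The names `Generation`, `apply_edits`, `sequence` and `history` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union

# (pos, ref, alt): 0-based position in the parent, bases replaced, replacement.
EditTuple = Tuple[int, str, str]


def apply_edits(seq: str, edits: Tuple[EditTuple, ...]) -> str:
    """Apply non-overlapping edits to seq in one left-to-right pass."""
    out, cursor = [], 0
    for pos, ref, alt in sorted(edits):
        if pos < cursor or seq[pos:pos + len(ref)] != ref:
            raise ValueError(f"edit at {pos} overlaps or does not match")
        out.append(seq[cursor:pos])
        out.append(alt)
        cursor = pos + len(ref)
    out.append(seq[cursor:])
    return "".join(out)


@dataclass(frozen=True)
class Generation:
    """One evolutionary step: edits relative to the parent, not the root."""
    parent: Union[str, "Generation"]   # root reference or earlier generation
    edits: Tuple[EditTuple, ...]

    def sequence(self) -> str:
        # Materialize the parent first, then apply this step's edits.
        base = self.parent if isinstance(self.parent, str) else self.parent.sequence()
        return apply_edits(base, self.edits)

    def history(self) -> List[Tuple[EditTuple, ...]]:
        # Edits of every step, oldest first, so the accumulation is preserved.
        earlier = [] if isinstance(self.parent, str) else self.parent.history()
        return earlier + [self.edits]
```

For example, a first generation can change position 1 from C to T and a second generation can change that same position from T to G. The chain keeps both events in order, at the cost of having to walk back to the root to materialize a sequence.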

I can put a student to work on this. Maybe we can also chat over Zoom/WebEx/Meet/Whatever.

All the best

Marco
