Change record sequence in a BAM file using XAM

Hi,

I would like to update the sequence of a list of records in a BAM file. Looking at the code of record.jl, data is stored as Vector{UInt64} and some dark magic is used to convert to a DNA sequence.

As it looks very low level, how can I change the sequence ?

Thanks,

I don’t think it’s possible at the moment, you would have to write a function that does the inverse of BAM.sequence basically :

Maybe a workaround would be to write back to a SAM file and then convert to BAM with samtools. Looks like there’s string constructor for SAM :

Following your suggestion, I managed to make it work with:

function updateRecord(seq::BioSequences.LongDNA{4}, record::BAM.Record)
    seqlen = BAM.seqlength(record)
    src::Ptr{UInt64} = pointer(record.data, BAM.seqname_length(record) + BAM.n_cigar_op(record, false) * 4 + 1)
    for i in 1:length(seq.data)
        # Flip high and low nibble
        x = (seq.data[i] & 0x0f0f0f0f0f0f0f0f) << 4 | (seq.data[i] & 0xf0f0f0f0f0f0f0f0) >> 4
        unsafe_store!(src, x, i)
    end
end

Performances are good ! However, this only works for sequence substitution and not deletion/insertion.

1 Like