I have a pair end bam. And I want to know whether the record is the first in pair or the second in pair. I use the julia package called XAM.jl.
using XAM
bamfile="..."
reader=open(BAM.Reader,bamfile)
mapped_chr1 = Iterators.filter(x->(BAM.refname(x)=="chr1"), reader)
for record in mapped_chr1
println(BAM.isprimary(record),':',BAM.flag(record),':',BAM.isnextmapped(record))
end
true:99:true
true:147:true
true:163:true
true:83:true
true:99:true
true:147:true
true:99:true
true:147:true
true:163:true
But it return all true. I checked flag 163 ,and it told me that it is second in pair.
How do I distinguish the first in pair and second in pair in pair end bam?
You’re on the right track, and Explain SAM Flags is a great resource.
You can query the flag properties using boolean algebra.
When a property is set, the flag value anded with the property value will equal the property value.
Using a flag of 163 as an example.
julia> flag = UInt8(163) # BAM.flag(record)
0xa3
julia> bitstring(flag)
"10100011"
The “read paired” property (0x1) can be checked by evaluating whether the bit in the first position (right most) is set to true.
julia> bitstring(0x1)
"00000001"
julia> flag & 0x1 |> bitstring
"00000001"
julia> flag & 0x1 == 0x1
true
The “read mapped in proper pair” property (0x2) can be checked by evaluating whether “10100011” & “01000000” == “01000000”.
julia> bitstring(0x2)
"00000010"
julia> flag & 0x2 |> bitstring
"00000010"
julia> flag & 0x2 == 0x2
true
The “read mapped in proper pair” property (0x40) can be check by evaluating whether “10100011” & “01000000” == “01000000”.
julia> bitstring(0x40)
"01000000"
julia> flag & 0x40 |> bitstring
"00000000"
julia> flag & 0x40 == 0x40
false
You may combine these properties with an ‘or’ operation to form one query.
julia> query = 0x1 | 0x2 | 0x40
0x43
julia> query |> string
"67"
julia> query |> bitstring
"01000011"
julia> flag & query == query
false
I thought the following video neatly linked the flag back to the SAM Format Specification.
To be honest handling flags should be easier, I have this piece of code that does a decent job. I think I opened an issue long time ago but can’t find it back. With this you can just do isflag(record, SAMFlags.READ1)
.
baremodule SAMFlags
export SAMFlag, flag
struct SAMFlag
flag::UInt16
description::String
end
flag(f::SAMFlag) = f.flag
const PAIRED = SAMFlag(UInt16(0x1),
"the read is paired in sequencing, no matter whether it is mapped in a pair")
const PROPER_PAIR = SAMFlag(UInt16(0x2),
"the read is mapped in a proper pair")
const UNMAP = SAMFlag(UInt16(0x4),
"the read itself is unmapped; conflictive with PROPER_PAIR")
const MUNMAP = SAMFlag(UInt16(0x8),
"the mate is unmapped")
const REVERSE = SAMFlag(UInt16(0x10),
"the read is mapped to the reverse strand")
const MREVERSE = SAMFlag(UInt16(0x20),
"the mate is mapped to the reverse strand")
const READ1 = SAMFlag(UInt16(0x40),
"this is read1 (first in pair)")
const READ2 = SAMFlag(UInt16(0x80),
"this is read2 (second in pair)")
const SECONDARY = SAMFlag(UInt16(0x100),
"not primary alignment")
const QCFAIL = SAMFlag(UInt16(0x200),
"QC failure")
const DUP = SAMFlag(UInt16(0x400),
"optical or PCR duplicate")
const SUPPLEMENTARY = SAMFlag(UInt16(0x800),
"supplementary alignment")
end
function list_flags(::Type{SAMFlags.SAMFlag})
n = names(SAMFlags,all=true)
v = map(x->getfield(SAMFlags,x),n)
out = SAMFlags.SAMFlag[]
for x in v
typeof(x) == SAMFlags.SAMFlag && push!(out,x)
end
out
end
isflag(r::BAM.Record,f::SAMFlags.SAMFlag) = (flag(r) & SAMFlags.flag(f)) == SAMFlags.flag(f)
function decode_flag(r::BAM.Record)
for f in list_flags(SAMFlags.SAMFlag)
if isflag(r,f)
println(f.description)
end
end
end
1 Like