BAM Quality Control
Overview
To evaluate the quality of a BAM file, the custom script bamqc.py calculates several metrics.
The metrics currently available are:
Mapping statistics
- Total reads
- Reads with both mates mapped
- Reads with one mate mapped
- Reads with neither mate mapped
Read length
Coverage
Definitions
Mapping Statistics
Reads (not alignments) are counted as unique read pairs: a read pair that maps to multiple locations is counted only once.
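The counting rule above can be sketched as follows. This is a minimal illustration, not the actual code in bamqc.py: the function name `count_read_pairs` and the input shape (an iterable of `(read name, SAM flag)` tuples, one per alignment record) are assumptions made for the example. Only the SAM flag bits themselves (0x4 unmapped, 0x40 first mate, 0x80 second mate) come from the SAM specification.

```python
from collections import defaultdict

FLAG_UNMAPPED = 0x4   # SAM flag bit: this read is unmapped
FLAG_READ1 = 0x40     # SAM flag bit: first mate of the pair


def count_read_pairs(records):
    """Count unique read pairs by how many of their mates are mapped.

    records: iterable of (qname, flag) tuples, one per alignment record
    (hypothetical input shape chosen for this sketch).
    """
    mapped_mates = defaultdict(set)
    for qname, flag in records:
        mates = mapped_mates[qname]  # registers the pair even if nothing maps
        if not flag & FLAG_UNMAPPED:
            # A set keeps each mate once, however many alignments it has.
            mates.add("read1" if flag & FLAG_READ1 else "read2")

    counts = {"total": len(mapped_mates), "both_mapped": 0,
              "one_mapped": 0, "neither_mapped": 0}
    keys = ("neither_mapped", "one_mapped", "both_mapped")
    for mates in mapped_mates.values():
        counts[keys[len(mates)]] += 1
    return counts


# Made-up records: pair p1 has an extra secondary alignment (flag 355).
example_records = [
    ("p1", 99), ("p1", 147), ("p1", 355),  # both mates mapped
    ("p2", 73), ("p2", 133),               # only the first mate mapped
    ("p3", 77), ("p3", 141),               # neither mate mapped
]
print(count_read_pairs(example_records))
```

Because pairs are keyed by read name, the secondary alignment of p1 does not increase any count: the example yields three pairs in total, one in each category.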
Coverage
Coverage (i.e., depth of coverage) is calculated as follows:
Coverage = [ (number of reads with both mates mapped) * (read length) * 2 + (number of reads with one mate mapped) * (read length) ] / (effective genome size)
Here, the effective genome size is the number of non-N bases in the genome for WGS, and an estimate of the mappable space (exon and UTR regions) for WES.
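As a worked illustration of the coverage formula, the sketch below computes it directly from the mapping statistics. The function name `depth_of_coverage` and the input numbers are hypothetical and chosen only to make the arithmetic easy to follow; they are not taken from bamqc.py or from a real dataset.

```python
def depth_of_coverage(both_mapped, one_mapped, read_length, effective_genome_size):
    """Depth of coverage from read-pair counts (sketch of the formula above).

    both_mapped: read pairs with both mates mapped (each contributes 2 reads)
    one_mapped:  read pairs with exactly one mate mapped (1 read each)
    """
    if effective_genome_size <= 0:
        raise ValueError("effective_genome_size must be positive")
    mapped_bases = both_mapped * read_length * 2 + one_mapped * read_length
    return mapped_bases / effective_genome_size


# Hypothetical inputs: 1,000,000 fully mapped pairs, 50,000 half-mapped pairs,
# 100 bp reads, and an effective genome size of 10,000,000 bases.
print(depth_of_coverage(1_000_000, 50_000, 100, 10_000_000))  # 20.5
```

With these numbers the mapped bases are 200,000,000 + 5,000,000 = 205,000,000, which divided by 10,000,000 gives a coverage of 20.5.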