Overview - Somatic Sentieon

The CGAP Pipelines module for somatic variant calling with Sentieon (https://github.com/dbmi-bgm/cgap-pipeline-somatic-sentieon) is our license-based option for calling Single Nucleotide Variants (SNVs), short Insertions and Deletions (INDELs), and Structural Variants (SVs) in Whole Genome Sequencing (WGS) Tumor-Normal paired data. The pipeline starts from matched, analysis-ready bam files for a Tumor and a corresponding Normal (non-Tumor) tissue from the same individual. It can receive the initial bam files from either of the CGAP Upstream modules. The output of the pipeline is a vcf file with the variant calls that are unique to the Tumor.

Note: If the user provides bam files as input, the reads must be aligned to hg38/GRCh38 for compatibility with the annotation steps.
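As a quick sanity check before submitting user-provided bam files, the reference build can be inferred from the header. The sketch below is not part of the pipeline; it is a heuristic that only compares the chr1 length in the @SQ header lines against the GRCh38 value (248,956,422 bp). The function names are hypothetical, and `read_bam_header` assumes `samtools` is available on the PATH.

```python
import subprocess

# Length of chromosome 1 in GRCh38/hg38 (GRCh37/hg19 is 249,250,621).
GRCH38_CHR1_LENGTH = 248956422


def header_looks_like_grch38(header_text):
    """Return True if the @SQ line for chr1 (or 1) has the GRCh38 length.

    Heuristic only: it inspects a single contig, not the full dictionary.
    """
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        # Header fields are tab-separated TAG:VALUE pairs.
        fields = dict(
            field.split(":", 1)
            for field in line.split("\t")[1:]
            if ":" in field
        )
        if fields.get("SN") in ("chr1", "1"):
            return fields.get("LN") == str(GRCH38_CHR1_LENGTH)
    return False  # no chr1 entry found


def read_bam_header(bam_path):
    """Read the header text with `samtools view -H` (requires samtools)."""
    result = subprocess.run(
        ["samtools", "view", "-H", bam_path],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout
```

A typical use would be `header_looks_like_grch38(read_bam_header("tumor.bam"))`, where `tumor.bam` is a placeholder file name.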

Docker Image

The Dockerfiles provided in this GitHub repository can be used to build public Docker images. If built through the portal-pipeline-utils pipeline_deploy command (https://github.com/dbmi-bgm/portal-pipeline-utils), private ECR images will be created for the target AWS account.

The image contains (but is not limited to) the following software packages:

  • Sentieon (202112.01)
  • samtools (1.9)

Pipeline Flow

Our implementation offers a one-step, end-to-end solution to run a Tumor-Normal analysis using the Sentieon TNscope algorithm, as described in the Sentieon documentation (see References). We use a Panel of Normals (PON) vcf file generated from 20 unrelated samples from The Utah Genome Project (UGRP), as described here (https://cgap-annotations.readthedocs.io/en/latest/unrelated_references.html).
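To illustrate what a single-step TNscope call with a PON looks like, the sketch below assembles a command line in the general form given in the Sentieon documentation. It is an assumption about the shape of the call, not the exact command run by this pipeline: the function name, file names, sample names, and thread count are all hypothetical, and the pipeline may pass additional options that are not shown.

```python
import shlex


def build_tnscope_command(reference, tumor_bam, normal_bam,
                          tumor_sample, normal_sample,
                          pon_vcf, output_vcf, threads=4):
    """Assemble a Sentieon TNscope Tumor-Normal command as an argument list.

    The command is only built here, not executed.
    """
    return [
        "sentieon", "driver",
        "-t", str(threads),          # number of threads
        "-r", reference,             # reference genome fasta (hg38/GRCh38)
        "-i", tumor_bam,             # analysis-ready Tumor bam
        "-i", normal_bam,            # analysis-ready Normal bam
        "--algo", "TNscope",
        "--tumor_sample", tumor_sample,
        "--normal_sample", normal_sample,
        "--pon", pon_vcf,            # Panel of Normals vcf
        output_vcf,                  # output vcf with somatic calls
    ]


# Placeholder inputs for illustration only.
command = build_tnscope_command(
    reference="GRCh38.fa",
    tumor_bam="tumor.bam",
    normal_bam="normal.bam",
    tumor_sample="TUMOR",
    normal_sample="NORMAL",
    pon_vcf="pon.vcf.gz",
    output_vcf="somatic.vcf.gz",
)
print(shlex.join(command))
```

Building the command as a list keeps each argument separate, so it can be passed directly to `subprocess.run` without shell quoting issues once a licensed Sentieon installation is available.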

References

Sentieon.