Can I use exome data for CNV detection?

CNV calling from exome FASTQ files is done automatically! As a prerequisite, you need to define a Panel of Normals (PON) for each enrichment kit you use.

A PON helps establish a baseline coverage pattern and accounts for recurrent technical artifacts that are specific to your workflow. The depth of coverage for each sequenced region is averaged across the PON samples; if a test sample shows a significant increase or decrease relative to this baseline, a CNV is called.
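The baseline comparison can be illustrated with a minimal sketch. It is not the actual calling algorithm: the function name `call_cnvs`, the z-score test, the threshold of 3.0, and the assumption that depths are already normalized for library size are all illustrative choices, since the text above only says that a "significant" deviation from the PON average triggers a call.

```python
from statistics import mean, stdev

def call_cnvs(pon_depths, test_depths, z_threshold=3.0):
    """Flag regions whose depth in the test sample deviates from the PON baseline.

    pon_depths:  dict mapping region name -> list of depths, one per PON sample
    test_depths: dict mapping region name -> depth in the test sample
    Depths are assumed to be normalized already (hypothetical preprocessing).
    """
    calls = {}
    for region, depths in pon_depths.items():
        baseline = mean(depths)   # average depth across PON samples
        spread = stdev(depths)    # sample-to-sample variability in the PON
        if spread == 0:
            continue              # no variability estimate; skip the region
        z = (test_depths[region] - baseline) / spread
        if z >= z_threshold:
            calls[region] = "gain"
        elif z <= -z_threshold:
            calls[region] = "loss"
    return calls

# Toy data: three regions, five PON samples each (made-up numbers).
pon = {
    "exon1": [100, 102, 98, 101, 99],
    "exon2": [50, 52, 48, 51, 49],
    "exon3": [80, 79, 81, 80, 82],
}
test = {"exon1": 100, "exon2": 25, "exon3": 120}
print(call_cnvs(pon, test))
```

In this toy run, exon2 has roughly half the baseline depth and exon3 about 1.5 times the baseline depth, so both are flagged, while exon1 matches the baseline and is not.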

Recommendations for creating a PON to call CNVs from exome data:

  1. Samples for a PON should be derived from healthy individuals.

  2. In our experience, a PON of at least 40-50 samples yields the best results. A smaller PON is better than nothing, but keep in mind that you may encounter more false positives.

  3. Aim to prepare all PON samples in a uniform manner to avoid batch effects. Please log any differences in library preparation.
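If you keep a sample sheet for your PON, the recommendations above can be checked with a small script before you build the panel. The sketch below is only an illustration: the function `review_pon` and the record fields `id`, `kit`, `healthy`, and `prep_notes` are hypothetical and do not correspond to any existing file format or API. The default minimum of 40 is the lower end of the 40-50 range mentioned above.

```python
def review_pon(samples, min_samples=40):
    """Return warnings for a PON sample sheet (list of dicts).

    Each record is assumed to have the hypothetical keys
    'id', 'kit', 'healthy' (bool) and 'prep_notes' (str, may be empty).
    """
    warnings = []

    # One PON per enrichment kit: mixing kits distorts the baseline.
    kits = {s["kit"] for s in samples}
    if len(kits) > 1:
        warnings.append(f"mixed enrichment kits: {sorted(kits)}")

    # A smaller PON still works but may produce more false positives.
    if len(samples) < min_samples:
        warnings.append(f"only {len(samples)} samples; expect more false positives")

    for s in samples:
        if not s["healthy"]:
            warnings.append(f"{s['id']}: not from a healthy individual")
        if s.get("prep_notes"):
            warnings.append(f"{s['id']}: library prep differs ({s['prep_notes']})")
    return warnings

# Made-up two-sample sheet for demonstration.
sheet = [
    {"id": "N1", "kit": "kitA", "healthy": True, "prep_notes": ""},
    {"id": "N2", "kit": "kitA", "healthy": True, "prep_notes": "different fragmentation time"},
]
for line in review_pon(sheet):
    print(line)
```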
