Contamination

The Contamination column reports whether a sample shows signs of DNA contamination, helping ensure data reliability before interpretation.

A contamination signal in sequencing data can have several causes: true contamination, a sample mix-up, library preparation issues, or technical artifacts.

Always confirm the issue with other quality checks.

Contamination is detected using Peddy calculations, which estimate the proportion of reads that do not match the expected genotype. This estimate is based on the idr_baf score.

idr_baf stands for the interdecile range of the B-allele frequency. It is calculated as the difference between the 90th and 10th percentiles of the distribution of alt / (ref + alt) ratios across all variant sites.

A larger idr_baf value indicates greater variability in allele balance, which may suggest sample contamination, particularly from another human DNA sample.
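The calculation described above can be sketched as follows. This is a minimal illustration, not Peddy's actual implementation: the function name `idr_baf`, the per-site read-count inputs, the skipping of zero-depth sites, and the percentile interpolation method are all assumptions made for the example.

```python
from statistics import quantiles

def idr_baf(ref_counts, alt_counts):
    """Interdecile range of the B-allele frequency (illustrative sketch).

    ref_counts / alt_counts: per-site read counts for the reference and
    alternate allele (hypothetical inputs; Peddy's own site selection
    may differ).
    """
    # B-allele frequency per site: alt / (ref + alt); skip sites with no reads.
    baf = [
        alt / (ref + alt)
        for ref, alt in zip(ref_counts, alt_counts)
        if (ref + alt) > 0
    ]
    if len(baf) < 2:
        return None  # not enough sites to estimate percentiles

    # With n=10, quantiles() returns the 9 decile cut points;
    # the first is the 10th percentile, the last is the 90th.
    deciles = quantiles(baf, n=10, method="inclusive")
    return deciles[-1] - deciles[0]
```

In a clean diploid sample most sites have an allele balance close to 0, 0.5, or 1. Reads from a second individual pull sites away from those values, which changes the spread that this function measures.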

Contamination check results:

  • N/A: No data is available (older cases or when idr_baf = 0.000).

  • No: No contamination detected (idr_baf < 0.200).

  • Unlikely: Possible contamination, but evidence is weak (0.200 ≤ idr_baf < 0.241).

  • Likely: Contamination suspected (0.241 ≤ idr_baf < 0.300).

  • Yes: Contamination confirmed (idr_baf ≥ 0.300).
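The thresholds above map onto a simple lookup. The sketch below only restates the listed ranges; the function name `contamination_status` is hypothetical, and representing missing data as `None` is an assumption.

```python
def contamination_status(idr_baf):
    """Map an idr_baf value to the Contamination column label (sketch)."""
    # Missing value or exactly 0.000 is reported as no data.
    if idr_baf is None or idr_baf == 0.0:
        return "N/A"
    if idr_baf < 0.200:
        return "No"
    if idr_baf < 0.241:
        return "Unlikely"
    if idr_baf < 0.300:
        return "Likely"
    return "Yes"
```

For example, `contamination_status(0.25)` returns `"Likely"`. Each lower bound is inclusive, so a value of exactly 0.241 falls under Likely rather than Unlikely.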
