Contamination
The Contamination column reports whether a sample shows signs of DNA contamination, helping ensure data reliability before interpretation.
Contamination is detected using Peddy calculations, which estimate the proportion of reads that do not match the expected genotype. This estimate is based on the idr_baf score.
idr_baf stands for the interdecile range of the B-allele frequency. It is calculated as the difference between the 90th and 10th percentiles of the distribution of alt / (ref + alt) read-count ratios across all variant sites.
A larger idr_baf value indicates greater variability in allele balance, which may suggest sample contamination, particularly from another human DNA sample.
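The definition above can be illustrated with a minimal sketch. This is not Peddy's own code: the function names `percentile` and `idr_baf`, the input shape (a list of per-site `(ref, alt)` read counts), and the linear-interpolation percentile convention are all assumptions made for illustration.

```python
def percentile(sorted_values, q):
    """Linearly interpolated percentile (q in [0, 100]) of a pre-sorted list.

    The interpolation convention is an assumption; other tools may differ slightly.
    """
    if not sorted_values:
        raise ValueError("at least one value is required")
    pos = (len(sorted_values) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] + (sorted_values[upper] - sorted_values[lower]) * frac


def idr_baf(allele_counts):
    """Interdecile range of B-allele frequency.

    allele_counts: iterable of (ref_reads, alt_reads), one pair per site
    (hypothetical input shape). Sites with no reads are skipped.
    """
    bafs = sorted(alt / (ref + alt) for ref, alt in allele_counts if ref + alt > 0)
    return percentile(bafs, 90) - percentile(bafs, 10)


# Sites whose allele balance is spread evenly from 0.0 to 1.0 give a wide range...
spread = [(10 - k, k) for k in range(11)]
# ...while sites that all share the same allele balance give a range of zero.
uniform = [(5, 5)] * 20

print(round(idr_baf(spread), 3), idr_baf(uniform))
```

The wider the spread of alt / (ref + alt) values between the 10th and 90th percentiles, the larger the result.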
Contamination check results:
N/A: No data is available (older cases or when idr_baf = 0.000).
No: No contamination detected (idr_baf < 0.200).
Unlikely: Possible contamination, but evidence is weak (0.200 ≤ idr_baf < 0.241).
Likely: Contamination suspected (0.241 ≤ idr_baf < 0.300).
Yes: Contamination confirmed (idr_baf ≥ 0.300).
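As a rough sketch, the thresholds above map an idr_baf value to a status as follows. The function name `contamination_status` is hypothetical, and treating a missing value (`None`) as an older case without data is an assumption.

```python
from typing import Optional


def contamination_status(idr_baf: Optional[float]) -> str:
    """Map an idr_baf score to the Contamination column label."""
    # Assumption: older cases without a score are represented as None.
    if idr_baf is None or idr_baf == 0.0:
        return "N/A"
    if idr_baf < 0.200:
        return "No"
    if idr_baf < 0.241:
        return "Unlikely"
    if idr_baf < 0.300:
        return "Likely"
    return "Yes"


for score in (None, 0.0, 0.150, 0.200, 0.241, 0.300):
    print(score, contamination_status(score))
```

The lower bound of each range is inclusive, so a score of exactly 0.241 is reported as "Likely" and a score of exactly 0.300 as "Yes".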
Hover over the value to display a tooltip showing the HET ratio (proportion of sites that are heterozygous) and the HET count (number of heterozygote calls in sampled sites).
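The two tooltip values are related in a simple way, sketched below. The function name `het_summary` and the genotype encoding (0 = homozygous reference, 1 = heterozygous, 2 = homozygous alternate) are assumptions for illustration, as is using every sampled site as the denominator.

```python
def het_summary(genotypes):
    """Return (het_ratio, het_count) for a list of per-site genotype codes.

    Assumed encoding: 0 = hom-ref, 1 = het, 2 = hom-alt.
    """
    het_count = sum(1 for g in genotypes if g == 1)
    # Guard against an empty site list to avoid division by zero.
    het_ratio = het_count / len(genotypes) if genotypes else 0.0
    return het_ratio, het_count


# Eight sampled sites, three of them heterozygous.
ratio, count = het_summary([0, 1, 1, 2, 0, 1, 0, 2])
print(ratio, count)
```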
Tips:
Always review contamination results before starting interpretation to rule out technical issues that could explain unexpected variant calls.
Cross-check contamination results with other QC metrics (e.g., depth, ploidy, sex validation) for a more complete picture of sample quality.
For family cases, check that no contamination is flagged before relying on inheritance-based filters.
Warnings:
Panels may be less reliable: For targeted panels, contamination estimates may be inaccurate due to the limited number of variants available for calculation. Use caution and cross-check with other QC metrics when interpreting these results.
Do not use in isolation: A "Likely" or "Yes" result should not be treated as conclusive on its own. Review case setup, sequencing quality, and sample handling first.