Kit Management
Panels Of Normals (PON) (37.0+)
For detecting copy number variation (CNV) in targeted panel and exome cases, DRAGEN utilizes a Panel of Normals (PON) approach. This method leverages a set of matched normal samples to establish a reference baseline for CNV event detection.
A PON file is typically a text file listing absolute paths to 'target counts' files of individual matched normal samples. However, a PON can also be in the form of a combined counts file, which is a column-wise concatenation of individual target counts files (either GC-corrected or not).
The Illumina BaseSpace Baseline Builder App can generate this combined counts file (see the section "Creating a PON for Emedgene Using BaseSpace Baseline Builder").
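The list form of a PON described above can be illustrated with a short sketch. It collects per-sample target counts files from a directory and writes their absolute paths, one per line. The function name `write_pon_list`, the output file name, and the `*.target.counts.gz` file pattern are assumptions made for illustration; adjust them to match your own DRAGEN output. Note that Emedgene itself expects the combined counts form (see Prerequisites).

```python
from pathlib import Path


def write_pon_list(counts_dir, pon_path, pattern="*.target.counts.gz"):
    """Write a list-style PON file: one absolute path per line.

    `pattern` is an assumed naming convention for per-sample
    target counts files, not a documented requirement.
    """
    paths = sorted(p.resolve() for p in Path(counts_dir).glob(pattern))
    Path(pon_path).write_text("".join(f"{p}\n" for p in paths))
    return paths
```

Calling `write_pon_list("normals/", "normals.pon.txt")` would write one line per matching file found in `normals/`.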
Attaching a PON to a Kit
Starting from version 37, users can attach a PON to an existing Kit BED.
A new PON Management section is available on the Organization Settings page.
This section includes a table displaying:
Kit Name
Kit ID
GC Corrected Status
Human Reference
DRAGEN Version
Maximum Interval Size
Additionally, there is an ‘Add PON’ button to initiate the PON addition process.
Prerequisites
Before adding a PON, ensure the following requirements are met:
The PON file must be a combined counts file.
The file must be stored in a supported cloud storage (AWS S3, ICA, or BaseSpace Storage). Direct uploads are not allowed.
Users must have the “Manage PON” role.
The Enrichment Kit and its human reference BED must already exist. The BED file must match the one used to generate the PON.
Only one PON per unique combination of Enrichment Kit, Human Reference BED, and DRAGEN Main Version is allowed.
PONs cannot be added for DRAGEN sub-versions.
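The uniqueness rule in the prerequisites above can be pictured as a lookup keyed on the three values. The following is a toy sketch, not Emedgene code: the class `PonRegistry`, its method names, and the kit identifiers are hypothetical, and it assumes that a "main version" is written as major.minor (for example "4.3") while anything with a third component (for example "4.3.6") is a sub-version.

```python
class PonRegistry:
    """Toy in-memory stand-in for the PON table (hypothetical)."""

    def __init__(self):
        self._pons = {}

    def add(self, kit_id, reference, dragen_version, counts_file):
        # Assumption: main versions look like "4.3"; "4.3.6" is a sub-version.
        parts = dragen_version.split(".")
        if len(parts) != 2 or not all(p.isdigit() for p in parts):
            raise ValueError(f"{dragen_version!r} is not a DRAGEN main version")
        key = (kit_id, reference, dragen_version)
        # Only one PON per (kit, human reference, DRAGEN main version).
        if key in self._pons:
            raise ValueError(f"a PON already exists for {key}")
        self._pons[key] = counts_file

    def get(self, kit_id, reference, dragen_version):
        return self._pons.get((kit_id, reference, dragen_version))
```

Under these assumptions, the same kit can hold separate PONs for GRCh37 and GRCh38, or for DRAGEN 4.2 and 4.3, but a second PON for an identical combination is rejected.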
Compatibility
To use the PON for CNV calling, the sample pipeline version must be set to 37.0 or higher. Otherwise, CNV calling will not use the newly added PON.
Existing PONs
Previously existing PONs will not appear in the PON table. However, a notification will indicate the presence of existing PONs within the workgroup.
PON Migration
Previously existing PONs will continue to function, and CNV calling will remain unaffected. If migration to the new PON table is required, please contact techsupport@illumina.com or your bioinformatics support team.
Adding a new PON
Click ‘Add PON’ to open a pop-up window.
Select values for the required fields:
Enrichment Kit: Lists all unique kits within the organization (excluding common kits). To add a PON for a common Kit BED, create a separate Kit with the same Kit BED (refer to Kit Management for details).
DRAGEN Version: Only versions 3.6 and later are supported.
Human Reference: Choose GRCh37 or GRCh38.
Based on the selected DRAGEN version, the system will display the expected maximum interval size:
DRAGEN 4.2 and below: 250bp
DRAGEN 4.3 and above: 500bp (default for DRAGEN 4.3)
Click ‘Next’ to go to the file selection window.
In the next window, select a combined counts file from a supported cloud storage service (AWS S3, ICA, or BaseSpace Storage). Ensure the file is available in storage before proceeding.
Only one file with the extension “.combined.counts.txt.gz” can be selected.
Click ‘Next’ to validate the file.
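The input rules in the steps above (supported DRAGEN versions, expected maximum interval size, and the required file extension) can be summarized in a few lines. This is a minimal sketch for pre-checking your own inputs; the function names are hypothetical and the code is not part of Emedgene.

```python
REQUIRED_SUFFIX = ".combined.counts.txt.gz"


def _major_minor(version):
    """Return (major, minor) from a version string such as '4.3' or '4.3.6'."""
    parts = [int(p) for p in version.split(".")]
    return (parts[0], parts[1] if len(parts) > 1 else 0)


def expected_max_interval(dragen_version):
    """Expected maximum interval size in bp for a DRAGEN version."""
    mm = _major_minor(dragen_version)
    if mm < (3, 6):
        raise ValueError("only DRAGEN 3.6 and later are supported")
    # 4.2 and below: 250 bp; 4.3 and above: 500 bp.
    return 250 if mm <= (4, 2) else 500


def has_required_suffix(filename):
    """True if the file name ends with the combined counts extension."""
    return filename.endswith(REQUIRED_SUFFIX)
```

For example, `expected_max_interval("4.2")` returns 250 and `expected_max_interval("4.3")` returns 500.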
Automated Validations
Before adding a PON, the system performs the following checks:
1. Maximum Interval Size Validation
The system inspects the first 1000 rows of the combined counts file to ensure that no target exceeds the threshold interval size.
Using a combined counts file with the expected maximum interval size is strongly recommended.
2. GC Correction Validation
The system determines GC correction status by checking the cnv-enable-gcbias-correction field in the file headers:
0: Non-GC corrected
1: GC corrected
If this field is missing, the system analyzes target value types:
Integer values: Non-GC corrected
Float values: GC corrected
Users must ensure the combined counts file has the intended GC correction status.
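To check a file before uploading it, the two validations above can be approximated locally. The sketch below rests on several assumptions that are not confirmed by this article: header lines start with `#`, the GC field appears in a header line as `cnv-enable-gcbias-correction` followed by `0` or `1`, data rows are tab-separated with the columns contig, start, stop, name and then one column per sample, and interval size is computed as stop minus start. The function names are hypothetical and this is not the platform's implementation.

```python
import gzip
import re

GC_FIELD = re.compile(r"cnv-enable-gcbias-correction\s*[=:]?\s*([01])\b")


def _is_integer(value):
    try:
        int(value)
        return True
    except ValueError:
        return False


def inspect_combined_counts(lines, max_interval, max_rows=1000):
    """Return (gc_status, oversized_targets) for an iterable of text lines."""
    gc_status = None
    saw_float = False
    oversized = []
    rows_seen = 0
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("#"):
            match = GC_FIELD.search(line)
            if match:
                gc_status = "GC corrected" if match.group(1) == "1" else "Non-GC corrected"
            continue
        fields = line.split("\t")
        if len(fields) < 5 or not fields[1].isdigit():
            continue  # blank line or column-name row
        if rows_seen >= max_rows:
            break
        rows_seen += 1
        start, stop = int(fields[1]), int(fields[2])
        if stop - start > max_interval:
            oversized.append((fields[0], start, stop))
        if not all(_is_integer(v) for v in fields[4:]):
            saw_float = True
    if gc_status is None:
        # Fallback: integer counts imply non-GC corrected, floats imply GC corrected.
        gc_status = "GC corrected" if saw_float else "Non-GC corrected"
    return gc_status, oversized


def inspect_file(path, max_interval):
    """Convenience wrapper for a gzip-compressed combined counts file."""
    with gzip.open(path, "rt") as handle:
        return inspect_combined_counts(handle, max_interval)
```

An empty list of oversized targets together with the intended GC status suggests the file is likely to pass, although only the platform's own validation is authoritative.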
Viewing Added PONs
Once a PON is successfully added, it appears in the PON table with details based on the selected inputs. The values for Maximum Interval Size and GC Corrected Status are inferred from the validation results.
Deleting a PON from the table
A user with the “Manage PON” role can delete a PON for any combination listed in the table. This is a soft deletion: the link between the combined counts file and the Kit BED is removed, and the combined counts file itself is not affected.
Creating a PON for Emedgene Using BaseSpace Baseline Builder
Introduction
As of Emedgene V37, Emedgene users can supply their own Panel of Normals (PON) to enable CNV calling on gene panels and WES samples. This guide details the process using BaseSpace.
The DRAGEN Baseline Builder application in BaseSpace can be used to build PONs that are compatible with Emedgene.
Users without BaseSpace can receive a free trial BaseSpace account along with compute (250 iCredits) and storage (1 TB) by registering here (https://basespace.illumina.com/). The compute is allocated for 30 days upon trial commencement and will be sufficient to generate a PON.
Alternatively, Illumina Connected Analytics (ICA), DRAGEN servers, and DRAGEN in the cloud can also generate Emedgene-compatible PONs.
Requirements
Approximately 50 normal samples with a 50/50 male:female split, originating from the same library prep protocol and ideally from the same sequencer.
The sample FASTQ files need to be in one or more Projects in your BaseSpace account.
iCredits
Compatible BED file
Uploading a BED file:
The BED file used must match the one uploaded to Emedgene. Any discrepancy may cause a failure during case processing. See Emedgene Help Center Articles for more information on creating an Enrichment Kit in Emedgene.
A BED file can be uploaded to a BaseSpace project from within the project.
Note that hg19/GRCh37 BED files are not directly compatible with hs37d5 (the reference used within Emedgene). Their contigs must be renamed from "chr1", "chr2", "chr3", etc. to "1", "2", "3", etc.
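The contig renaming described above can be done with a small script. This is a minimal sketch that assumes a tab-separated BED file and only strips a leading "chr" from the first column; the function names are hypothetical. It does not handle other naming differences (for example, a mitochondrial contig would become "M"), so review the output before uploading it.

```python
def strip_chr_prefix(bed_line):
    """Return the BED line with a leading 'chr' removed from the contig name."""
    # Leave blank lines, comments and track/browser lines untouched.
    if not bed_line.strip() or bed_line.startswith(("#", "track", "browser")):
        return bed_line
    contig, sep, rest = bed_line.partition("\t")
    if contig.startswith("chr"):
        contig = contig[3:]
    return contig + sep + rest


def convert_bed(src_path, dst_path):
    """Write a copy of src_path with 'chr' prefixes stripped from contigs."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            dst.write(strip_chr_prefix(line))
```

Remember that the same renamed BED file must also be the one used for the Enrichment Kit in Emedgene.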
Running the app
1) Select the "DRAGEN Baseline Builder" application, using the latest version that matches your Emedgene secondary analysis pipeline (4.3 in this example), and click Launch.
2) Select output project and Baseline Mode "CNV":
3) Select the input FASTQ Biosamples to use.
Fifty samples is a rough guideline; the degree of correlation between the normals and the case sample matters more than the number of samples.
4) Select the correct reference genome, matching the one used in Emedgene.
Choose the multigenome version of the genome build.
For GRCh38/hg38:
For GRCh37/hg19, choose the hs37d5 build (remember to rename the contigs in the BED file if using hs37d5):
5) Select the BED file you uploaded to your BaseSpace project (see "Uploading a BED file" above).
6) Configure Emedgene specific settings:
Emedgene CNV calling has DRAGEN settings that must also be used during PON creation. In the app, this can be done in the advanced settings section at the bottom of the page:
Tick "Ignore Duplicate Reads in CNV Baseline Files" and change "Generated Combined Counts file for CNV" to "GC Corrected".
7) Tick the BaseSpace Labs App Acknowledgement box.
8) Launch the app.
Downloading Results
Once the analysis is complete, open the output files and look for the *.combined.counts.txt.gz file. It may be within a folder called "pon".
It can be downloaded manually from the analysis results by clicking on the file and selecting "Download", or via the BaseSpace CLI (https://developer.basespace.illumina.com/docs/content/documentation/cli/cli-overview). If the output project is connected to Emedgene, it can be loaded directly into the platform.
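If the analysis output has been downloaded to a local folder, the combined counts file can be located with a short script. This is a minimal sketch: the function name `find_combined_counts` and the example folder name are hypothetical, and it simply searches all subfolders (including "pon", if present) for the expected extension.

```python
from pathlib import Path


def find_combined_counts(results_dir):
    """Return all *.combined.counts.txt.gz files under results_dir, sorted."""
    return sorted(Path(results_dir).rglob("*.combined.counts.txt.gz"))
```

For example, `find_combined_counts("baseline_builder_output")` returns a list of matching paths. Only one such file can be selected when adding a PON, so more than one match means you need to choose the correct file.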