
Emedgene

Get Started with Emedgene

Welcome to Emedgene, where we unlock genomic insights for hereditary disease and streamline your tertiary analysis workflows.

So you've signed in and can't wait to get started? Here we will guide you through the platform architecture, case creation, and results review. You can dive a bit deeper by following the links and exploring manuals for the platform's applications:

Analyze: Genomic analysis workbench, where you can accession, interpret, curate and report on your cases, while also efficiently managing the lab workflow
Curate: A repository for all of your organizational curated knowledge

How can Emedgene help you solve a case?

The AI-powered Emedgene platform utilizes machine learning throughout the analysis and interpretation workflow to deliver the fastest time from genomic data to decisions. We apply machine learning models that retrieve evidence-backed answers and provide exceptional decision support.

Using automated interpretation algorithms, Emedgene generates an accurate shortlist of up to 10 potential causative variants. In a joint study of 180 solved cases with Baylor Genetics, 96% of cases were successfully solved by the algorithm. See Meng et al., Genetics in Medicine, 2023 for more details.
The platform is not a black box: it overlays a layer of explainable AI (XAI), presenting supporting evidence from the literature and databases, which significantly reduces the time to interpret a case.
The algorithms use a proprietary Emedgene knowledge graph, which incorporates information extracted from the literature with natural language processing as well as from public databases, and is updated monthly.
Dozens of additional algorithms are incorporated throughout the workflow.

Overall, the system combines AI in a highly optimized and customizable workbench, in order to automate the most time-intensive aspects of genomic analysis and research.

Emedgene Analyze manual

Getting around the platform

Top navigation panel

The top navigation panel serves as a guide to the platform. It includes:

Case search bar
Dashboard tab
Cases tab

Emedgene applications menu

The Emedgene platform is divided into two applications:

Analyze—genomic analysis workbench
Curate—the knowledge management system

To switch from Analyze to Curate:

Go to the nine-dot app launcher icon located on the top navigation panel and select Curate from the dropdown menu.

To switch from Curate to Analyze:

Go to the nine-dot app launcher icon located on the Curate navigation panel and select Analyze from the dropdown menu.

Dashboard tab

The Dashboard tab depicts an overview of the user activity on the Emedgene platform and provides a glance at key performance indicators for an organization.

Lefthand panel

Diagnostic Yield card presents the proportion of "solved" cases out of the total number of the organization's cases of the same type.
Status Diagram card displays the total number of the organization's submitted cases as well as the numbers of cases under each status.
Stale Cases card highlights the cases that are stuck at one of the intermediate stages of the analysis, and are not finalized.

Righthand panel

The Network Activities panel displays a timeline of activities performed by users within the organization. This log includes activities such as creating a case, verifying a filter preset, changing a case status, generating a report, and more.

Cases tab

The Cases tab provides an overview of genomic sequencing cases submitted by the organization, as well as individual case details.

The Cases tab includes:

Cases table: displays a list of cases along with key details
Cases table navigation panel: enables customization of the table view, including grouping and filtering of cases
Case details panel: opens when a case is selected, providing additional information

Cases table navigation panel

The Cases table navigation panel provides several tools to help you customize your table view and manage cases. It includes the following components:

Filter menu: Use this to narrow down the list of cases.
Group menu: Organize your cases by case status.

Case details

The Case details panel provides comprehensive information about a particular case.

The Case details panel is organized into three tabs:

Case info—displays technical, operational, and clinical information about the case
Family tree—shows a graphical pedigree and sample details for each family member
Activity—provides a timeline of all actions taken within the case for audit and collaboration

How to access the Case details panel

From the Cases table

Click on the row of the case you want to view. A pop-up side Case details panel will appear on the right. To close the panel, click the X icon in the top right corner.

From an individual case page

To expand the Case details panel, click the left-pointing arrow icon on the right edge of the screen. To collapse it, click the right-pointing arrow icon at the top left of the panel.

Case info

The Case info tab includes the following information:

Case ID—a unique identifier assigned to each case by Emedgene, formatted as EMGXXXXXXXXX
Case type—the type of analysis performed:

Family tree

The Family tree tab includes the following information:

Pedigree diagram. Pedigree legend can be found here.
Sample details for each family member:
- Phenotypes. For family members other than the test subject, phenotypes are categorized as:
  - Related—directly match one of the proband’s phenotypes
  - Unrelated—do not match any of the proband’s phenotypes
- Medical condition. Indicates whether the individual is considered Healthy or Affected in the case
- Sex. Specified by the user
- Age. Automatically calculated in years based on the provided date of birth
- Maternal and Paternal ethnicity. Ethnic background of the proband’s parents
- BAM file location. Shown where relevant

How to open a case

To open a case:

A. Hover over the corresponding row in the Cases table and click on the Open case link next to the Case ID in the first column

B. Alternatively, double-click the row

How to customize Cases table view

How to select columns to be displayed

A. Show or hide columns via the Fields menu

Click Fields in the Cases table navigation panel

In the Fields menu, use the toggle switch next to each field name to show or hide columns based on your preferred view

B. Hide a column directly from the Cases table

In the Cases table, click the column title you want to hide

From the dropdown menu, select Hide column

How to change column order

You can reorder columns in three ways.

A. Drag and drop the column

Hover over the column title

Click the six-dot icon that appears to the left of the title

Drag and drop the column

B. Reorder columns via the Fields menu

Click Fields in the Cases table navigation panel

In the Fields menu, hover over the field name

Click the six-dot icon that appears to the left of the field name

C. Move a column using a dropdown menu

Click the column header

From the dropdown menu, select Move left or Move right

How to adjust column width

Hover over the left or right border of the column header cell

When the resize cursor appears, click and drag the border to your desired width

How to filter cases

Available filters

You can filter cases using the following:

Case ID

How to search for cases

You can use the Case search bar in the top navigation panel to search for cases by Case ID or Proband ID.

How to group cases

To organize cases by status, navigate to the Cases table, click on Group on the navigation panel, and select Status. To remove the grouping, select None.

How to sort cases

You can sort cases by Creation date, Due date, or Quality.

To sort cases:

A. Hover over the column header and click the up or down arrow to sort in ascending or descending order

B. Alternatively, click the column name and select Sort ascending or Sort descending from the dropdown menu

The current sort direction is indicated by a single arrow icon next to the column name.

How to delete cases

In order to prevent accidental data loss, deleting cases in Emedgene includes a staging step before permanent case deletion.

Move a case to trash

to Move to trash (≤v37.0) or Trash bin

Help

Click the Help icon in the top navigation panel to open the Help dropdown menu.

From there, you can access:

Help Center: Find feature guides, step-by-step instructions, and tips to help you get the most out of the platform.
Walkthroughs: View short interactive demos of workflows (in development).
Feature requests: Share your ideas and feedback.
What's new: Stay updated with the latest release notes.
About: View general information such as your organization name and platform version.

Okta identity management

The Emedgene platform utilizes the Okta Identity Management solution to control user access. This improves user management, enhances access and authentication security, and allows organizations to implement single sign-on for their users.

Managing data storage

Manage ICA storage

Prerequisites for managing ICA storage

To manage ICA storage, the user must have:

The Storage Provider

Storage providers

Launching analysis

Create a family tree

Add new case page > Family tree screen > Create family tree panel

Build a pedigree via the visual tool.

Ideally, the proband selected for case analysis is affected and has one or more disease phenotypes.

You can add a Father, a Mother, a Sibling, or a Child to any family member, starting with the Proband. To do this, choose their icon, then click on the Add family member button in the bottom right corner of the pedigree builder to select a family member.

More information about the pedigree symbols can be found here.

To delete a family member, choose their icon, then click on the Delete Subject button in the top right corner of the Add patient information panel.

Note: There is no technical limit on the size or number of generations for a family tree.

Add a sample

Add new case page > Family tree screen > Add patient information panel > Add sample section

You can choose one of the following options:

Existing sample: Pick one of the samples already loaded on the platform
Upload new sample: Upload files from your PC and enter sample name

Sequencing information

Select a coverage BED

A coverage BED file is used to calculate and determine quality control (QC) metrics for your case. This file defines the genomic regions that should meet coverage requirements during sequencing.

BED files defining custom kits can be added in Organization settings. The BED file chosen here is also linked to a PON (Panel of Normals) file when starting from FASTQ files and performing CNV calling.

After selecting a coverage BED file, the available reference sequences for this kit will be displayed.

Specify sample preparation details

Specify details such as laboratory name, sequencing machine used, sequencing reagent kit, and expected coverage.

Gene list

Add new case page > Case info screen > Select genes list

Select gene list

You can limit analysis to a gene list in the platform while creating a case. Choose between:

Preset group

You can define different combinations of Presets for different case types (e.g., Presets for exome may differ from Presets for genome), as specified by your SOPs, to further streamline case review.

The combination of Presets is referred to as a Preset group.

Select a Preset group to display in the case

Preset group selection is available in the Case info screen of the flow while or

Labeling a case

You have the flexibility to manage Case labels at any time: create, add, or remove them directly in the Cases table.

Adding labels to a case lets you quickly mark cases for specific purposes and easily filter a subset of cases in the Cases table.

Creating multiple cases

Batch case upload from platform

If you're comfortable with scripting and API usage, you can upload multiple cases at once using those methods. But if you're not a technical expert, don't worry. There is a user-friendly alternative available—importing a CSV file directly through the user interface.

Please follow the steps as described below.

Caution: Please note that refreshing or leaving the page, exiting the Add new case tab, or power failure of your computer before you've completed a batch case upload will result in loss of the case creation progress.

Batch case upload via CLI

Prerequisites

Download and install the Node.js platform.
Minimum version required: 16
Upgrade existing installation: nvm install --lts

Formatting DRAGEN MANTA VCFs for Emedgene

For DRAGEN versions earlier than 4.2, when ingesting a DRAGEN Manta VCF containing SVs of type INS, replace the following line in the VCF header:

with

Example:

Replace

with

Tertiary analysis pipeline

Supported reference genome assemblies

Both GRCh37/hg19 and GRCh38/hg38 are supported. You can run cases with both reference genomes in the same organization.

Note: Curated and historical data are automatically lifted over on the fly.

Joint calling in Emedgene

Classic joint calling consists of calling variants "simultaneously across all sample BAMs, generating a single call set for the entire cohort."

When running from BAM or FASTQ samples on Emedgene, we do not apply classic joint calling but a BAM look-up methodology.

This methodology consists of retrieving coverage information from BAM during the VCF merging process. Thus, if a variant does not exist in a parental sample, the algorithm will check the coverage in that position using data from the BAM file. The position will be considered as "REF" allele if it is covered (depth > 3), and "No coverage" or "N/A" (./. in the VCF FORMAT/GT field), if it is below that threshold or has no coverage.

This process involves the creation of a “genome coverage” file as a separate preliminary step. The coverage file could also be provided via a BED or a gVCF file.

The BAM look-up approach differs slightly from the classic joint calling used by the joint calling option in DRAGEN and by other variant callers, and therefore will not produce identical results.
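The look-up rule described above can be sketched as a small function. This is only an illustration of the stated rule, not Emedgene's implementation: the function name, the use of `None` for "variant not called" or "no coverage record", and the `0/0` string for a REF genotype are assumptions made for the example.

```python
from typing import Optional

# The text treats a position as covered when depth > 3.
MIN_DEPTH = 3


def lookup_genotype(called_gt: Optional[str], depth: Optional[int]) -> str:
    """Return the genotype to report for one sample at one variant position.

    called_gt: genotype from the sample's own VCF, or None if the variant
               was not called in that sample (hypothetical representation).
    depth:     read depth at the position from the coverage data, or None
               if there is no coverage record (hypothetical representation).
    """
    if called_gt is not None:
        return called_gt  # the variant was called in this sample; keep it
    if depth is not None and depth > MIN_DEPTH:
        return "0/0"      # covered but not called: treated as REF (assumed encoding)
    return "./."          # at or below the threshold, or no coverage


# A parent without the variant: well covered, then poorly covered.
print(lookup_genotype(None, 25))
print(lookup_genotype(None, 2))
```

Because the parental genotype is inferred from depth alone rather than from a joint model of all samples, a low-level variant in a parent can be reported as REF, which is one reason the results differ from classic joint calling.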

Emedgene annotations and update frequency

Every case is annotated with the resources listed in the attached table, including the proprietary Illumina prediction scores PrimateAI-3D and SpliceAI. All annotations are versioned; the versions are recorded in a Versions tab and saved per case. Key variant significance and knowledge graph databases are updated monthly, so that the most up-to-date information is available during analysis.

Attachment: Illumina_Connected_Software_Emedgene_Annotation_Schema.pdf (PDF, 4 MB)

Integrating variant annotations from multiple sources

The Emedgene pipeline prioritizes variant annotations based on the calling methodology rank order. The first appearance of a variant is annotated according to the following hierarchy:

TARGETED
STAR_ALLELE
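As a rough illustration of rank-based prioritization, the sketch below picks which caller's annotation wins when the same variant is reported by several calling methodologies. It uses only the two hierarchy entries listed above; the function name, the treatment of unlisted callers as lowest priority, and the `OTHER_CALLER` label are assumptions made for the example.

```python
from typing import List

# Only the hierarchy entries named in the text, highest priority first.
CALLER_RANK = ["TARGETED", "STAR_ALLELE"]


def pick_annotation_source(callers: List[str]) -> str:
    """Return the caller whose annotation is kept for a variant.

    Callers not present in CALLER_RANK are ranked last (an assumption).
    """
    if not callers:
        raise ValueError("variant has no caller")

    def rank(caller: str) -> int:
        if caller in CALLER_RANK:
            return CALLER_RANK.index(caller)
        return len(CALLER_RANK)

    return min(callers, key=rank)


# "OTHER_CALLER" is a hypothetical label used only for illustration.
print(pick_annotation_source(["STAR_ALLELE", "TARGETED"]))
print(pick_annotation_source(["OTHER_CALLER", "STAR_ALLELE"]))
```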

Processing multi-nucleotide variants

Unlike single-nucleotide variants (SNVs), a multi-nucleotide variant (MNV) represents a single event involving multiple consecutive bases. In Emedgene, small variants are treated as components of an MNV if they are located within 2 nucleotides of each other.
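The distance rule can be sketched as a simple grouping of variant positions. This is a minimal illustration under stated assumptions: all positions are on the same chromosome and in the same sample, "distance" is the difference between start positions, neighbors are chained into one group, and phasing and zygosity are ignored. The function name is hypothetical.

```python
from typing import List

# "within a 2-nucleotide distance"
MAX_GAP = 2


def group_mnv_candidates(positions: List[int], max_gap: int = MAX_GAP) -> List[List[int]]:
    """Group small-variant positions that lie within max_gap of their neighbor.

    Returns only groups with two or more positions, since a lone variant
    cannot form an MNV.
    """
    groups: List[List[int]] = []
    for pos in sorted(positions):
        if groups and pos - groups[-1][-1] <= max_gap:
            groups[-1].append(pos)   # close enough to the previous variant
        else:
            groups.append([pos])     # start a new candidate group
    return [group for group in groups if len(group) > 1]


print(group_mnv_candidates([100, 101, 103, 110, 200, 202]))
# [[100, 101, 103], [200, 202]]
```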

Limitations

Annotations from organization databases

Annotations from organization databases appear in various parts of the platform, each showing certain details.

Historic and noise databases

Variant table

Reviewing a case

Individual case page

The user can enter a specific case from the Cases tab by clicking Full details in the corresponding row of the Cases table.

The Individual case page includes:

Top bar: displays the Case ID and case status, and includes the Case interpretation, Edit case info, and Report preview buttons

Case status

The case status reflects the current stage of case processing, either by the Emedgene platform or a genomic analyst.

Case statuses help teams:

Monitor case progress
Track ownership

How to update a case status

A. On the individual case page

In the top bar of the individual case page, click the dropdown icon next to the current case status

Individual case page: Top bar

The Top bar in the Individual case page indicates the Case ID and current case status.

Options available through the Top bar:

Change the case status

Most Likely Candidates and Candidates

To streamline case review, the AI Shortlist pre-selects the list of variants likely to be causative for each case:

Most Likely Candidates

Summary dashboard

The Summary dashboard provides a quick overview of key quality indicators at both the case and sample levels.

Included metrics:

Case quality: Displays the overall case quality status
Sample quality: Reflects sample quality status
Evaluation kit: Specifies the QC BED kit used to evaluate coverage depth and breadth. If no kit is specified at analysis launch, NCBI RefSeqGene is used as the default reference
Custom gene coverage: Indicates whether the coverage of genes in the selected panel meets the expected threshold, as defined by the QC BED
Displays the results of relationship validation, confirming whether the submitted pedigree aligns with genetic data

Sequencing lab information section

The Sequencing lab information section reports the sequencing run details specified during case creation:

Lab
Instrument
Reagents
Kit type
Expected coverage
Protocol

NGS sample quality metrics

NGS sex validation

The Sex validation column indicates whether the biological sex inferred from genomic data matches the sex information provided during case creation. This helps identify potential sample mix-ups or metadata errors before interpretation begins.

Sex validation results:

Pass: Reported sex matches the estimated sex.
Fail: A mismatch was detected between reported and estimated sex.
N/A: QC file not available; validation could not be performed.

Sex validation is performed by comparing the observed homozygous/heterozygous genotype ratio on the X chromosome with the expected ratios:

<2 for females
>2 for males

Prerequisites:

Only high-quality SNVs from targeted regions—either kit-specific or RefSeq coding regions—are used for sex validation
A minimum of 50 variants is required to generate a reliable result. If this threshold is not met, sex validation cannot be performed, and no result is displayed

If the sex was marked as unknown during case creation, the system will display the predicted sex instead of a validation status.
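The ratio check and its prerequisites can be sketched as follows. This is an illustration of the rules stated above, not the platform's code. Assumptions made for the example: the 50-variant minimum applies to the combined count of homozygous and heterozygous X-chromosome SNVs, a sample with no heterozygous calls is treated as having a ratio above 2, a ratio of exactly 2 yields no result, and `None` stands for "no result is displayed". Function names and return strings are hypothetical.

```python
from typing import Optional

MIN_VARIANTS = 50    # minimum number of variants for a reliable result
RATIO_CUTOFF = 2.0   # hom/het ratio: below 2 for females, above 2 for males


def estimate_sex(hom_count: int, het_count: int) -> Optional[str]:
    """Estimate sex from X-chromosome genotype counts, or None if undetermined."""
    if hom_count + het_count < MIN_VARIANTS:
        return None                  # too few variants
    if het_count == 0:
        return "male"                # ratio is unbounded (assumption)
    ratio = hom_count / het_count
    if ratio > RATIO_CUTOFF:
        return "male"
    if ratio < RATIO_CUTOFF:
        return "female"
    return None                      # exactly 2 is not covered by the text


def validate_sex(reported: str, hom_count: int, het_count: int) -> Optional[str]:
    """Return 'Pass', 'Fail', a predicted sex, or None when nothing is shown."""
    estimated = estimate_sex(hom_count, het_count)
    if estimated is None:
        return None
    if reported == "unknown":
        return f"Predicted: {estimated}"
    return "Pass" if reported == estimated else "Fail"


print(validate_sex("female", 30, 40))   # ratio 0.75
print(validate_sex("female", 90, 10))   # ratio 9.0
print(validate_sex("unknown", 90, 10))
```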

Ploidy

The Ploidy column provides results from the DRAGEN Ploidy Estimator, which is designed to detect aneuploidies and determine the sex karyotype in whole genome cases.

Ploidy estimation results:

Pass: All autosomes fall within the expected ploidy range.
Fail: At least one autosome shows a median score outside the expected thresholds (below 0.9 or above 1.1).
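The Pass/Fail rule can be expressed as a short check. This is a sketch under assumptions: the per-autosome median scores are supplied as a dictionary (the input format and function name are hypothetical), and values exactly equal to 0.9 or 1.1 are treated as within range, since the text describes failure as "below 0.9 or above 1.1".

```python
from typing import Dict

LOW, HIGH = 0.9, 1.1   # thresholds quoted in the text


def ploidy_status(autosome_median_scores: Dict[str, float]) -> str:
    """Return 'Fail' if any autosome's median score is outside [LOW, HIGH]."""
    for score in autosome_median_scores.values():
        if score < LOW or score > HIGH:
            return "Fail"
    return "Pass"


# Illustrative values only, not real data.
print(ploidy_status({"chr1": 1.00, "chr21": 1.48}))
print(ploidy_status({"chr1": 0.95, "chr2": 1.10}))
```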

Coverage

The Sample quality section includes coverage metrics for a target region defined by a QC BED file (or RefSeq coding regions if no kit is provided):

Average coverage: Average depth of coverage for a target region
% Bases with coverage >10x: Percentage of a target region that is covered at a minimum depth of 10x
% Bases with coverage >20x: Percentage of a target region that is covered at a minimum depth of 20x

Blue bars represent each of these parameters per sample, while a vertical line represents a general metric across all the samples of the same case type in the account.
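To make the three coverage metrics concrete, the sketch below computes them from a list of per-base depths over a target region. It is an illustration only: the input format and function name are assumptions, and the thresholds are applied inclusively (depth of at least 10x or 20x), following the "minimum depth" wording in the descriptions.

```python
from typing import Dict, List


def coverage_metrics(depths: List[int]) -> Dict[str, float]:
    """Compute average coverage and the percentage of bases at >=10x and >=20x.

    depths: read depth at each base of the target region (hypothetical input).
    """
    if not depths:
        raise ValueError("target region has no positions")
    n = len(depths)
    return {
        "average_coverage": sum(depths) / n,
        "pct_bases_10x": 100.0 * sum(d >= 10 for d in depths) / n,
        "pct_bases_20x": 100.0 * sum(d >= 20 for d in depths) / n,
    }


# Four bases with depths 5, 10, 20 and 25.
print(coverage_metrics([5, 10, 20, 25]))
# {'average_coverage': 15.0, 'pct_bases_10x': 75.0, 'pct_bases_20x': 50.0}
```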

Percentage of mapped reads

Percentage of reads mapped to the reference sequence.

Blue bars represent each of these parameters per sample, while a vertical line represents a general metric across all the samples of the same case type in the account.

Sequencing error rate

Sequencing error rate refers to the frequency at which incorrect base calls are made during the sequencing process.

Blue bars represent each of these parameters per sample, while a vertical line represents a general metric across all the samples of the same case type in the account.

Array sample quality metrics

Array sample quality

The Quality status provides a quick assessment of array data reliability for each sample:

High: Call rate ≥ 0.99 and Log R dev ≤ 0.2
Low: If either condition is not met
N/A: If the QC file is not available

Use the Quality status to quickly screen whether a sample meets minimal QC thresholds before starting detailed interpretation.
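The Quality status rule maps directly onto a small function. This is a sketch of the stated thresholds; the function name and the use of `None` to represent a missing QC file are assumptions made for the example.

```python
from typing import Optional

MIN_CALL_RATE = 0.99
MAX_LOG_R_DEV = 0.2


def array_quality(call_rate: Optional[float], log_r_dev: Optional[float]) -> str:
    """Return 'High', 'Low', or 'N/A' for an array sample."""
    if call_rate is None or log_r_dev is None:
        return "N/A"     # QC file not available (represented here as None)
    if call_rate >= MIN_CALL_RATE and log_r_dev <= MAX_LOG_R_DEV:
        return "High"
    return "Low"         # at least one condition is not met


print(array_quality(0.995, 0.15))
print(array_quality(0.995, 0.31))
print(array_quality(None, None))
```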

Array sex validation

Sex validation results:

Pass: Reported sex matches the estimated sex.
Fail: A mismatch was detected between reported and estimated sex.
N/A: QC file not available; validation could not be performed.

If the sex was marked as unknown during case creation, the system will display the predicted sex instead of a validation status.

CNV overall ploidy

The CNV overall ploidy field displays the ploidy value extracted from the CNV VCF header. If no CNV VCF file is provided, "N/A" is displayed.

Displayed to three decimal places.

The value is shown as is. The system does not validate or flag abnormal ploidy values. Interpret ploidy in context.

Autosomal call rate

The Autosomal call rate field displays the percentage of autosomal loci on the array for which a genotype call was successfully made.

A high call rate indicates a high-quality sample and successful genotyping. Low call rates can signify problems with the DNA sample (poor quality or quantity) or issues during the array processing.

Displayed to three decimal places.

Call rate

The Call rate field displays the percentage of loci on the array for which a genotype call was successfully made.

Call rate is one of the key metrics used to determine array sample quality, alongside log R deviation.

A high call rate indicates a high-quality sample and successful genotyping. Low call rates can signify problems with the DNA sample (poor quality or quantity) or issues during the array processing.

Displayed to three decimal places.

Log R deviation

The Log R deviation (or Log R Ratio standard deviation) quantifies the variability of the signal intensity for each SNP marker on an array, i.e., the noise level.

Log R deviation is one of the key metrics used to determine array sample quality, alongside call rate.

Lower values indicate more consistent signal intensities. A high Log R Deviation can indicate a poor-quality sample or potential issues with CNV calling.

Displayed to three decimal places.

DRAGEN QC report

The DRAGEN QC report is generated by the Illumina DRAGEN Bio-IT Platform and covers the entire analysis workflow, from raw sequencing reads to variant calls.

DRAGEN QC report formats

Interactive HTML summary: A visual summary that includes interactive plots of key quality metrics. This report can be opened from the Sample quality section of the Lab tab.

Prerequisites for accessing the DRAGEN QC report

NGS case

Option 1: FASTQ case

Review interactive DRAGEN report

When available, a DRAGEN report link appears below the sample name in the Sample quality section of the Lab tab. Clicking the link opens the detailed quality control metrics report in a new browser tab. This integration allows users to quickly assess sequencing quality and confidently interpret results—without leaving the Emedgene interface.

Download DRAGEN QC metrics files

Sample-level QC metrics files for all samples in a case can be downloaded by clicking the download icon next to the Sample quality section title.

For NGS cases, the report includes coverage and mapping statistics.

For array cases, metrics include array QC values such as call rate, autosomal call rate, and Log R dev.
