# Default region of interest kits

A region of interest (ROI) BED[^1] file determines which genomic regions will be included in the variant analysis. It functions as a preprocessing filter, determining which variants proceed to annotation and interpretation.
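As an illustration of the filtering idea, here is a minimal Python sketch that parses BED lines and keeps only the variants falling inside a region. It is not the platform's implementation: the function names `parse_bed` and `in_roi` and the sample data are hypothetical. It relies only on the BED convention that coordinates are 0-based and half-open, while variant positions (as in VCF) are 1-based.

```python
from collections import defaultdict


def parse_bed(lines):
    """Parse BED lines into {chrom: [(start, end), ...]} (0-based, half-open)."""
    regions = defaultdict(list)
    for line in lines:
        line = line.strip()
        # Skip blank lines and optional header/comment lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        regions[chrom].append((int(start), int(end)))
    return dict(regions)


def in_roi(regions, chrom, pos):
    """Return True if a 1-based position lies inside any region on chrom."""
    pos0 = pos - 1  # convert the 1-based position to 0-based
    return any(start <= pos0 < end for start, end in regions.get(chrom, []))


# Hypothetical ROI and variant positions, for illustration only.
bed_lines = ["chr1\t100\t200", "chr2\t50\t60"]
variants = [("chr1", 101), ("chr1", 201), ("chr2", 60), ("chr3", 5)]

roi = parse_bed(bed_lines)
kept = [v for v in variants if in_roi(roi, *v)]
```

The linear scan keeps the sketch short; for a genome-scale BED file, an interval index or a sorted-list search would be the more practical choice.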

## Default ROI kits by case type

If no custom ROI BED kit is applied to a case, the system applies a default ROI BED file based on the case type. All default ROI BED files are available for download (see [Default ROI kit details](#default-roi-kit-details)).

<table data-full-width="false"><thead><tr><th width="163">Case type</th><th width="242">Default region of interest BED</th></tr></thead><tbody><tr><td><strong>Research Genome</strong></td><td><a data-footnote-ref href="#user-content-fn-2"><strong>None</strong></a></td></tr><tr><td><strong>Whole Genome</strong></td><td><a href="#full-genes"><strong>Full Genes</strong></a></td></tr><tr><td><strong>Exome</strong></td><td><a href="#clinical-regions"><strong>Clinical Regions</strong></a></td></tr><tr><td><strong>Custom Panel</strong></td><td><a href="#clinical-regions"><strong>Clinical Regions</strong></a></td></tr></tbody></table>
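The fallback rule in the table can be sketched as a simple lookup. This is only an illustration of the logic described above; `DEFAULT_ROI_BY_CASE_TYPE` and `resolve_roi` are hypothetical names, not part of any product API.

```python
# Default ROI kit per case type, as listed in the table above.
# None means no ROI filter is applied (all variants are displayed).
DEFAULT_ROI_BY_CASE_TYPE = {
    "Research Genome": None,
    "Whole Genome": "Full Genes",
    "Exome": "Clinical Regions",
    "Custom Panel": "Clinical Regions",
}


def resolve_roi(case_type, custom_roi=None):
    """Return the custom ROI kit if one is applied, else the case-type default."""
    if custom_roi is not None:
        return custom_roi
    return DEFAULT_ROI_BY_CASE_TYPE[case_type]
```

For example, `resolve_roi("Exome")` yields `"Clinical Regions"`, while `resolve_roi("Exome", custom_roi="my_panel.bed")` yields the custom kit.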

## Default ROI kit details

### Full Genes

A BED file covering a wide range of genomic regions. It contains:

* "RefSeq ALL" transcripts and "GENCODE" full gene regions, extended by 5 kbp upstream and 5 kbp downstream
* Within this range, all “Clinical Regions” are included
* All dosage regions (HI/TS sig level 1, 2 or 3)

In addition, lifted-over versions of both reference regions are included, for the current and previous range versions.
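The 5 kbp extension of gene regions can be pictured with a short sketch. It assumes BED-style 0-based, half-open coordinates and clamps the result to the chromosome bounds. The function name `pad_interval` and the optional `chrom_length` parameter are hypothetical, and the actual files may have been generated differently.

```python
def pad_interval(start, end, flank=5_000, chrom_length=None):
    """Extend a 0-based, half-open interval by `flank` bases on each side."""
    new_start = max(0, start - flank)  # never go below the chromosome start
    new_end = end + flank
    if chrom_length is not None:
        new_end = min(new_end, chrom_length)  # never run past the chromosome end
    return new_start, new_end
```

Because the same distance is added upstream and downstream, the result does not depend on the strand of the gene.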

#### Sources

* Liftover done using CrossMap (v0.5.2), chain hg19ToHg38.over.chain.gz
* NCBI RefSeq regions are based on releases 105 (hg19) and 110 (hg38)
* GENCODE regions are based on releases V19 (hg19) and V41 (hg38)
* All microRNA genes based on HGNC miRNA definition December 2022
* ClinGen dosage regions (Dec 2022)
* Promoters from EPDnew human version 006
* mtDNA CRS
* RNA disease genes based on OMIM and HGNC (Dec 2022): *ATXN8OS, TERC, IL12A-AS1, FAAHP1, NUTM2B-AS1, GAS8-AS1, RNU12, MIR204, IGHG2, SLC7A2-IT1, MIR99A, RMRP, XIST, MEG3, DIRC3, MIR17HG, GNAS-AS1, LRTOMT, LINC00299, DUX4L1, MIR137, MIR140, MIR605, SNORD118, RNU4ATAC, HELLPAR, IGHG1, IGHM, MIR19B1, RNU7-1, LINC00237, MIR2861, MIR4718, IGHV3-21, IGHV4-34, IGKC, KCNQ1OT1, MIR184, MIR96, H19, HYMAI, PCDHA9, UGT1A1, AFG3L2P1, DISC2, SNORA31, TRU-TCA1-1, PCDHGA4, TRAC, ECEL1P3, MIAT*
* ClinVar variants (ClinVar Dec 2022) with any pathogenic or likely pathogenic significance (and some drug responses that are affiliated with pathogenicity)
* 50K STR regions based on the DRAGEN 4.0 Specification file

{% hint style="info" %}
CNV variants are not confined to regions of interest.
{% endhint %}

#### Files

**Download files used in v100.39.0+**

{% columns %}
{% column %}
{% file src="<https://1131024994-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FGCW0DnLlE7QjoZPNmKIi%2Fuploads%2Fgit-blob-41d15a2e16cf510c475259eeff7390361863172c%2FGRC38_full_genes.bed?alt=media>" %}
GRCh38 Full Genes v100.39.0+
{% endfile %}
{% endcolumn %}

{% column %}
{% file src="<https://1131024994-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FGCW0DnLlE7QjoZPNmKIi%2Fuploads%2Fgit-blob-93c9e556ada8101a335bd8f40771ba417309472c%2FGRC37_full_genes.bed?alt=media>" %}
GRCh37 Full Genes v100.39.0+
{% endfile %}
{% endcolumn %}
{% endcolumns %}

**Download files used up to v38.0**

{% columns %}
{% column %}
{% file src="<https://1131024994-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FGCW0DnLlE7QjoZPNmKIi%2Fuploads%2Fgit-blob-eaae20447144636f9287ed5e4cc374fe8fb3c9a0%2FGRC38_full_genes%20(1).bed?alt=media>" %}
GRCh38 Full Genes ≤v38.0
{% endfile %}
{% endcolumn %}

{% column %}
{% file src="<https://1131024994-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FGCW0DnLlE7QjoZPNmKIi%2Fuploads%2Fgit-blob-6676829b9ce205a655dc7ddb791701d0c5f83dd1%2FGRC37_full_genes%20(1).bed?alt=media>" %}
GRCh37 Full Genes ≤v38.0
{% endfile %}
{% endcolumn %}
{% endcolumns %}

### Clinical Regions

This is a BED file that includes every clinically relevant region. The following are included:

* “RefSeq Curated” and “GENCODE” regions with 50 bp flanking areas on each side, and the 5′ UTR and 3′ UTR regions of protein-coding genes (based on RefSeq)
* OMIM disease-related RNA genes (flanking 50bp)
* All ClinVar pathogenic variant regions (flanking 50bp)
* Promoter regions (EPDnew human version 006, flanking 50bp)
* Known STR regions (DRAGEN 4.0 specification file)
* All microRNA genes (flanking 50bp based on HGNC)
* Full mtDNA region

For consistency, the GRCh38 version also includes the GRCh37 regions lifted over to GRCh38 (liftover using CrossMap).
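When many small regions each receive 50 bp of flanking sequence, neighbouring regions can start to overlap. The sketch below pads intervals and merges any that overlap or touch. Whether the distributed files are merged in this way is an assumption, and `pad_and_merge` is a hypothetical name used only for illustration.

```python
def pad_and_merge(intervals, flank=50):
    """Pad 0-based, half-open intervals by `flank`, then merge overlaps."""
    padded = sorted((max(0, start - flank), end + flank) for start, end in intervals)
    merged = []
    for start, end in padded:
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the previous interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged
```

For instance, the regions `(100, 200)` and `(250, 300)` sit 50 bp apart, so after padding they overlap and collapse into the single region `(50, 350)`.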

{% hint style="info" %}
CNV variants are not confined to regions of interest.
{% endhint %}

#### Files

**Download files used in v100.39.0+**

{% columns %}
{% column %}
{% file src="<https://1131024994-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FGCW0DnLlE7QjoZPNmKIi%2Fuploads%2Fgit-blob-54727b742cf6fb284dfa02b06d16f9795733bbd4%2FGRC38_clinical_regions.bed?alt=media>" %}
GRCh38 Clinical Regions v100.39.0+
{% endfile %}
{% endcolumn %}

{% column %}
{% file src="<https://1131024994-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FGCW0DnLlE7QjoZPNmKIi%2Fuploads%2Fgit-blob-7861dba4945064c88c8f337f287c6aea4abf4b15%2FGRC37_clinical_regions.bed?alt=media>" %}
GRCh37 Clinical Regions v100.39.0+
{% endfile %}
{% endcolumn %}
{% endcolumns %}

**Download files used up to v38.0**

{% columns %}
{% column %}
{% file src="<https://1131024994-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FGCW0DnLlE7QjoZPNmKIi%2Fuploads%2Fgit-blob-f75d399a8f7b34df00ad9b48335acdccaaaaab60%2FGRC38_clinical_regions%20(1).bed?alt=media>" %}
GRCh38 Clinical Regions ≤v38.0
{% endfile %}
{% endcolumn %}

{% column %}
{% file src="<https://1131024994-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FGCW0DnLlE7QjoZPNmKIi%2Fuploads%2Fgit-blob-14d38c46bcbcfac74bc1e5b407657f08692a7fa8%2FGRC37_clinical_regions%20(1).bed?alt=media>" %}
GRCh37 Clinical Regions ≤v38.0
{% endfile %}
{% endcolumn %}
{% endcolumns %}

[^1]: BED (Browser Extensible Data): a text file format used to store genomic regions as coordinates and associated annotations.

[^2]: All variants are displayed. Note that including all variants can significantly increase pipeline processing time compared with restricting the analysis to regions of interest. Additionally, annotation of intergenic variants is limited.
