Package 'grex'

Title: Gene ID Mapping for Genotype-Tissue Expression (GTEx) Data
Description: Convert 'Ensembl' gene identifiers from Genotype-Tissue Expression (GTEx) data to identifiers in other annotation systems, including 'Entrez', 'HGNC', and 'UniProt'.
Authors: Nan Xiao [aut, cre], Gao Wang [aut], Lei Sun [aut]
Maintainer: Nan Xiao <[email protected]>
License: GPL-3 | file LICENSE
Version: 1.9
Built: 2024-08-26 03:47:20 UTC
Source: https://github.com/nanxstats/grex

Help Index


Remove Version Numbers in Raw GTEx (GENCODE) Gene IDs

Description

Remove the '.version' suffix from raw GTEx (GENCODE) gene IDs to produce Ensembl IDs.

Usage

cleanid(gtex_id)

Arguments

gtex_id

Character vector of GTEx (GENCODE) gene IDs

Value

Character vector of Ensembl IDs

Examples

gtex_id <- c("ENSG00000227232.4", "ENSG00000223972.4", "ENSG00000268020.2")
cleanid(gtex_id)
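
The version stripping described above can be approximated in base R. The following is a minimal sketch, not the package's actual implementation: it assumes the version is a trailing dot followed by digits, and the helper name strip_version is hypothetical.

```r
# Hypothetical helper (not part of grex): drop a trailing ".<digits>" version
# suffix from each ID. IDs without such a suffix are returned unchanged.
strip_version <- function(ids) {
  sub("\\.[0-9]+$", "", ids)
}

gtex_id <- c("ENSG00000227232.4", "ENSG00000223972.4", "ENSG00000268020.2")
ensembl_id <- strip_version(gtex_id)
print(ensembl_id)
```

Anchoring the pattern at the end of the string means only the final '.version' part is removed, so an ID that carries no version passes through untouched.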

Gene ID Mapping for Genotype-Tissue Expression (GTEx) Data

Description

Map Ensembl IDs to Entrez Gene IDs, HGNC symbols, and UniProt IDs, along with basic annotation information such as gene type.

Usage

grex(ensembl_id)

Arguments

ensembl_id

Character vector of Ensembl IDs

Value

A data frame with one row per input Ensembl ID (the same number of rows as the length of the input), containing the following columns:

  • ensembl_id - Input Ensembl ID

  • entrez_id - Entrez Gene ID

  • hgnc_symbol - HGNC gene symbol

  • hgnc_name - HGNC gene name

  • cyto_loc - Cytogenetic location

  • uniprot_id - UniProt ID

  • gene_biotype - Gene type

Elements that cannot be mapped are returned as NA.
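
To illustrate the shape of such a result (one row per input ID, with NA where no mapping exists), here is a small base R sketch. It is not how grex() is implemented; the lookup table, its toy IDs and values, and the helper name map_ids are all invented for illustration.

```r
# Toy lookup table. The IDs and values are invented placeholders,
# not real annotation data and not the package's internal table.
lookup <- data.frame(
  ensembl_id = c("ENSG_TOY_A", "ENSG_TOY_B"),
  entrez_id = c(111L, 222L),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return one row per input ID, in input order.
# IDs missing from the lookup table get NA in the mapped columns.
map_ids <- function(ids, lookup_table) {
  idx <- match(ids, lookup_table$ensembl_id)  # NA for unmatched IDs
  out <- lookup_table[idx, , drop = FALSE]    # NA index yields an all-NA row
  out$ensembl_id <- ids                       # restore the input ID in every row
  rownames(out) <- NULL
  out
}

res <- map_ids(c("ENSG_TOY_B", "ENSG_TOY_X", "ENSG_TOY_A"), lookup)
print(res)
```

Because the output is indexed by position in the input, unmapped IDs keep their row, which is what makes filtering with is.na() on a mapped column straightforward.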

Examples

# Ensembl IDs in GTEx V6p gene read count data
data("gtexv6p")
# Select 100 IDs as an example
id <- gtexv6p[101:200]
df <- grex(id)
# Keep only the rows that have a mapped Entrez ID
df[
  !is.na(df$entrez_id),
  c("ensembl_id", "entrez_id", "gene_biotype")
]

Ensembl IDs from GTEx V6 Gene Read Count Data

Description

A dataset containing the Ensembl IDs from GTEx (V6) gene read count data.

Usage

gtexv6

Format

A character vector with 56,318 Ensembl IDs.

Source

https://www.gtexportal.org


Ensembl IDs from GTEx V6p Gene Read Count Data

Description

A dataset containing the Ensembl IDs from GTEx (V6p) gene read count data.

Usage

gtexv6p

Format

A character vector with 56,238 Ensembl IDs.

Source

https://www.gtexportal.org


Ensembl IDs from GTEx V7 Gene Read Count Data

Description

A dataset containing the Ensembl IDs from GTEx (V7) gene read count data.

Usage

gtexv7

Format

A character vector with 56,202 Ensembl IDs.

Source

https://www.gtexportal.org