Title: | Gene ID Mapping for Genotype-Tissue Expression (GTEx) Data |
---|---|
Description: | Convert 'Ensembl' gene identifiers from Genotype-Tissue Expression (GTEx) data to identifiers in other annotation systems, including 'Entrez', 'HGNC', and 'UniProt'. |
Authors: | Nan Xiao [aut, cre], Gao Wang [aut], Lei Sun [aut] |
Maintainer: | Nan Xiao <[email protected]> |
License: | GPL-3 | file LICENSE |
Version: | 1.9 |
Built: | 2024-10-26 05:41:56 UTC |
Source: | https://github.com/nanxstats/grex |
Remove the '.version' suffix from raw GTEx (GENCODE) gene IDs to produce Ensembl IDs.
cleanid(gtex_id)
gtex_id | Character vector of GTEx (GENCODE) gene IDs |
Character vector of Ensembl IDs
gtex_id <- c("ENSG00000227232.4", "ENSG00000223972.4", "ENSG00000268020.2")
cleanid(gtex_id)
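To illustrate what removing the version suffix means, here is a minimal base R sketch. The helper `strip_version()` is hypothetical and is not the package's own implementation; it assumes the version is always a trailing dot followed by digits.

```r
# Hypothetical helper (an assumption, not grex's code):
# drop a trailing ".<digits>" version suffix from each ID.
strip_version <- function(ids) {
  sub("\\.[0-9]+$", "", ids)
}

gtex_id <- c("ENSG00000227232.4", "ENSG00000223972.4", "ENSG00000268020.2")
strip_version(gtex_id)
# "ENSG00000227232" "ENSG00000223972" "ENSG00000268020"
```

IDs without a version suffix pass through unchanged under this pattern. Since `cleanid()` returns Ensembl IDs, its output can be passed directly to `grex()`.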
Map Ensembl IDs to Entrez Gene IDs, HGNC symbols, and UniProt IDs, along with basic annotation information such as gene type.
grex(ensembl_id)
ensembl_id | Character vector of Ensembl IDs |
This function returns a data frame with the same number of rows as the length of input Ensembl IDs, containing:
ensembl_id
- Input Ensembl ID
entrez_id
- Entrez Gene ID
hgnc_symbol
- HGNC gene symbol
hgnc_name
- HGNC gene name
cyto_loc
- Cytogenetic location
uniprot_id
- UniProt ID
gene_biotype
- Gene type
Elements that cannot be mapped will be NA.
# Ensembl IDs in GTEx v6p gene count data
data("gtexv6p")
# Select 100 IDs as an example
id <- gtexv6p[101:200]
df <- grex(id)
# Rows that have a mapped Entrez ID
df[!is.na(df$entrez_id), c("ensembl_id", "entrez_id", "gene_biotype")]
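Because unmapped elements are NA, mapping coverage can be summarized per column. The sketch below uses a toy data frame as a stand-in for a `grex()` result. Every value in it is a placeholder rather than a real annotation, and the column types are an assumption.

```r
# Toy stand-in for a grex() result; all values are placeholders, not real annotations.
df <- data.frame(
  ensembl_id   = c("ID_A", "ID_B", "ID_C", "ID_D"),
  entrez_id    = c("111", NA, "333", NA),
  hgnc_symbol  = c("SYM_A", NA, "SYM_C", "SYM_D"),
  uniprot_id   = c("P_A", NA, NA, NA),
  gene_biotype = c("protein_coding", NA, "lincRNA", "pseudogene"),
  stringsAsFactors = FALSE
)

# Fraction of input IDs with a non-NA value in each target column
coverage <- colMeans(!is.na(df[, c("entrez_id", "hgnc_symbol", "uniprot_id")]))
coverage

# Input IDs that could not be mapped to an Entrez Gene ID
unmapped <- df$ensembl_id[is.na(df$entrez_id)]
unmapped
```

The same two expressions should work on an actual result, since the documented return value has one row per input ID with these column names.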
A dataset containing the Ensembl IDs from GTEx (V6) gene read count data.
gtexv6
A character vector with 56,318 Ensembl IDs.
A dataset containing the Ensembl IDs from GTEx (V6p) gene read count data.
gtexv6p
A character vector with 56,238 Ensembl IDs.
A dataset containing the Ensembl IDs from GTEx (V7) gene read count data.
gtexv7
A character vector with 56,202 Ensembl IDs.
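The three datasets differ in length, so their ID sets are not identical. One way to compare two releases is with base R set operations, sketched here on toy vectors. The IDs below are placeholders, not real GTEx IDs.

```r
# Toy stand-ins for the ID vectors of two releases; placeholders only.
ids_old <- c("ID_A", "ID_B", "ID_C", "ID_D")
ids_new <- c("ID_A", "ID_C", "ID_E")

dropped <- setdiff(ids_old, ids_new)    # present only in the older release
added   <- setdiff(ids_new, ids_old)    # present only in the newer release
shared  <- intersect(ids_old, ids_new)  # present in both releases

dropped
added
shared
```

With the package's data loaded, the same calls could be applied to a pair such as `gtexv6` and `gtexv7`, since both are character vectors.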