This function can convert different gene ID types within one species or between two species using the biomart service.
Usage
GeneConvert(
geneID,
geneID_from_IDtype = "symbol",
geneID_to_IDtype = "entrez_id",
species_from = "Homo_sapiens",
species_to = NULL,
Ensembl_version = NULL,
biomart = NULL,
mirror = NULL,
max_tries = 5,
verbose = TRUE
)
Arguments
- geneID
A vector of the geneID character.
- geneID_from_IDtype
Gene ID type of the input
geneID
. e.g."symbol"
,"ensembl_id"
,"entrez_id"
- geneID_to_IDtype
Gene ID type(s) to convert to. e.g.
"symbol"
,"ensembl_id"
,"entrez_id"
.- species_from
Latin names for animals of the input geneID. e.g.
"Homo_sapiens"
,"Mus_musculus"
.- species_to
Latin names for animals of the output geneID. e.g.
"Homo_sapiens"
,"Mus_musculus"
.- Ensembl_version
Ensembl database version. If NULL, use the current release version.
- biomart
The name of the BioMart database that you want to connect to. Possible options include
"ensembl"
,"protists_mart"
,"fungi_mart"
, and"plants_mart"
.- mirror
Specify an Ensembl mirror to connect to. The valid options here are
"www"
,"uswest"
,"useast"
,"asia"
.- max_tries
The maximum number of attempts to connect with the BioMart service.
- verbose
Whether to print the message. Default is
TRUE
.
Value
A list with the following elements:
geneID_res:
A data.frame contains the all gene IDs mapped in the database with columns:"from_IDtype"
,"from_geneID"
,"to_IDtype"
,"to_geneID"
.geneID_collapse:
The data.frame contains all the successfully converted gene IDs, and the output gene IDs are collapsed into a list. As a result, the"from_geneID"
column (which is set as the row names) of the data.frame is unique.geneID_expand:
The data.frame contains all the successfully converted gene IDs, and the output gene IDs are expanded.Ensembl_version:
Ensembl database version.Datasets:
Datasets available in the selected BioMart database.Attributes:
Attributes available in the selected BioMart database.geneID_unmapped:
A character vector of gene IDs that are unmapped in the database.
Examples
if (FALSE) { # \dontrun{
res <- GeneConvert(
geneID = c("CDK1", "MKI67", "TOP2A", "AURKA", "CTCF"),
species_from = "Homo_sapiens",
species_to = "Mus_musculus"
)
str(res)
# Convert the human genes to mouse homologs,
# and replace the raw counts in a Seurat object.
data(pancreas_sub)
counts <- GetAssayData5(
pancreas_sub,
assay = "RNA",
layer = "counts"
)
res <- GeneConvert(
geneID = rownames(counts),
geneID_from_IDtype = "symbol",
geneID_to_IDtype = "symbol",
species_from = "Mus_musculus",
species_to = "Homo_sapiens"
)
homologs_counts <- stats::aggregate(
x = counts[res$geneID_expand[, "from_geneID"], ],
by = list(res$geneID_expand[, "symbol"]), FUN = sum
)
rownames(homologs_counts) <- homologs_counts[, 1]
homologs_counts <- methods::as(
thisutils::as_matrix(homologs_counts[, -1]),
"dgCMatrix"
)
homologs_counts[1:5, 1:5]
} # }