Run metabolism pathway scoring
Usage
RunMetabolism(
srt,
assay = NULL,
group.by = NULL,
layer = "counts",
db = c("KEGG", "REACTOME"),
species = "Homo_sapiens",
IDtype = "symbol",
db_update = FALSE,
db_version = "latest",
convert_species = TRUE,
Ensembl_version = NULL,
mirror = NULL,
biomart = NULL,
max_tries = 5,
use_preparedb = TRUE,
method = c("AUCell", "GSVA", "ssGSEA", "VISION"),
backend = c("cpp", "r"),
cpp_strategy = c("sparse", "topk", "full"),
cpp_chunk_size = NULL,
minGSSize = 10,
maxGSSize = 500,
assay_name = "METABOLISM",
new_assay = TRUE,
seed = 11,
verbose = TRUE
)Arguments
- srt
A Seurat object.
- assay
Assay to use as expression matrix. Default is
DefaultAssay(srt).- group.by
Name of metadata column to group cells by. If
NULL, single-cell scoring. If provided, expression is averaged by group before scoring (cell-type level).- layer
Data layer to use, usually
"counts"for count matrix.- db
Databases to use for metabolism pathways. One or both of
"KEGG","REACTOME"."Reactome"is also accepted and treated identically to"REACTOME". Whenuse_preparedb = TRUE, gene sets are built via PrepareDB.- species
Species of the input data. The scMetabolism gene sets contain human gene symbols. When
speciesis not"Homo_sapiens"andconvert_speciesisTRUE, GeneConvert is used to map human genes to the target species via biomaRt homolog tables. Default is"Homo_sapiens".- IDtype
A character vector specifying the type of gene IDs in the
srtobject orgeneIDargument. This argument is used to convert the gene IDs to a different type ifIDtypeis different fromresult_IDtype.- db_update
Whether the gene annotation databases should be forcefully updated. If set to FALSE, the function will attempt to load the cached databases instead. Default is
FALSE.- db_version
A character vector specifying the version of the gene annotation databases to be retrieved. Default is
"latest".- convert_species
Whether to convert human gene symbols from the scMetabolism gene sets to the target species using GeneConvert. When
TRUE(default), genes are mapped via cross-species orthologs from Ensembl BioMart. WhenFALSE, only case-insensitive direct symbol matching is used.- Ensembl_version
An integer specifying the Ensembl version. Default is
NULL. IfNULL, the latest version will be used.- mirror
Specify an Ensembl mirror to connect to. The valid options here are
"www","uswest","useast","asia".- biomart
BioMart database name passed to GeneConvert. Default
NULLuses"ensembl". Other options:"protists_mart","fungi_mart","plants_mart".- max_tries
Maximum retry attempts for biomaRt connections in GeneConvert. Default is
5.- use_preparedb
When
TRUE, gene sets are built via PrepareDB which provides species-aware gene mapping via BioMart and KEGG/Reactome databases. This automatically handles gene symbol conversion for non-human species (e.g.,species = "Mus_musculus"→ mouse gene symbols in metabolism pathways). WhenFALSE, raw scMetabolism GMT files are downloaded and genes are matched case-insensitively with optional GeneConvert supplementation whenconvert_species = TRUE. genes and approximates zero ties,"topk"ranks only genes that can contribute to AUCell AUC, and"full"ranks all genes.- method
Scoring method, one of
"AUCell","GSVA","ssGSEA","VISION".- backend
Scoring backend.
"cpp"is the default for supported methods."r"uses the original R package implementation."cpp"currently supportsmethod = "AUCell",method = "GSVA", andmethod = "ssGSEA".method = "VISION"falls back to"r"whenbackendis not explicitly set. AUCell C++ scores may differ from the R backend when tied expression values are randomly ranked.- cpp_strategy
C++ AUCell ranking strategy.
"sparse"ranks non-zero- cpp_chunk_size
Optional cell chunk size for C++ GSVA kernels.
NULLor"auto"automatically chunks large matrices to reduce peak dense intermediate memory; positive values set the chunk size manually.- minGSSize
The minimum size of a gene set to be considered in the enrichment analysis.
- maxGSSize
The maximum size of a gene set to be considered in the enrichment analysis.
- assay_name
Name of the assay to store metabolism scores when
new_assay = TRUE. Default is"METABOLISM".- new_assay
Whether to create a new assay for metabolism scores when
group.by = NULL. Default isTRUE.- seed
Random seed for reproducibility. Default is
11.- verbose
Whether to print the message. Default is
TRUE.
Value
Returns a Seurat object. When group.by = NULL, stores scores in assay assay_name and tools.
When group.by is provided, stores in tools slot Metabolism_<group.by>_<method> for MetabolismPlot.
Examples
data(pancreas_sub)
pancreas_sub <- standard_scop(pancreas_sub)
pancreas_sub <- RunMetabolism(
pancreas_sub,
assay = "RNA",
layer = "counts",
db = c("KEGG", "REACTOME"),
group.by = "CellType",
species = "Mus_musculus",
method = "AUCell"
)
ht <- MetabolismPlot(
pancreas_sub,
group.by = "CellType",
plot_type = "heatmap",
topTerm = 10,
width = 1,
height = 2
)