Run cell-level quality control
Usage
RunCellQC(
  srt,
  assay = "RNA",
  split.by = NULL,
  group.by = NULL,
  return_filtered = FALSE,
  qc_metrics = c("doublets", "decontX", "atac", "outlier", "umi", "gene", "mito", "ribo",
    "ribo_mito_ratio", "species"),
  db_method = "scDblFinder",
  db_rate = NULL,
  db_coefficient = 0.01,
  decontX_threshold = NULL,
  decontX_batch = NULL,
  decontX_background = NULL,
  decontX_background_assay = NULL,
  decontX_bg_batch = NULL,
  decontX_assay_name = "decontXcounts",
  decontX_store_assay = FALSE,
  decontX_round_counts = TRUE,
  decontX_args = list(),
  atac_args = list(),
  outlier_threshold = c("log10_nCount:lower:2.5", "log10_nCount:higher:5",
    "log10_nFeature:lower:2.5", "log10_nFeature:higher:5", "featurecount_dist:lower:2.5"),
  outlier_n = 1,
  UMI_threshold = 3000,
  gene_threshold = 1000,
  mito_threshold = 20,
  mito_pattern = c("MT-", "Mt-", "mt-"),
  mito_gene = NULL,
  ribo_threshold = 50,
  ribo_pattern = c("RP[SL]\\d+\\w{0,1}\\d*$", "Rp[sl]\\d+\\w{0,1}\\d*$",
    "rp[sl]\\d+\\w{0,1}\\d*$"),
  ribo_gene = NULL,
  ribo_mito_ratio_range = c(1, Inf),
  species = NULL,
  species_gene_prefix = NULL,
  species_percent = 95,
  seed = 11
)
Arguments
- srt
A Seurat object.
- assay
The name of the assay to be used for doublet-calling. Default is "RNA".
- split.by
Name of a meta.data column used to split the object before QC. Default is NULL. When specified, QC and doublet-calling are performed separately within each split object, and the results are merged back afterward.
- group.by
Group labels passed to RunDecontX() when "decontX" is included in qc_metrics. Can be NULL, a meta.data column name, or a vector aligned to cells. Default is NULL.
- return_filtered
Logical indicating whether to return a cell-filtered Seurat object. Default is FALSE.
- qc_metrics
A character vector specifying the quality control metrics to be applied. Available metrics are "doublets", "decontX", "atac", "outlier", "umi", "gene", "mito", "ribo", "ribo_mito_ratio", and "species". Default is c("doublets", "decontX", "atac", "outlier", "umi", "gene", "mito", "ribo", "ribo_mito_ratio", "species"). For a ChromatinAssay, if qc_metrics is not supplied, the default is "atac".
- db_method
Method used for doublet-calling. Can be one of "scDblFinder", "Scrublet", "DoubletDetection", "scds_cxds", "scds_bcds", or "scds_hybrid". Default is "scDblFinder". The resulting doublet labels are aggregated afterward into db_qc and do not affect the thresholds used by the other QC metrics.
- db_rate
The expected doublet rate. Default is NULL, in which case the rate is calculated as ncol(srt) / 1000 * db_coefficient.
- db_coefficient
The coefficient used to calculate the doublet rate when db_rate is not supplied. Default is 0.01. The doublet rate is calculated as ncol(srt) / 1000 * db_coefficient.
- decontX_threshold
Optional contamination threshold used to filter cells after running RunDecontX(). Cells with decontX_contamination greater than this value are marked as failed in decontX_qc. Default is NULL, which computes decontX results without filtering cells by contamination.
- decontX_batch
Batch labels passed to RunDecontX() when "decontX" is included in qc_metrics. Default is NULL.
- decontX_background
Optional background / empty-droplet input passed to RunDecontX() when "decontX" is included in qc_metrics. Default is NULL.
- decontX_background_assay
Assay name used when decontX_background is a Seurat object or SingleCellExperiment. Default is NULL.
- decontX_bg_batch
Batch labels for decontX_background passed to RunDecontX(). Default is NULL.
- decontX_assay_name
Name of the assay used to store decontaminated counts from RunDecontX(). Default is "decontXcounts".
- decontX_store_assay
Whether to store decontaminated counts as a new assay when running RunDecontX(). Default is FALSE.
- decontX_round_counts
Whether to round decontaminated counts before creating the assay in RunDecontX(). Default is TRUE.
- decontX_args
A named list of additional advanced arguments passed to RunDecontX() when "decontX" is included in qc_metrics. Explicit decontX_* parameters are preferred for common options and take precedence when both are supplied. Default is list().
- atac_args
A named list of additional arguments passed to RunATACQC() when "atac" is included in qc_metrics. Threshold arguments from RunATACQC() are used to label failed cells in atac_qc, but filtering is deferred to RunCellQC(). Default is list().
- outlier_threshold
A character vector specifying the outlier thresholds, each written as "metric:direction:value". Default is c("log10_nCount:lower:2.5", "log10_nCount:higher:5", "log10_nFeature:lower:2.5", "log10_nFeature:higher:5", "featurecount_dist:lower:2.5").
- outlier_n
Minimum number of outlier_threshold conditions a cell must meet to be labeled an outlier. Default is 1.
- UMI_threshold
UMI count threshold. Cells whose UMI count exceeds this threshold are kept. Default is 3000.
- gene_threshold
Detected gene number threshold. Cells whose gene number exceeds this threshold are kept. Default is 1000.
- mito_threshold
Threshold on the percentage of UMI counts from mitochondrial genes. Cells that exceed this threshold are discarded. Default is 20.
- mito_pattern
Regex patterns used to match mitochondrial genes. Default is c("MT-", "Mt-", "mt-").
- mito_gene
A character vector of mitochondrial gene names. If provided, mito_pattern matching is ignored. Default is NULL.
- ribo_threshold
Threshold on the percentage of UMI counts from ribosomal genes. Cells that exceed this threshold are discarded. Default is 50.
- ribo_pattern
Regex patterns used to match ribosomal genes. Default is c("RP[SL]\\d+\\w{0,1}\\d*$", "Rp[sl]\\d+\\w{0,1}\\d*$", "rp[sl]\\d+\\w{0,1}\\d*$").
- ribo_gene
A character vector of ribosomal gene names. If provided, ribo_pattern matching is ignored. Default is NULL.
- ribo_mito_ratio_range
A numeric vector of length 2 specifying the range of ribosomal/mitochondrial gene expression ratios used to identify ribo_mito_ratio outlier cells. Default is c(1, Inf).
- species
Species used as the suffix of the QC metrics. The first element is the species of interest. Default is NULL.
- species_gene_prefix
Species gene prefix used to calculate QC metrics for each species. Default is NULL.
- species_percent
Threshold on the percentage of UMI counts from the first species. Cells that exceed this threshold are kept. Default is 95.
- seed
Random seed for reproducibility. Default is 11.
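The mito_pattern, ribo_pattern, mito_threshold, and ribo_threshold arguments can be illustrated with a small base R sketch. This is not the internal code of RunCellQC(): the toy count matrix is made up, anchoring the patterns at the start of the gene name is an assumption, and the "Pass" label is assumed (only "Fail" appears on this page).

```r
# Toy count matrix: genes in rows, cells in columns (values are made up).
counts <- matrix(
  c(
    50,  5,  # MT-CO1
    30,  5,  # RPS6
    20, 10,  # RPL13A
    10, 40,  # RPS6KA1 (a kinase, not a ribosomal protein)
    90, 40   # ACTB
  ),
  ncol = 2, byrow = TRUE,
  dimnames = list(
    c("MT-CO1", "RPS6", "RPL13A", "RPS6KA1", "ACTB"),
    c("cell_a", "cell_b")
  )
)

# Default patterns as listed in the Usage section.
mito_pattern <- c("MT-", "Mt-", "mt-")
ribo_pattern <- c(
  "RP[SL]\\d+\\w{0,1}\\d*$", "Rp[sl]\\d+\\w{0,1}\\d*$", "rp[sl]\\d+\\w{0,1}\\d*$"
)

# Assumption: patterns are matched from the start of the gene name.
match_genes <- function(patterns, genes) {
  grepl(paste0("^(", paste(patterns, collapse = "|"), ")"), genes, perl = TRUE)
}
is_mito <- match_genes(mito_pattern, rownames(counts))
is_ribo <- match_genes(ribo_pattern, rownames(counts))

# Percentage of each cell's UMI counts that comes from the matched genes.
percent_mito <- colSums(counts[is_mito, , drop = FALSE]) / colSums(counts) * 100
percent_ribo <- colSums(counts[is_ribo, , drop = FALSE]) / colSums(counts) * 100

# Cells that exceed the threshold are discarded (defaults: 20 and 50).
mito_qc <- ifelse(percent_mito > 20, "Fail", "Pass")
ribo_qc <- ifelse(percent_ribo > 50, "Fail", "Pass")

data.frame(percent_mito, percent_ribo, mito_qc, ribo_qc)
```

In this sketch the trailing `$` in the ribosomal patterns keeps RPS6KA1 from being counted as a ribosomal gene, while RPS6 and RPL13A both match.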
Examples
data(pancreas_sub)
pancreas_sub <- standard_scop(pancreas_sub)
#> ℹ [2026-05-14 06:56:19] Start standard processing workflow...
#> ℹ [2026-05-14 06:56:20] Checking a list of <Seurat>...
#> ! [2026-05-14 06:56:20] Data 1/1 of the `srt_list` is "unknown"
#> ℹ [2026-05-14 06:56:20] Perform `NormalizeData()` with `normalization.method = 'LogNormalize'` on 1/1 of `srt_list`...
#> ℹ [2026-05-14 06:56:22] Perform `Seurat::FindVariableFeatures()` on 1/1 of `srt_list`...
#> ℹ [2026-05-14 06:56:22] Use the separate HVF from `srt_list`
#> ℹ [2026-05-14 06:56:22] Number of available HVF: 2000
#> ℹ [2026-05-14 06:56:22] Finished check
#> ℹ [2026-05-14 06:56:22] Perform `Seurat::ScaleData()`
#> ℹ [2026-05-14 06:56:23] Perform pca linear dimension reduction
#> ℹ [2026-05-14 06:56:23] Use stored estimated dimensions 1:20 for Standardpca
#> ℹ [2026-05-14 06:56:24] Perform `Seurat::FindClusters()` with `cluster_algorithm = 'louvain'` and `cluster_resolution = 0.6`
#> ℹ [2026-05-14 06:56:24] Reorder clusters...
#> ℹ [2026-05-14 06:56:24] Skip `log1p()` because `layer = data` is not "counts"
#> ℹ [2026-05-14 06:56:24] Perform umap nonlinear dimension reduction
#> ℹ [2026-05-14 06:56:24] Perform umap nonlinear dimension reduction using Standardpca (1:20)
#> ℹ [2026-05-14 06:56:29] Perform umap nonlinear dimension reduction using Standardpca (1:20)
#> ✔ [2026-05-14 06:56:35] Standard processing workflow completed
pancreas_sub <- RunCellQC(
  pancreas_sub,
  db_method = "scds_cxds"
)
#> ◌ [2026-05-14 06:56:35] Running cell-level quality control
#> ℹ [2026-05-14 06:56:35] Data type is raw counts
#> ℹ [2026-05-14 06:56:35] Running scds with method "cxds"
#> Registered S3 method overwritten by 'pROC':
#>   method   from
#>   plot.roc spatstat.explore
#> ! [2026-05-14 06:57:26] Skip "atac" QC because `assay = 'RNA'` is not a <ChromatinAssay>
#> ℹ [2026-05-14 06:57:26] Running decontX
#> Warning: 'librarySizeFactors' is deprecated.
#> Use 'scrapper::centerSizeFactors' instead.
#> See help("Deprecated")
#> Warning: 'normalizeCounts' is deprecated.
#> Use 'scrapper::normalizeCounts' instead.
#> See help("Deprecated")
#> ℹ [2026-05-14 07:00:41] decontX contamination (median/mean/max): 0.0136 / 0.1628 / 0.7465
#> ℹ [2026-05-14 07:00:41] decontX assay stored as decontXcounts
#> ✔ [2026-05-14 07:00:41] decontX decontamination completed
#> ✔ [2026-05-14 07:00:41] ● Total cells: 1000
#> ✔ ◉ 967 cells remained
#> ✔ ◯ 33 cells filtered out:
#> ✔ ◯ 10 potential doublets
#> ✔ ◯ 0 ATAC QC failed cells
#> ✔ ◯ 0 high-contamination cells
#> ✔ ◯ 23 outlier cells
#> ✔ ◯ 0 low-UMI cells
#> ✔ ◯ 0 low-gene cells
#> ✔ ◯ 0 high-mito cells
#> ✔ ◯ 0 high-ribo cells
#> ✔ ◯ 0 ribo_mito_ratio outlier cells
#> ✔ ◯ 0 species-contaminated cells
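The doublet count in the summary above can be related to the documented formula for db_rate, ncol(srt) / 1000 * db_coefficient. The helper name expected_doublet_rate() below is hypothetical and only restates that arithmetic. Whether a given db_method uses the rate as a hard cutoff is an assumption not confirmed on this page.

```r
# Documented relationship: db_rate = ncol(srt) / 1000 * db_coefficient.
# expected_doublet_rate() is a hypothetical helper, not part of the package.
expected_doublet_rate <- function(n_cells, db_coefficient = 0.01) {
  n_cells / 1000 * db_coefficient
}

n_cells <- 1000  # "Total cells: 1000" in the summary above
rate <- expected_doublet_rate(n_cells)

rate            # expected doublet rate for 1000 cells
rate * n_cells  # expected number of doublets at that rate

# The rate grows linearly with the number of cells.
expected_doublet_rate(c(500, 5000, 10000))
```

For 1000 cells the default coefficient gives a rate of 0.01, or about 10 expected doublets, which is in line with the 10 potential doublets reported in the summary.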
CellStatPlot(
  pancreas_sub,
  stat.by = c(
    "db_qc", "outlier_qc"
  ),
  plot_type = "upset",
  stat_level = "Fail"
)
#> ! [2026-05-14 07:00:42] `stat_type` is forcibly set to "count" when plot "sankey", "chord", "venn", and "upset"
#> `geom_line()`: Each group consists of only one observation.
#> ℹ Do you need to adjust the group aesthetic?
#> `geom_line()`: Each group consists of only one observation.
#> ℹ Do you need to adjust the group aesthetic?
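The outlier_qc labels plotted above are driven by the outlier_threshold strings and outlier_n. A rough base R sketch of how such "metric:direction:value" rules could be parsed and applied follows. Treating the value as a number of median absolute deviations (MADs) from the median is an assumption, since this page does not define it, and flag_outlier() and the toy log10_nCount vector are hypothetical.

```r
# Default rules from the Usage section, in "metric:direction:value" form.
outlier_threshold <- c(
  "log10_nCount:lower:2.5", "log10_nCount:higher:5",
  "log10_nFeature:lower:2.5", "log10_nFeature:higher:5",
  "featurecount_dist:lower:2.5"
)

# Split each rule into its three fields.
parts <- strsplit(outlier_threshold, ":", fixed = TRUE)
rules <- data.frame(
  metric = vapply(parts, `[`, character(1), 1),
  direction = vapply(parts, `[`, character(1), 2),
  cutoff = as.numeric(vapply(parts, `[`, character(1), 3)),
  stringsAsFactors = FALSE
)
rules

# Assumption: the cutoff is a number of MADs away from the median.
flag_outlier <- function(x, direction, nmads) {
  center <- median(x)
  spread <- mad(x)
  if (direction == "lower") {
    x < center - nmads * spread
  } else {
    x > center + nmads * spread
  }
}

# Toy log10 UMI counts (made up); the last cell is clearly low.
log10_nCount <- c(3.9, 4.0, 4.1, 4.0, 3.95, 4.05, 2.0)

# Count how many log10_nCount rules each cell hits.
count_rules <- rules[rules$metric == "log10_nCount", ]
hits <- rowSums(sapply(seq_len(nrow(count_rules)), function(i) {
  flag_outlier(log10_nCount, count_rules$direction[i], count_rules$cutoff[i])
}))

# A cell is an outlier when it hits at least outlier_n rules (default 1).
outlier_n <- 1
is_outlier <- hits >= outlier_n
which(is_outlier)
```

Raising outlier_n in this sketch would require a cell to violate several rules at once before it is flagged, which makes the filter more conservative.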