
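This function runs a standard single-cell analysis workflow on a Seurat object: normalization, highly variable feature (HVF) selection, scaling, linear dimensionality reduction, clustering, and nonlinear dimensionality reduction.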

Usage

standard_scop(
  srt,
  prefix = "Standard",
  assay = NULL,
  do_normalization = NULL,
  normalization_method = "LogNormalize",
  do_HVF_finding = TRUE,
  HVF_method = "vst",
  nHVF = 2000,
  HVF = NULL,
  do_scaling = TRUE,
  vars_to_regress = NULL,
  regression_model = "linear",
  linear_reduction = "pca",
  linear_reduction_dims = 50,
  linear_reduction_dims_use = NULL,
  linear_reduction_params = list(),
  force_linear_reduction = FALSE,
  nonlinear_reduction = "umap",
  nonlinear_reduction_dims = c(2, 3),
  nonlinear_reduction_params = list(),
  force_nonlinear_reduction = TRUE,
  neighbor_metric = "euclidean",
  neighbor_k = 20L,
  cluster_algorithm = "louvain",
  cluster_resolution = 0.6,
  verbose = TRUE,
  seed = 11
)

Arguments

srt

A Seurat object.

prefix

A prefix to add to the names of intermediate objects created by the function. Default is "Standard".

assay

Which assay to use. If NULL, the default assay of the Seurat object will be used. When the object also contains a ChromatinAssay, the default assay and the additional ChromatinAssay will be preprocessed sequentially.

do_normalization

Whether to perform normalization. If NULL, normalization will be performed if the specified assay does not have scaled data.

normalization_method

The method to use for normalization. Options are "LogNormalize", "SCT", or "TFIDF". Default is "LogNormalize".

do_HVF_finding

Whether to find highly variable features (HVF). If TRUE, the function always recomputes the HVF using the specified HVF method.

HVF_method

The method to use for finding highly variable features. Options are "vst", "mvp", or "disp". Default is "vst".

nHVF

The number of highly variable features to select. If NULL, all highly variable features will be used. Default is 2000.

HVF

A vector of feature names to use as highly variable features. If NULL, the function will use the highly variable features identified by the HVF method.

do_scaling

Whether to perform scaling. If TRUE, the function always rescales the data using the Seurat::ScaleData function.

vars_to_regress

A vector of feature names to use as regressors in the scaling step. If NULL, no regressors will be used.

regression_model

The regression model to use for scaling. Options are "linear", "poisson", or "negativebinomial". Default is "linear".

linear_reduction

The linear dimensionality reduction method to use. Options are "pca", "svd", "ica", "nmf", "mds", or "glmpca". Default is "pca".

linear_reduction_dims

The number of dimensions to keep after linear dimensionality reduction. Default is 50.

linear_reduction_dims_use

The dimensions to use for downstream analysis. If NULL, the estimated dimensions stored in the linear reduction will be used when available; otherwise, up to the first 50 dimensions will be used as a fallback.

linear_reduction_params

A list of parameters to pass to the linear dimensionality reduction method.

force_linear_reduction

Whether to force linear dimensionality reduction even if the specified reduction is already present in the Seurat object.

nonlinear_reduction

The nonlinear dimensionality reduction method to use. Options are "umap", "umap-naive", "tsne", "dm", "phate", "pacmap", "trimap", "largevis", or "fr". Default is "umap".

nonlinear_reduction_dims

The number of dimensions to keep after nonlinear dimensionality reduction. If a vector is provided, one embedding is computed for each value; for example, c(2, 3) produces both a 2D and a 3D embedding. Default is c(2, 3).

nonlinear_reduction_params

A list of parameters to pass to the nonlinear dimensionality reduction method.

force_nonlinear_reduction

Whether to force nonlinear dimensionality reduction even if the specified reduction is already present in the Seurat object. Default is TRUE.

neighbor_metric

The distance metric to use for finding neighbors. Options are "euclidean", "cosine", "manhattan", or "hamming". Default is "euclidean".

neighbor_k

The number of nearest neighbors to find for each cell. Default is 20.

cluster_algorithm

The clustering algorithm to use. Options are "louvain", "slm", or "leiden". Default is "louvain".

cluster_resolution

The resolution parameter to use for clustering. Larger values result in more clusters. Default is 0.6.

verbose

Whether to print progress messages. Default is TRUE.

seed

Random seed for reproducibility. Default is 11.
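
As a rough illustration of what the "LogNormalize" option of normalization_method computes, the sketch below reimplements the transformation in base R on a made-up count matrix. It assumes the usual Seurat definition (per-cell counts divided by the cell total, multiplied by a scale factor of 10000, then log1p-transformed). The function name log_normalize and the toy data are hypothetical and not part of the package.

```r
# Toy count matrix: rows are genes, columns are cells (values are made up).
counts <- matrix(
  c(0, 2, 8,
    5, 0, 5,
    1, 1, 0),
  nrow = 3,
  dimnames = list(paste0("gene", 1:3), paste0("cell", 1:3))
)

# Hypothetical helper: divide each cell (column) by its total count,
# rescale by scale_factor, then apply log1p.
log_normalize <- function(counts, scale_factor = 1e4) {
  log1p(sweep(counts, 2, colSums(counts), "/") * scale_factor)
}

norm <- log_normalize(counts)

# Undoing the log1p shows that every cell now sums to the scale factor.
colSums(expm1(norm))
```

The point of the rescaling is that cells with very different sequencing depth (here, totals of 10 and 2) become comparable before the log transform.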

Value

A Seurat object.
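
The following sketch shows one way to spell out several arguments explicitly and then inspect the returned object. The argument names come from the Usage signature. The values are illustrative only. It assumes the package is installed under the name scop, that pancreas_sub can be loaded from it with data(), and that SeuratObject::Reductions() is available to list the stored reductions.

```r
# Arguments spelled out explicitly; every name comes from the Usage signature,
# the values are illustrative choices rather than recommendations.
workflow_args <- list(
  prefix = "Standard",
  normalization_method = "LogNormalize",
  nHVF = 2000,
  linear_reduction = "pca",
  linear_reduction_dims_use = 1:20,
  nonlinear_reduction = "umap",
  nonlinear_reduction_dims = 2,
  cluster_algorithm = "louvain",
  cluster_resolution = 1,
  seed = 11
)

# Run only when scop is installed (assumed package name).
if (requireNamespace("scop", quietly = TRUE)) {
  data("pancreas_sub", package = "scop")
  srt <- do.call(
    scop::standard_scop,
    c(list(srt = pancreas_sub), workflow_args)
  )
  # List the names of the reductions stored on the returned Seurat object.
  print(SeuratObject::Reductions(srt))
}
```

Keeping the settings in a named list makes it easy to reuse the same configuration across several objects or to record it alongside the results.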

Examples

library(Matrix)
data(pancreas_sub)
pancreas_sub <- standard_scop(pancreas_sub)
#>  [2026-04-26 02:39:23] Start standard processing workflow...
#>  [2026-04-26 02:39:24] Checking a list of <Seurat>...
#> ! [2026-04-26 02:39:24] Data 1/1 of the `srt_list` is "unknown"
#>  [2026-04-26 02:39:24] Perform `NormalizeData()` with `normalization.method = 'LogNormalize'` on 1/1 of `srt_list`...
#>  [2026-04-26 02:39:26] Perform `Seurat::FindVariableFeatures()` on 1/1 of `srt_list`...
#>  [2026-04-26 02:39:27] Use the separate HVF from `srt_list`
#>  [2026-04-26 02:39:27] Number of available HVF: 2000
#>  [2026-04-26 02:39:27] Finished check
#>  [2026-04-26 02:39:27] Perform `Seurat::ScaleData()`
#>  [2026-04-26 02:39:28] Perform pca linear dimension reduction
#>  [2026-04-26 02:39:28] Use stored estimated dimensions 1:20 for Standardpca
#>  [2026-04-26 02:39:29] Perform `Seurat::FindClusters()` with `cluster_algorithm = 'louvain'` and `cluster_resolution = 0.6`
#>  [2026-04-26 02:39:29] Reorder clusters...
#>  [2026-04-26 02:39:29] Skip `log1p()` because `layer = data` is not "counts"
#>  [2026-04-26 02:39:29] Perform umap nonlinear dimension reduction
#>  [2026-04-26 02:39:29] Perform umap nonlinear dimension reduction using Standardpca (1:20)
#>  [2026-04-26 02:39:35] Perform umap nonlinear dimension reduction using Standardpca (1:20)
#>  [2026-04-26 02:39:41] Standard processing workflow completed
CellDimPlot(
  pancreas_sub,
  group.by = "SubCellType"
)


# Use a combination of different linear
# or nonlinear dimension reduction methods
linear_reductions <- c(
  "pca", "nmf", "mds"
)
pancreas_sub <- standard_scop(
  pancreas_sub,
  linear_reduction = linear_reductions,
  nonlinear_reduction = "umap"
)
#>  [2026-04-26 02:39:41] Start standard processing workflow...
#>  [2026-04-26 02:39:41] Checking a list of <Seurat>...
#>  [2026-04-26 02:39:42] Data 1/1 of the `srt_list` has been log-normalized
#>  [2026-04-26 02:39:42] Perform `Seurat::FindVariableFeatures()` on 1/1 of `srt_list`...
#>  [2026-04-26 02:39:42] Use the separate HVF from `srt_list`
#>  [2026-04-26 02:39:42] Number of available HVF: 2000
#>  [2026-04-26 02:39:42] Finished check
#>  [2026-04-26 02:39:42] Perform `Seurat::ScaleData()`
#>  [2026-04-26 02:39:43] Perform pca linear dimension reduction
#>  [2026-04-26 02:39:43] Use stored estimated dimensions 1:20 for Standardpca
#>  [2026-04-26 02:39:44] Perform `Seurat::FindClusters()` with `cluster_algorithm = 'louvain'` and `cluster_resolution = 0.6`
#>  [2026-04-26 02:39:44] Reorder clusters...
#>  [2026-04-26 02:39:44] Skip `log1p()` because `layer = data` is not "counts"
#>  [2026-04-26 02:39:44] Perform umap nonlinear dimension reduction
#>  [2026-04-26 02:39:44] Perform umap nonlinear dimension reduction using Standardpca (1:20)
#> Warning: Key ‘StandardpcaUMAP2D_’ taken, using ‘standardpcaumap2d_’ instead
#>  [2026-04-26 02:39:50] Perform umap nonlinear dimension reduction using Standardpca (1:20)
#> Warning: Key ‘StandardpcaUMAP3D_’ taken, using ‘standardpcaumap3d_’ instead
#>  [2026-04-26 02:39:56] Perform nmf linear dimension reduction
#>  [2026-04-26 02:39:56] Running NMF...
#>  StandardBE_ 1 
#>  Positive:  Ccnd1, Spp1, Mdk, Rps2, Ldha, Pebp1, Cd24a, Dlk1, Krt8, Mgst1 
#>  	   Clu, Gapdh, Eno1, Prdx1, Cldn10, Mif, Cldn7, Npm1, Dbi, Vim 
#>  	   Sox9, Rpl12, Aldh1b1, Rplp1, Wfdc2, Krt18, Tkt, Aldoa, Hspe1, Ptma 
#>  Negative:  Tmem108, Poc1a, Epn3, Wipi1, Tmcc3, Nhsl1, Fgf12, Plekho1, Tecpr2, Zbtb4 
#>  	   Gm10941, Trf, Man1c1, Hmgcs1, Nipal1, Jam3, Pgap1, Alpl, Kcnip3, Tnr 
#>  	   Gm15915, Rbp2, Cbfa2t2, Sh2d4a, Bbc3, Megf6, Naaladl2, Fam46d, Hist2h2ac, Tox2 
#>  StandardBE_ 2 
#>  Positive:  Spp1, Gsta3, Sparc, Vim, Atp1b1, Mt1, Dbi, Anxa2, Rps2, Id2 
#>  	   Rpl22l1, Rplp1, Mgst1, Clu, Sox9, Cldn6, Mdk, Pdzk1ip1, Bicc1, 1700011H14Rik 
#>  	   Rps12, S100a10, Cldn3, Rpl36a, Ppp1r1b, Adamts1, Serpinh1, Mt2, Ifitm2, Rpl39 
#>  Negative:  Rpa3, Aacs, Tmem108, Poc1a, Epn3, Wipi1, B830012L14Rik, Tmcc3, Wsb1, Plekho1 
#>  	   Ppp2r2b, Tecpr2, Zbtb4, Haus8, Trf, Gm5420, Man1c1, Hmgcs1, Nipal1, Jam3 
#>  	   Tcerg1, Pgap1, Snrpa1, Alpl, Larp1b, Kcnip3, Tnr, Lsm12, Ptbp3, Gm15915 
#>  StandardBE_ 3 
#>  Positive:  Cck, Mdk, Gadd45a, Neurog3, Selm, Sox4, Btbd17, Tmsb4x, Btg2, Cldn6 
#>  	   Cotl1, Ptma, Jun, Ppp1r14a, Rps2, Ifitm2, Neurod2, Igfbpl1, Gnas, Krt7 
#>  	   Nkx6-1, Aplp1, Ppp3ca, Lrpap1, Rplp1, Hn1, Rps12, Mfng, BC023829, Smarcd2 
#>  Negative:  Elovl6, Tmem108, Poc1a, Epn3, Nop56, Wipi1, B830012L14Rik, Rrp15, Rfc1, Fgf12 
#>  	   Slc20a1, Ppp2r2b, Lama1, Tecpr2, Zbtb4, Eif1ax, Fam162a, P4ha3, Gm10941, Tenm4 
#>  	   Pde4b, Gm5420, Man1c1, Hmgcs1, Pgap1, Mgst2, Larp1b, Kcnip3, Tnr, Lsm12 
#>  StandardBE_ 4 
#>  Positive:  Spp1, Cyr61, Krt18, Tpm1, Krt8, Myl12a, Vim, Jun, Anxa5, Tnfrsf12a 
#>  	   Csrp1, Sparc, Cldn7, Nudt19, Anxa2, Clu, Myl9, Atp1b1, Cldn3, Tagln2 
#>  	   S100a10, 1700011H14Rik, Cd24a, Rps2, Dbi, Id2, Lurap1l, Rplp1, Myl12b, Klf6 
#>  Negative:  Rpa3, Elovl6, Aacs, Tmem108, Poc1a, Tmcc3, Rfc1, Plekho1, Slc20a1, Ppp2r2b 
#>  	   Lama1, Tecpr2, Gm10941, Tenm4, Pde4b, Man1c1, Nipal1, Jam3, Pgap1, Alpl 
#>  	   Mgst2, Kcnip3, Tnr, Ptbp3, Gm15915, Cntln, Ocln, Fras1, Rbp2, Cbfa2t2 
#>  StandardBE_ 5 
#>  Positive:  2810417H13Rik, Rrm2, Hmgb2, Dut, Pcna, Lig1, H2afz, Tipin, Tuba1b, Tk1 
#>  	   Mcm5, Dek, Tyms, Gmnn, Ran, Tubb5, Rfc2, Srsf2, Ranbp1, Orc6 
#>  	   Mcm3, Uhrf1, Gins2, Dnajc9, Mcm6, Siva1, Rfc3, Mcm7, Rpa2, Ptma 
#>  Negative:  1110002L01Rik, Aacs, Wipi1, B830012L14Rik, Tmcc3, Trib1, Fgf12, Plekho1, Ppp2r2b, Lama1 
#>  	   Tenm4, Trf, Gm5420, Man1c1, Jam3, Mgst2, Kcnip3, Tnr, Gm15915, Cbfa2t2 
#>  	   Sh2d4a, Bbc3, Fkbp9, Ano6, Prkcb, Megf6, Fam46d, Slc52a3, Ankrd2, Tox2 
#>  [2026-04-26 02:40:00] NMF compute completed
#>  [2026-04-26 02:40:00] Use stored estimated dimensions 1:50 for Standardnmf
#>  [2026-04-26 02:40:00] Perform `Seurat::FindClusters()` with `cluster_algorithm = 'louvain'` and `cluster_resolution = 0.6`
#>  [2026-04-26 02:40:00] Reorder clusters...
#>  [2026-04-26 02:40:00] Skip `log1p()` because `layer = data` is not "counts"
#>  [2026-04-26 02:40:00] Perform umap nonlinear dimension reduction
#>  [2026-04-26 02:40:00] Perform umap nonlinear dimension reduction using Standardnmf (1:50)
#>  [2026-04-26 02:40:06] Perform umap nonlinear dimension reduction using Standardnmf (1:50)
#>  [2026-04-26 02:40:12] Perform mds linear dimension reduction
#>  [2026-04-26 02:40:13] Use stored estimated dimensions 1:20 for Standardmds
#>  [2026-04-26 02:40:14] Perform `Seurat::FindClusters()` with `cluster_algorithm = 'louvain'` and `cluster_resolution = 0.6`
#>  [2026-04-26 02:40:14] Reorder clusters...
#>  [2026-04-26 02:40:14] Skip `log1p()` because `layer = data` is not "counts"
#>  [2026-04-26 02:40:14] Perform umap nonlinear dimension reduction
#>  [2026-04-26 02:40:14] Perform umap nonlinear dimension reduction using Standardmds (1:20)
#>  [2026-04-26 02:40:20] Perform umap nonlinear dimension reduction using Standardmds (1:20)
#>  [2026-04-26 02:40:25] Standard processing workflow completed
plist1 <- lapply(
  linear_reductions, function(lr) {
    CellDimPlot(
      pancreas_sub,
      group.by = "SubCellType",
      reduction = paste0(
        "Standard", lr, "UMAP2D"
      ),
      xlab = "", ylab = "",
      title = paste0(lr, "_umap"),
      legend.position = "none",
      theme_use = "theme_blank"
    )
  }
)
patchwork::wrap_plots(plist1)


nonlinear_reductions <- c(
  "umap", "tsne", "fr"
)
pancreas_sub <- standard_scop(
  pancreas_sub,
  linear_reduction = "pca",
  nonlinear_reduction = nonlinear_reductions
)
#>  [2026-04-26 02:40:26] Start standard processing workflow...
#>  [2026-04-26 02:40:26] Checking a list of <Seurat>...
#>  [2026-04-26 02:40:26] Data 1/1 of the `srt_list` has been log-normalized
#>  [2026-04-26 02:40:26] Perform `Seurat::FindVariableFeatures()` on 1/1 of `srt_list`...
#>  [2026-04-26 02:40:27] Use the separate HVF from `srt_list`
#>  [2026-04-26 02:40:27] Number of available HVF: 2000
#>  [2026-04-26 02:40:27] Finished check
#>  [2026-04-26 02:40:27] Perform `Seurat::ScaleData()`
#>  [2026-04-26 02:40:28] Perform pca linear dimension reduction
#>  [2026-04-26 02:40:28] Use stored estimated dimensions 1:20 for Standardpca
#>  [2026-04-26 02:40:29] Perform `Seurat::FindClusters()` with `cluster_algorithm = 'louvain'` and `cluster_resolution = 0.6`
#>  [2026-04-26 02:40:29] Reorder clusters...
#>  [2026-04-26 02:40:29] Skip `log1p()` because `layer = data` is not "counts"
#>  [2026-04-26 02:40:29] Perform umap nonlinear dimension reduction
#>  [2026-04-26 02:40:29] Perform umap nonlinear dimension reduction using Standardpca (1:20)
#>  [2026-04-26 02:40:35] Perform umap nonlinear dimension reduction using Standardpca (1:20)
#>  [2026-04-26 02:40:40] Perform tsne nonlinear dimension reduction
#>  [2026-04-26 02:40:41] Perform tsne nonlinear dimension reduction using Standardpca (1:20)
#>  [2026-04-26 02:40:42] Perform tsne nonlinear dimension reduction using Standardpca (1:20)
#>  [2026-04-26 02:40:45] Perform fr nonlinear dimension reduction
#>  [2026-04-26 02:40:45] Perform fr nonlinear dimension reduction using Standardpca_SNN
#>  [2026-04-26 02:40:47] Perform fr nonlinear dimension reduction using Standardpca_SNN
#>  [2026-04-26 02:40:48] Standard processing workflow completed
plist2 <- lapply(
  nonlinear_reductions, function(nr) {
    CellDimPlot(
      pancreas_sub,
      group.by = "SubCellType",
      reduction = paste0(
        "Standardpca", nr, "2D"
      ),
      xlab = "", ylab = "",
      title = paste0("pca_", nr),
      legend.position = "none",
      theme_use = "theme_blank"
    )
  }
)
patchwork::wrap_plots(plist2)
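
The first plotting loop above selects each embedding by pasting the prefix, the linear reduction name, and "UMAP2D" together. A small helper can make that naming pattern explicit. The function umap_reduction_name below is hypothetical; it only mirrors the paste0() call in the example and the `StandardpcaUMAP3D_` key seen in the log output, and is not part of the package.

```r
# Hypothetical helper: mirrors paste0("Standard", lr, "UMAP2D") from the
# example above; n_dims selects the 2D or 3D embedding.
umap_reduction_name <- function(linear_reduction,
                                prefix = "Standard",
                                n_dims = 2) {
  paste0(prefix, linear_reduction, "UMAP", n_dims, "D")
}

# Names for the three linear reductions used in the example.
umap_reduction_name(c("pca", "nmf", "mds"))

# Name of the 3D embedding built on the PCA reduction.
umap_reduction_name("pca", n_dims = 3)
```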