Skip to contents

Annotate single cells using scmap.

Usage

RunScmap(
  srt_query,
  srt_ref,
  ref_group = NULL,
  query_assay = "RNA",
  ref_assay = "RNA",
  method = "scmapCluster",
  nfeatures = 500,
  threshold = 0.5,
  k = 10
)

Arguments

srt_query

An object of class Seurat to be annotated with cell types.

srt_ref

An object of class Seurat storing the reference cells.

ref_group

A character vector specifying the column name in the `srt_ref` metadata that represents the cell grouping.

query_assay

A character vector specifying the assay to be used for the query data. Defaults to the default assay of the `srt_query` object.

ref_assay

A character vector specifying the assay to be used for the reference data. Defaults to the default assay of the `srt_ref` object.

method

The method to be used for scmap analysis. Can be any of "scmapCluster" or "scmapCell". The default value is "scmapCluster".

nfeatures

The number of top features to be selected. The default value is 500.

threshold

The threshold value on similarity to determine if a cell is assigned to a cluster. This should be a value between 0 and 1. The default value is 0.5.

k

Number of clusters per group for k-means clustering when method is "scmapCell".

Examples

data(panc8_sub)

genenames <- make.unique(
  thisutils::capitalize(
    rownames(panc8_sub),
    force_tolower = TRUE
  )
)
names(genenames) <- rownames(panc8_sub)
panc8_sub <- RenameFeatures(
  panc8_sub,
  newnames = genenames
)
#>  [2025-09-20 13:48:12] Rename features for the assay: RNA
panc8_sub <- CheckDataMerge(
  panc8_sub,
  batch = "tech"
)[["srt_merge"]]
#>  [2025-09-20 13:48:12] Spliting `srt_merge` into `srt_list` by column "tech"...
#>  [2025-09-20 13:48:12] Checking a list of <Seurat> object...
#> ! [2025-09-20 13:48:12] Data 1/5 of the `srt_list` is "unknown"
#>  [2025-09-20 13:48:12] Perform `NormalizeData()` with `normalization.method = 'LogNormalize'` on the data 1/5 of the `srt_list`...
#>  [2025-09-20 13:48:14] Perform `Seurat::FindVariableFeatures()` on the data 1/5 of the `srt_list`...
#> ! [2025-09-20 13:48:14] Data 2/5 of the `srt_list` is "unknown"
#>  [2025-09-20 13:48:14] Perform `NormalizeData()` with `normalization.method = 'LogNormalize'` on the data 2/5 of the `srt_list`...
#>  [2025-09-20 13:48:16] Perform `Seurat::FindVariableFeatures()` on the data 2/5 of the `srt_list`...
#> ! [2025-09-20 13:48:16] Data 3/5 of the `srt_list` is "unknown"
#>  [2025-09-20 13:48:16] Perform `NormalizeData()` with `normalization.method = 'LogNormalize'` on the data 3/5 of the `srt_list`...
#>  [2025-09-20 13:48:18] Perform `Seurat::FindVariableFeatures()` on the data 3/5 of the `srt_list`...
#> ! [2025-09-20 13:48:18] Data 4/5 of the `srt_list` is "unknown"
#>  [2025-09-20 13:48:18] Perform `NormalizeData()` with `normalization.method = 'LogNormalize'` on the data 4/5 of the `srt_list`...
#>  [2025-09-20 13:48:20] Perform `Seurat::FindVariableFeatures()` on the data 4/5 of the `srt_list`...
#> ! [2025-09-20 13:48:20] Data 5/5 of the `srt_list` is "unknown"
#>  [2025-09-20 13:48:21] Perform `NormalizeData()` with `normalization.method = 'LogNormalize'` on the data 5/5 of the `srt_list`...
#>  [2025-09-20 13:48:22] Perform `Seurat::FindVariableFeatures()` on the data 5/5 of the `srt_list`...
#>  [2025-09-20 13:48:23] Use the separate HVF from srt_list
#>  [2025-09-20 13:48:23] Number of available HVF: 2000
#>  [2025-09-20 13:48:23] Finished check

data(pancreas_sub)
pancreas_sub <- standard_scop(pancreas_sub)
#>  [2025-09-20 13:48:26] Start standard scop workflow...
#>  [2025-09-20 13:48:26] Checking a list of <Seurat> object...
#> ! [2025-09-20 13:48:26] Data 1/1 of the `srt_list` is "unknown"
#>  [2025-09-20 13:48:26] Perform `NormalizeData()` with `normalization.method = 'LogNormalize'` on the data 1/1 of the `srt_list`...
#>  [2025-09-20 13:48:29] Perform `Seurat::FindVariableFeatures()` on the data 1/1 of the `srt_list`...
#>  [2025-09-20 13:48:29] Use the separate HVF from srt_list
#>  [2025-09-20 13:48:29] Number of available HVF: 2000
#>  [2025-09-20 13:48:29] Finished check
#>  [2025-09-20 13:48:30] Perform `Seurat::ScaleData()`
#> Warning: Different features in new layer data than already exists for scale.data
#>  [2025-09-20 13:48:30] Perform pca linear dimension reduction
#> StandardPC_ 1 
#> Positive:  Aplp1, Cpe, Gnas, Fam183b, Map1b, Hmgn3, Pcsk1n, Chga, Tuba1a, Bex2 
#> 	   Syt13, Isl1, 1700086L19Rik, Pax6, Chgb, Scgn, Rbp4, Scg3, Gch1, Camk2n1 
#> 	   Cryba2, Pcsk2, Pyy, Tspan7, Mafb, Hist3h2ba, Dbpht2, Abcc8, Rap1b, Slc38a5 
#> Negative:  Spp1, Anxa2, Sparc, Dbi, 1700011H14Rik, Wfdc2, Gsta3, Adamts1, Clu, Mgst1 
#> 	   Bicc1, Ldha, Vim, Cldn3, Cyr61, Rps2, Mt1, Ptn, Phgdh, Nudt19 
#> 	   Smtnl2, Smco4, Habp2, Mt2, Col18a1, Rpl12, Galk1, Cldn10, Acot1, Ccnd1 
#> StandardPC_ 2 
#> Positive:  Rbp4, Tagln2, Tuba1b, Fkbp2, Pyy, Pcsk2, Iapp, Tmem27, Meis2, Tubb4b 
#> 	   Pcsk1n, Dbpht2, Rap1b, Dynll1, Tubb2a, Sdf2l1, Scgn, 1700086L19Rik, Scg2, Abcc8 
#> 	   Atp1b1, Hspa5, Fam183b, Papss2, Slc38a5, Scg3, Mageh1, Tspan7, Ppp1r1a, Ociad2 
#> Negative:  Neurog3, Btbd17, Gadd45a, Ppp1r14a, Neurod2, Sox4, Smarcd2, Mdk, Pax4, Btg2 
#> 	   Sult2b1, Hes6, Grasp, Igfbpl1, Gpx2, Cbfa2t3, Foxa3, Shf, Mfng, Tmsb4x 
#> 	   Amotl2, Gdpd1, Cdc14b, Epb42, Rcor2, Cotl1, Upk3bl, Rbfox3, Cldn6, Cer1 
#> StandardPC_ 3 
#> Positive:  Nusap1, Top2a, Birc5, Aurkb, Cdca8, Pbk, Mki67, Tpx2, Plk1, Ccnb1 
#> 	   2810417H13Rik, Incenp, Cenpf, Ccna2, Prc1, Racgap1, Cdk1, Aurka, Cdca3, Hmmr 
#> 	   Spc24, Kif23, Sgol1, Cenpe, Cdc20, Hist1h1b, Cdca2, Mxd3, Kif22, Ska1 
#> Negative:  Anxa5, Pdzk1ip1, Acot1, Tpm1, Anxa2, Dcdc2a, Capg, Sparc, Ttr, Pamr1 
#> 	   Clu, Cxcl12, Ndrg2, Hnf1aos1, Gas6, Gsta3, Krt18, Ces1d, Atp1b1, Muc1 
#> 	   Hhex, Acadm, Spp1, Enpp2, Bcl2l14, Sat1, Smtnl2, 1700011H14Rik, Tgm2, Fam159a 
#> StandardPC_ 4 
#> Positive:  Glud1, Tm4sf4, Akr1c19, Cldn4, Runx1t1, Fev, Pou3f4, Gm43861, Pgrmc1, Arx 
#> 	   Cd200, Lrpprc, Hmgn3, Ppp1r14c, Pam, Etv1, Tsc22d1, Slc25a5, Akap17b, Pgf 
#> 	   Fam43a, Emb, Jun, Krt8, Dnajc12, Mid1ip1, Ids, Rgs17, Uchl1, Alcam 
#> Negative:  Ins2, Ins1, Ppp1r1a, Nnat, Calr, Sytl4, Sdf2l1, Iapp, Pdia6, Mapt 
#> 	   G6pc2, C2cd4b, Npy, Gng12, P2ry1, Ero1lb, Adra2a, Papss2, Arhgap36, Fam151a 
#> 	   Dlk1, Creld2, Gip, Tmem215, Gm27033, Cntfr, Prss53, C2cd4a, Lyve1, Ociad2 
#> StandardPC_ 5 
#> Positive:  Pdx1, Nkx6-1, Npepl1, Cldn4, Cryba2, Fev, Jun, Chgb, Gng12, Adra2a 
#> 	   Mnx1, Sytl4, Pdk3, Gm27033, Nnat, Chga, Ins2, 1110012L19Rik, Enho, Krt7 
#> 	   Mlxipl, Tmsb10, Flrt1, Pax4, Tubb3, Prrg2, Gars, Frzb, BC023829, Gm2694 
#> Negative:  Irx2, Irx1, Gcg, Ctxn2, Tmem27, Ctsz, Tmsb15l, Nap1l5, Pou6f2, Gria2 
#> 	   Ghrl, Peg10, Smarca1, Arx, Lrpap1, Rgs4, Ttr, Gast, Tmsb15b2, Serpina1b 
#> 	   Slc16a10, Wnk3, Ly6e, Auts2, Sct, Arg1, Dusp10, Sphkap, Dock11, Edn3 
#>  [2025-09-20 13:48:31] Perform `Seurat::FindClusters()` with louvain and `cluster_resolution` = 0.6
#>  [2025-09-20 13:48:31] Reorder clusters...
#> ! [2025-09-20 13:48:31] Using `Seurat::AggregateExpression()` to calculate pseudo-bulk data for <Assay5>
#>  [2025-09-20 13:48:31] Perform umap nonlinear dimension reduction
#>  [2025-09-20 13:48:31] Non-linear dimensionality reduction (umap) using (Standardpca) dims (1-50) as input
#>  [2025-09-20 13:48:31] UMAP will return its model
#>  [2025-09-20 13:48:35] Non-linear dimensionality reduction (umap) using (Standardpca) dims (1-50) as input
#>  [2025-09-20 13:48:35] UMAP will return its model
#>  [2025-09-20 13:48:40] Run scop standard workflow done
pancreas_sub <- RunScmap(
  srt_query = pancreas_sub,
  srt_ref = panc8_sub,
  ref_group = "celltype",
  method = "scmapCluster"
)
#>  [2025-09-20 13:48:40] Installing: scmap...
#>  
#> → Will install 3 packages.
#> → All 3 packages (0 B) are cached.
#> + googleVis      0.7.3   
#> + randomForest   4.7-1.2 
#> + scmap          1.30.0  [bld][cmp]
#>  All system requirements are already installed.
#>   
#>  No downloads are needed, 3 pkgs are cached
#>  Got randomForest 4.7-1.2 (x86_64-pc-linux-gnu-ubuntu-24.04) (218.82 kB)
#>  Got googleVis 0.7.3 (x86_64-pc-linux-gnu-ubuntu-24.04) (498.39 kB)
#>  Got scmap 1.30.0 (source) (2.34 MB)
#>  Installing system requirements
#>  Executing `sudo sh -c apt-get -y update`
#> Get:1 file:/etc/apt/apt-mirrors.txt Mirrorlist [144 B]
#> Hit:2 http://azure.archive.ubuntu.com/ubuntu noble InRelease
#> Hit:3 http://azure.archive.ubuntu.com/ubuntu noble-updates InRelease
#> Hit:4 http://azure.archive.ubuntu.com/ubuntu noble-backports InRelease
#> Hit:5 http://azure.archive.ubuntu.com/ubuntu noble-security InRelease
#> Hit:6 https://packages.microsoft.com/repos/azure-cli noble InRelease
#> Hit:7 https://packages.microsoft.com/ubuntu/24.04/prod noble InRelease
#> Reading package lists...
#>  Executing `sudo sh -c apt-get -y install libcurl4-openssl-dev libssl-dev libicu-dev`
#> Reading package lists...
#> Building dependency tree...
#> Reading state information...
#> libcurl4-openssl-dev is already the newest version (8.5.0-2ubuntu10.6).
#> libssl-dev is already the newest version (3.0.13-0ubuntu3.5).
#> libicu-dev is already the newest version (74.2-1ubuntu3.1).
#> 0 upgraded, 0 newly installed, 0 to remove and 41 not upgraded.
#>  Installed googleVis 0.7.3  (32ms)
#>  Installed randomForest 4.7-1.2  (47ms)
#>  Building scmap 1.30.0
#>  Built scmap 1.30.0 (21.6s)
#>  Installed scmap 1.30.0  (1.1s)
#>  1 pkg + 63 deps: kept 60, added 3, dld 3 (3.06 MB) [26.9s]
#>  [2025-09-20 13:49:07] scmap installed successfully
#> Error in CheckDataType(data = GetAssayData5(srt_query, layer = "data",     assay = query_assay, verbose = FALSE)): argument "object" is missing, with no default
CellDimPlot(
  pancreas_sub,
  group.by = "scmap_annotation"
)
#> Error in CellDimPlot(pancreas_sub, group.by = "scmap_annotation"): Scmap_annotation is not in the meta.data of srt object.

pancreas_sub <- RunScmap(
  srt_query = pancreas_sub,
  srt_ref = panc8_sub,
  ref_group = "celltype",
  method = "scmapCell"
)
#>  [2025-09-20 13:49:07] scmap installed successfully
#> Error in CheckDataType(data = GetAssayData5(srt_query, layer = "data",     assay = query_assay, verbose = FALSE)): argument "object" is missing, with no default
CellDimPlot(
  pancreas_sub,
  group.by = "scmap_annotation"
)
#> Error in CellDimPlot(pancreas_sub, group.by = "scmap_annotation"): Scmap_annotation is not in the meta.data of srt object.