The CSS integration function
Usage
CSS_integrate(
srt_merge = NULL,
batch = NULL,
append = TRUE,
srt_list = NULL,
assay = NULL,
do_normalization = NULL,
normalization_method = "LogNormalize",
do_HVF_finding = TRUE,
HVF_source = "separate",
HVF_method = "vst",
nHVF = 2000,
HVF_min_intersection = 1,
HVF = NULL,
do_scaling = TRUE,
vars_to_regress = NULL,
regression_model = "linear",
scale_within_batch = FALSE,
linear_reduction = "pca",
linear_reduction_dims = 50,
linear_reduction_dims_use = NULL,
linear_reduction_params = list(),
force_linear_reduction = FALSE,
CSS_dims_use = NULL,
nonlinear_reduction = "umap",
nonlinear_reduction_dims = c(2, 3),
nonlinear_reduction_params = list(),
force_nonlinear_reduction = TRUE,
neighbor_metric = "euclidean",
neighbor_k = 20L,
cluster_algorithm = "louvain",
cluster_resolution = 0.6,
CSS_params = list(),
verbose = TRUE,
seed = 11
)
Arguments
- srt_merge
A merged `Seurat` object that includes the batch information.
- batch
A character string specifying the batch variable name.
- append
Whether the integrated data will be appended to the original Seurat object (`srt_merge`). Default is `TRUE`.
- srt_list
A list of `Seurat` objects to be checked and preprocessed.
- assay
Which assay to use. If `NULL`, the default assay of the Seurat object will be used.
- do_normalization
Whether data normalization should be performed. Default is `NULL`, as shown in the usage above.
- normalization_method
The normalization method to be used. Possible values are `"LogNormalize"`, `"SCT"`, and `"TFIDF"`. Default is `"LogNormalize"`.
- do_HVF_finding
Whether to find highly variable features (HVF). If `TRUE`, the function forces HVF identification using the specified HVF method.
- HVF_source
The source of highly variable features. Possible values are `"global"` and `"separate"`. Default is `"separate"`.
- HVF_method
The method to use for finding highly variable features. Options are `"vst"`, `"mvp"`, or `"disp"`. Default is `"vst"`.
- nHVF
The number of highly variable features to select. If `NULL`, all highly variable features will be used. Default is `2000`.
- HVF_min_intersection
The minimum number of batches in which a feature must be present for it to be considered highly variable. Default is `1`.
- HVF
A vector of feature names to use as highly variable features. If `NULL`, the function will use the highly variable features identified by the HVF method.
- do_scaling
Whether to perform scaling. If `TRUE`, the function forces scaling of the data using the `Seurat::ScaleData` function.
- vars_to_regress
A vector of variable names to include as additional regression variables. Default is `NULL`.
- regression_model
The regression model to use for scaling. Options are `"linear"`, `"poisson"`, or `"negativebinomial"`. Default is `"linear"`.
- scale_within_batch
Whether to scale data within each batch. Only valid when the `integration_method` is one of `"Uncorrected"`, `"Seurat"`, `"MNN"`, `"Harmony"`, `"BBKNN"`, `"CSS"`, `"ComBat"`.
- linear_reduction
The linear dimensionality reduction method to use. Options are `"pca"`, `"svd"`, `"ica"`, `"nmf"`, `"mds"`, or `"glmpca"`. Default is `"pca"`.
- linear_reduction_dims
The number of dimensions to keep after linear dimensionality reduction. Default is `50`.
- linear_reduction_dims_use
The dimensions to use for downstream analysis. If `NULL`, all dimensions will be used.
- linear_reduction_params
A list of parameters to pass to the linear dimensionality reduction method.
- force_linear_reduction
Whether to force linear dimensionality reduction even if the specified reduction is already present in the Seurat object.
- CSS_dims_use
A vector specifying the dimensions returned by CSS that will be utilized for downstream cell cluster finding and nonlinear reduction. If set to NULL, all the returned dimensions will be used by default.
- nonlinear_reduction
The nonlinear dimensionality reduction method to use. Options are `"umap"`, `"umap-naive"`, `"tsne"`, `"dm"`, `"phate"`, `"pacmap"`, `"trimap"`, `"largevis"`, or `"fr"`. Default is `"umap"`.
- nonlinear_reduction_dims
The number of dimensions to keep after nonlinear dimensionality reduction. If a vector is provided, different numbers of dimensions can be specified for each method. Default is `c(2, 3)`.
- nonlinear_reduction_params
A list of parameters to pass to the nonlinear dimensionality reduction method.
- force_nonlinear_reduction
Whether to force nonlinear dimensionality reduction even if the specified reduction is already present in the Seurat object. Default is `TRUE`.
- neighbor_metric
The distance metric to use for finding neighbors. Options are `"euclidean"`, `"cosine"`, `"manhattan"`, or `"hamming"`. Default is `"euclidean"`.
- neighbor_k
The number of nearest neighbors to find for each cell. Default is `20`.
- cluster_algorithm
The clustering algorithm to use. Options are `"louvain"`, `"slm"`, or `"leiden"`. Default is `"louvain"`.
- cluster_resolution
The resolution parameter to use for clustering. Larger values result in more clusters. Default is `0.6`.
- CSS_params
A list of parameters for the [simspec::cluster_sim_spectrum](https://github.com/quadbio/simspec) function. Default is `list()`.
- verbose
Whether to print messages. Default is `TRUE`.
- seed
Random seed for reproducibility. Default is `11`.
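
A call might look like the following minimal sketch. It assumes that the package exporting `CSS_integrate` is already attached and that a merged Seurat object named `srt_merge` exists with a metadata column `"tech"` holding the batch labels; both names are hypothetical. The argument values only restate documented options, and the guard keeps the snippet harmless when the function or the object is not available.

```r
# Arguments to override; everything else keeps the defaults listed in Usage.
# "tech" is a hypothetical metadata column holding the batch labels.
css_args <- list(
  batch = "tech",
  HVF_source = "separate",
  nHVF = 2000,
  linear_reduction = "pca",
  linear_reduction_dims = 50,
  nonlinear_reduction = "umap",
  cluster_resolution = 0.6,
  seed = 11
)

# Run the integration only when both the function and the (hypothetical)
# merged Seurat object `srt_merge` are present in the current session.
if (exists("CSS_integrate") && exists("srt_merge")) {
  srt_integrated <- do.call(
    CSS_integrate,
    c(list(srt_merge = srt_merge), css_args)
  )
}
```

Collecting the overrides in a list makes it easy to reuse the same settings across runs or to log them alongside the results.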
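
The effect of `HVF_min_intersection` when `HVF_source = "separate"` can be illustrated in base R. This is a sketch of the idea as described in the argument list, not the package's actual implementation; the gene names, the per-batch sets, and the helper `select_hvf()` are hypothetical.

```r
# Hypothetical per-batch HVF sets, one character vector per batch.
hvf_by_batch <- list(
  batch1 = c("GeneA", "GeneB", "GeneC"),
  batch2 = c("GeneB", "GeneC", "GeneD"),
  batch3 = c("GeneC", "GeneE")
)

# Hypothetical helper: keep features that occur in at least
# `min_intersection` of the per-batch HVF sets.
select_hvf <- function(hvf_list, min_intersection = 1) {
  # Count, for every feature, how many batches list it as an HVF.
  counts <- table(unlist(lapply(hvf_list, unique)))
  names(counts)[as.vector(counts) >= min_intersection]
}

select_hvf(hvf_by_batch, min_intersection = 1)  # union of all per-batch sets
select_hvf(hvf_by_batch, min_intersection = 2)  # features shared by >= 2 batches
```

With the default of `1` every feature found in any batch is kept, so raising the threshold is a way to favor features that are variable across several batches.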
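
`linear_reduction_dims_use` and `CSS_dims_use` are both documented as using all dimensions when set to `NULL`. The sketch below shows that convention on a plain matrix standing in for an embedding; the helper `pick_dims()` and the random data are hypothetical and not part of the package.

```r
set.seed(11)

# Stand-in embedding: 5 cells x 10 dimensions (hypothetical data).
embedding <- matrix(
  rnorm(50),
  nrow = 5,
  dimnames = list(paste0("cell", 1:5), paste0("PC_", 1:10))
)

# Hypothetical helper mirroring the documented convention:
# NULL selects every dimension, otherwise only the requested columns are kept.
pick_dims <- function(emb, dims_use = NULL) {
  if (is.null(dims_use)) {
    dims_use <- seq_len(ncol(emb))
  }
  emb[, dims_use, drop = FALSE]
}

dim(pick_dims(embedding))       # all 10 dimensions are kept
dim(pick_dims(embedding, 1:3))  # only the first 3 dimensions are kept
```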