Create aggregated data for a certain cluster
Source: R/preprocessing.R
Function to generate aggregated inputs for a certain cluster. generate_aggregated_datasets
takes sparse data as input. It aggregates binary accessibility scores (or gene expression)
over groups of neighboring cells, keeping a group only if it does not overlap any existing group by more than the max_overlap fraction of its cells (default 0.8).
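The overlap rule can be illustrated with a small base R sketch. This is not the package's implementation: the toy coordinates, the greedy visiting order, and the variable names `coords`, `knn`, and `kept` are assumptions made for illustration. Only `k_neigh`, `max_overlap`, and the idea of rejecting heavily overlapping groups come from the description above.

```r
# Toy illustration only; not the package's implementation.
set.seed(123)
n_cells     <- 30
k_neigh     <- 5
max_overlap <- 0.8

# Hypothetical low-dimensional coordinates (cells x 2), standing in for cell_coord.
coords <- matrix(rnorm(n_cells * 2), ncol = 2)

# k nearest neighbours of every cell (each cell is its own nearest neighbour).
d   <- as.matrix(dist(coords))
knn <- t(apply(d, 1, function(row) order(row)[seq_len(k_neigh)]))

# Visit cells in random order and keep a candidate group only if it shares
# at most max_overlap * k_neigh cells with every group already kept.
kept <- list()
for (i in sample(n_cells)) {
  shared <- vapply(kept, function(g) length(intersect(g, knn[i, ])), integer(1))
  if (all(shared <= max_overlap * k_neigh)) {
    kept[[length(kept) + 1]] <- knn[i, ]
  }
}

length(kept)  # number of groups that passed the overlap filter
```

Lowering `max_overlap` in this sketch yields fewer, more distinct groups; raising it toward 1 keeps nearly every candidate group.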
Usage
generate_aggregated_datasets(
object,
cell_coord,
rna_assay = "RNA",
atac_assay = "peaks",
k_neigh = 50,
atacbinary = TRUE,
max_overlap = 0.8,
seed = 123,
verbose = TRUE
)
Arguments
- object
A Seurat object.
- cell_coord
A similarity matrix or dimensionality reduction (e.g., PCA, UMAP) used for identifying neighbors.
- rna_assay
Character. Name of the assay containing gene expression data.
- atac_assay
Character. Name of the assay containing peak (chromatin accessibility) data.
- k_neigh
Integer. Number of neighboring cells to aggregate per group (default is 50).
- atacbinary
Logical. Should the aggregated scATAC-seq matrix be binarized?
- max_overlap
Numeric. Maximum allowed overlap ratio between two aggregated groups (default is 0.8).
- seed
Integer. Random seed.
- verbose
Logical. Should progress messages and warnings be printed?
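To make the effect of atacbinary concrete, the following base R sketch aggregates a small binary peak-by-cell matrix over two cell groups and then binarizes the result. The matrix, the group memberships, and the names `atac`, `groups`, `agg`, and `agg_binary` are hypothetical. Reading binarization as "at least one cell in the group is accessible" is an assumption, not a statement about the package's internals.

```r
# Toy illustration only; object names here are hypothetical.
# Binary peak-by-cell accessibility matrix (4 peaks x 6 cells).
atac <- matrix(
  c(1, 0, 0, 1, 0, 0,
    0, 0, 1, 1, 0, 1,
    0, 0, 0, 0, 0, 0,
    1, 1, 1, 0, 0, 1),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("peak", 1:4), paste0("cell", 1:6))
)

# Two hypothetical cell groups, given as column indices.
groups <- list(agg1 = c(1, 2, 3), agg2 = c(4, 5, 6))

# Sum accessibility over the cells of each group (result: peaks x groups).
agg <- vapply(
  groups,
  function(idx) rowSums(atac[, idx, drop = FALSE]),
  numeric(nrow(atac))
)

# Assumed meaning of atacbinary = TRUE: mark a peak as accessible in a group
# when at least one of the group's cells is accessible.
agg_binary <- (agg > 0) * 1

agg
agg_binary
```

In a real analysis the input would be a sparse matrix taken from the assay named by atac_assay, and the groups would come from the neighbor search on cell_coord.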