Function to generate aggregated inputs for a certain cluster. generate_aggregated_datasets takes sparse data as input. It aggregates binary accessibility scores (or gene expression) per group of neighboring cells, and keeps a group only if the fraction of cells it shares with any existing group does not exceed max_overlap (default 0.8).
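The overlap rule described above can be illustrated with a minimal base R sketch. This is not the package's implementation: the helper select_groups, the input nn_idx (one row of neighbor indices per candidate group), and the greedy, in-order selection are all assumptions made for illustration.

```r
# Hypothetical helper, not part of the package: greedy overlap filtering.
# nn_idx: matrix with one candidate group per row, holding k cell indices.
select_groups <- function(nn_idx, max_overlap = 0.8) {
  k <- ncol(nn_idx)
  kept <- integer(0)
  for (i in seq_len(nrow(nn_idx))) {
    # Fraction of candidate i's cells shared with each already-kept group
    shared <- vapply(
      kept,
      function(j) length(intersect(nn_idx[i, ], nn_idx[j, ])) / k,
      numeric(1)
    )
    # Keep the candidate only if no kept group exceeds the allowed overlap
    if (all(shared <= max_overlap)) kept <- c(kept, i)
  }
  kept
}

nn_idx <- rbind(
  c(1, 2, 3, 4, 5),
  c(1, 2, 3, 6, 7),    # shares 3 of 5 cells with row 1
  c(1, 2, 3, 4, 5),    # identical to row 1
  c(8, 9, 10, 11, 12)  # shares no cells with earlier rows
)
kept <- select_groups(nn_idx, max_overlap = 0.8)
```

In this toy input, the third candidate duplicates the first and is discarded, while the other three are retained.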

Usage

generate_aggregated_datasets(
  object,
  cell_coord,
  rna_assay = "RNA",
  atac_assay = "peaks",
  k_neigh = 50,
  atacbinary = TRUE,
  max_overlap = 0.8,
  seed = 123,
  verbose = TRUE
)

Arguments

object

A Seurat object.

cell_coord

A similarity matrix or dimensionality reduction (e.g., PCA, UMAP) used for identifying neighbors.

rna_assay

Character. Name of the assay containing gene expression data.

atac_assay

Character. Name of the assay containing peak (chromatin accessibility) data.

k_neigh

Integer. Number of neighboring cells to aggregate per group (default is 50).

atacbinary

Logical. Should the aggregated scATAC-seq matrix be binarized?

max_overlap

Numeric. Maximum allowed overlap ratio between two aggregated groups (default is 0.8).

seed

Integer. Random seed.

verbose

Logical. Should progress messages and warnings be printed?

Value

A matrix or sparse matrix containing aggregated accessibility or expression values.
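As a rough illustration of what an aggregated matrix could look like, the sketch below sums per-cell counts within each group and then binarizes the result, as atacbinary = TRUE suggests. Summation as the aggregation step, the toy counts matrix, and the groups list are assumptions for illustration only; dense base R matrices are used here to keep the example dependency-free, whereas real data would typically be sparse.

```r
# Toy features-by-cells count matrix (illustrative values only)
counts <- matrix(
  c(1, 0, 0, 1,
    0, 0, 1, 1,
    0, 0, 0, 0),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("peak", 1:3), paste0("cell", 1:4))
)

# Hypothetical groups: each entry lists the cell indices in one group
groups <- list(g1 = c(1, 2), g2 = c(3, 4))

# Cells-by-groups indicator matrix (1 if the cell belongs to the group)
membership <- sapply(groups, function(idx) {
  as.numeric(seq_len(ncol(counts)) %in% idx)
})

# Assumed aggregation: sum counts over the cells of each group
agg <- counts %*% membership

# Binarize: 1 if any cell in the group has a nonzero count, else 0
agg_binary <- (agg > 0) * 1
```

The result has one row per feature and one column per aggregated group.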