This function runs the SCEG-HiC model to infer gene-enhancer (gene-peak) links using preprocessed single-cell multi-omics data and bulk average Hi-C data as a prior. It calculates a partial correlation matrix via the weighted graphical lasso (wglasso) approach and selects gene-peak pairs accordingly.
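As a rough illustration of the quantity being estimated, the sketch below derives partial correlations from a precision (inverse covariance) matrix in base R. It is not the package's implementation: the data are simulated, the variable names are made up for the example, and the precision matrix is unpenalized, whereas a weighted graphical lasso would estimate a sparse one with per-entry penalties.

```r
# Simulated data (assumption): one gene and two peaks, where the gene
# depends on peak1 only and peak2 is merely correlated with peak1.
set.seed(1)
n <- 200
peak1 <- rnorm(n)
peak2 <- 0.8 * peak1 + rnorm(n, sd = 0.5)
gene  <- 0.9 * peak1 + rnorm(n, sd = 0.5)
X <- cbind(gene = gene, peak1 = peak1, peak2 = peak2)

# Unpenalized precision matrix; a graphical lasso would replace this step
# with a penalized (sparse) estimate.
Theta <- solve(cov(X))

# Partial correlation: -Theta_ij / sqrt(Theta_ii * Theta_jj), diagonal set to 1.
pcor <- -cov2cor(Theta)
diag(pcor) <- 1

# Partial correlation of the gene with each peak, given the other peak.
round(pcor["gene", c("peak1", "peak2")], 3)
```

In this toy setting the gene-peak1 entry should be large while the gene-peak2 entry stays near zero, which is the behaviour that makes partial correlation useful for separating direct from indirect gene-peak associations.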
Usage
Run_SCEG_HiC(
  SCEGdata,
  HiCWeights,
  method = "wglasso",
  focus_gene,
  normalizeMethod = "1",
  cutoff = NULL,
  verbose = TRUE,
  alpha = NULL
)
Arguments
- SCEGdata
A list containing data preprocessed by SCEG-HiC.
- HiCWeights
A data frame of raw bulk average Hi-C interaction weights. Output of calculateHiCWeights().
- method
The method used for gene network reconstruction. Default is "wglasso".
- focus_gene
A character vector of gene symbols to focus on.
- normalizeMethod
Method used to normalize the average Hi-C scores. Default is "1".
"1": Normalize by rank scores.
"2": Normalize by -log10 transformation.
"3": Binarize the scores (values > 0 become 1; values ≤ 0 become 0).
"4": Normalize by min-max scaling.
- cutoff
Threshold for selecting gene-peak pairs. Default is NULL. If aggregate = TRUE, we recommend setting cutoff = 0.01. If aggregate = FALSE, we recommend cutoff = 0.001.
- verbose
Logical. Should progress messages and warnings be printed?
- alpha
Multiplicative scaling factor for the penalty parameter (alpha * rho).
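The four normalizeMethod options can be pictured with a small base R sketch. The helper normalize_scores() below is hypothetical and not part of SCEGHiC, and the exact formulas the package uses (tie handling for ranks, treatment of zeros under -log10, scaling ranges) are assumptions made for illustration only.

```r
# Hypothetical helper (not a SCEGHiC function): one plausible reading of
# the four normalization options applied to a numeric vector of scores.
normalize_scores <- function(x, method = c("1", "2", "3", "4")) {
  method <- match.arg(method)
  switch(method,
    "1" = rank(x, ties.method = "average") / length(x), # rank scores scaled to (0, 1]
    "2" = -log10(x),                                    # gives Inf for zero scores
    "3" = as.numeric(x > 0),                            # values > 0 become 1, else 0
    "4" = (x - min(x)) / (max(x) - min(x))              # min-max scaling to [0, 1]
  )
}

scores <- c(0, 0.5, 2)
normalize_scores(scores, "1")
normalize_scores(scores, "3")
normalize_scores(scores, "4")
normalize_scores(scores[scores > 0], "2")
```

How the package handles zero scores under option "2" or constant scores under option "4" is not stated in this page, so the sketch leaves those edge cases unguarded.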
Examples
data(multiomic_small)
SCEGdata <- process_data(multiomic_small, k_neigh = 5, max_overlap = 0.5)
#> Generating aggregated data
#> Aggregating cluster 0
#> Sample cells randomly.
#> There are 11 samples
#> Aggregating cluster 1
#> Sample cells randomly.
#> There are 11 samples
fpath <- system.file("extdata", package = "SCEGHiC")
gene <- c("TRABD2A", "GNLY", "MFSD6", "CTLA4", "LCLAT1", "NCK2", "GALM", "TMSB10", "ID2", "CXCR4")
weight <- calculateHiCWeights(SCEGdata, species = "Homo sapiens", genome = "hg38", focus_gene = gene, averHicPath = fpath)
#> Processing chromosome chr2...
#> Found 10 TSS loci on chr2.
#> Calculating Hi-C weights for gene TRABD2A...
#> Calculating Hi-C weights for gene GNLY...
#> Calculating Hi-C weights for gene MFSD6...
#> Calculating Hi-C weights for gene CXCR4...
#> Calculating Hi-C weights for gene CTLA4...
#> Calculating Hi-C weights for gene LCLAT1...
#> Calculating Hi-C weights for gene NCK2...
#> Calculating Hi-C weights for gene ID2...
#> Calculating Hi-C weights for gene GALM...
#> Calculating Hi-C weights for gene TMSB10...
#> Finished calculating Hi-C weights for all genes.
results_SCEGHiC <- Run_SCEG_HiC(SCEGdata, weight, focus_gene = gene)
#> Total predicted genes: 10
#> Running model for gene: TRABD2A
#> [1] "The optimal penalty parameter (rho) selected by BIC is: 0.43"
#> Running model for gene: GNLY
#> [1] "The optimal penalty parameter (rho) selected by BIC is: 0.19"
#> Running model for gene: MFSD6
#> [1] "The optimal penalty parameter (rho) selected by BIC is: 0.22"
#> Running model for gene: CXCR4
#> [1] "The optimal penalty parameter (rho) selected by BIC is: 0.14"
#> Running model for gene: CTLA4
#> [1] "The optimal penalty parameter (rho) selected by BIC is: 0.17"
#> Running model for gene: LCLAT1
#> [1] "The optimal penalty parameter (rho) selected by BIC is: 0.41"
#> Running model for gene: NCK2
#> [1] "The optimal penalty parameter (rho) selected by BIC is: 0.25"
#> Running model for gene: ID2
#> [1] "The optimal penalty parameter (rho) selected by BIC is: 0.13"
#> Running model for gene: GALM
#> [1] "The optimal penalty parameter (rho) selected by BIC is: 0.11"
#> Running model for gene: TMSB10
#> [1] "The optimal penalty parameter (rho) selected by BIC is: 0.44"
results_SCEGHiC[results_SCEGHiC$gene == "CTLA4", ]
#>                           gene                     peak       score
#> chr2_203623664_203623982 CTLA4 chr2_203623664_203623982  0.05790189
#> chr2_203730093_203731243 CTLA4 chr2_203730093_203731243  0.25250722
#> chr2_203786816_203787404 CTLA4 chr2_203786816_203787404  0.00000000
#> chr2_203932737_203935679 CTLA4 chr2_203932737_203935679 -0.12877374
#> chr2_203945543_203945939 CTLA4 chr2_203945543_203945939  0.00000000
#> chr2_203992800_203993979 CTLA4 chr2_203992800_203993979  0.47307452
#> chr2_204109915_204110871 CTLA4 chr2_204109915_204110871  0.00000000
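With cutoff = NULL the result keeps every candidate pair, including zero and negative scores. The sketch below shows one way such a table could be thresholded afterwards in base R, using the CTLA4 rows printed above and the 0.01 value recommended for aggregated data. Filtering on the signed score is an assumption; this page does not say whether the function itself compares the signed or the absolute score against cutoff.

```r
# CTLA4 rows copied from the example output above.
ctla4 <- data.frame(
  gene = "CTLA4",
  peak = c(
    "chr2_203623664_203623982", "chr2_203730093_203731243",
    "chr2_203786816_203787404", "chr2_203932737_203935679",
    "chr2_203945543_203945939", "chr2_203992800_203993979",
    "chr2_204109915_204110871"
  ),
  score = c(0.05790189, 0.25250722, 0, -0.12877374, 0, 0.47307452, 0),
  stringsAsFactors = FALSE
)

# Keep pairs whose signed score exceeds the threshold (assumption),
# then order them from strongest to weakest.
cutoff <- 0.01
selected <- ctla4[ctla4$score > cutoff, ]
selected <- selected[order(-selected$score), ]
selected
```

Under this signed-score reading, three of the seven CTLA4 peaks remain; a reading based on absolute values would also retain the negatively scored peak.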