This function identifies the promoters and enhancers of each gene and calculates the bulk average Hi-C contact matrix between the gene and its enhancers.
Usage
calculateHiCWeights(
SCEGdata,
species,
genome,
focus_gene,
averHicPath,
TSSwindow = 1000,
upstream = 250000,
downstream = 250000,
verbose = TRUE
)
Arguments
- SCEGdata
Preprocessed data for SCEG-HiC.
- species
Character string specifying the species name. Supported values are "Homo sapiens" or "Mus musculus".
- genome
Character string specifying the genome assembly. Supported values are "hg38", "hg19", "mm10", or "mm9".
- focus_gene
Character vector of gene names to focus on.
- averHicPath
Path to the bulk average Hi-C data.
- TSSwindow
Numeric specifying the number of base pairs to extend upstream and downstream around each TSS to define promoters. Default is 1000 bp (total window size of 2000 bp).
- upstream
Numeric specifying the number of base pairs upstream of each TSS to define enhancers. Default is 250,000 bp (250 kb).
- downstream
Numeric specifying the number of base pairs downstream of each TSS to define enhancers. Default is 250,000 bp (250 kb).
- verbose
Logical. Whether to print progress messages and warnings. Default is TRUE.
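The window arguments above can be illustrated with a little arithmetic. The following is a minimal sketch in base R, not the package's internal code: the TSS coordinate is a hypothetical value chosen for illustration, and the sketch assumes the windows are applied symmetrically in genomic coordinates without regard to strand.

```r
# Hypothetical TSS position (illustration only, not taken from the package)
tss <- 203870000

# Defaults from the Usage section
TSSwindow  <- 1000
upstream   <- 250000
downstream <- 250000

# Promoter: TSSwindow bp on either side of the TSS
promoter <- c(start = tss - TSSwindow, end = tss + TSSwindow)

# Enhancer search region: upstream/downstream bp around the TSS,
# clamped at 0 so the start cannot become negative
# (assumption: strand is ignored in this sketch)
search_region <- c(start = max(0, tss - upstream), end = tss + downstream)

promoter_width <- unname(promoter["end"] - promoter["start"])
region_width   <- unname(search_region["end"] - search_region["start"])

promoter_width  # total promoter window in bp
region_width    # total enhancer search window in bp
```

With the defaults, the promoter window spans 2,000 bp and the enhancer search region spans 500 kb around the TSS.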
Value
A named list where each element corresponds to a focus gene and contains:
- promoters: data.frame of promoter regions.
- enhancers: data.frame of enhancer regions.
- contact: bulk average Hi-C contact matrix between the gene and enhancers.
Examples
data(multiomic_small)
SCEGdata <- process_data(multiomic_small, k_neigh = 5, max_overlap = 0.5)
#> Generating aggregated data
#> Aggregating cluster 0
#> Sample cells randomly.
#> There are 11 samples
#> Aggregating cluster 1
#> Sample cells randomly.
#> There are 11 samples
fpath <- system.file("extdata", package = "SCEGHiC")
gene <- c("TRABD2A", "GNLY", "MFSD6", "CTLA4", "LCLAT1", "NCK2", "GALM", "TMSB10", "ID2", "CXCR4")
weight <- calculateHiCWeights(SCEGdata, species = "Homo sapiens", genome = "hg38", focus_gene = gene, averHicPath = fpath)
#> Processing chromosome chr2...
#> Found 10 TSS loci on chr2.
#> Calculating Hi-C weights for gene TRABD2A...
#> Calculating Hi-C weights for gene GNLY...
#> Calculating Hi-C weights for gene MFSD6...
#> Calculating Hi-C weights for gene CXCR4...
#> Calculating Hi-C weights for gene CTLA4...
#> Calculating Hi-C weights for gene LCLAT1...
#> Calculating Hi-C weights for gene NCK2...
#> Calculating Hi-C weights for gene ID2...
#> Calculating Hi-C weights for gene GALM...
#> Calculating Hi-C weights for gene TMSB10...
#> Finished calculating Hi-C weights for all genes.
weight[["CTLA4"]]$promoters
#> character(0)
weight[["CTLA4"]]$enhancers
#> [1] "chr2_203623664_203623982" "chr2_203682042_203682957"
#> [3] "chr2_203730093_203731243" "chr2_203786816_203787404"
#> [5] "chr2_203830381_203830563" "chr2_203932737_203935679"
#> [7] "chr2_203945543_203945939" "chr2_203992800_203993979"
#> [9] "chr2_204109915_204110871"
head(weight[["CTLA4"]]$contact)
#> element1 element2 score
#> <char> <char> <num>
#> 1: CTLA4 chr2_203623664_203623982 0.0009179376
#> 2: CTLA4 chr2_203682042_203682957 0.0007176351
#> 3: CTLA4 chr2_203730093_203731243 0.0013151936
#> 4: CTLA4 chr2_203786816_203787404 0.0013789783
#> 5: CTLA4 chr2_203830381_203830563 0.0038276615
#> 6: CTLA4 chr2_203932737_203935679 0.0019456916
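As a follow-up, the contact table can be ranked to find the enhancer with the strongest bulk Hi-C contact. The sketch below is only an illustration: it rebuilds the six rows printed above as a plain data.frame (the names `contact`, `ranked`, `top_enhancer`, and `coords` are hypothetical), and it assumes element identifiers always follow the `chr_start_end` pattern seen in the output.

```r
# Copy of the six rows printed by head(weight[["CTLA4"]]$contact)
contact <- data.frame(
  element1 = rep("CTLA4", 6),
  element2 = c(
    "chr2_203623664_203623982", "chr2_203682042_203682957",
    "chr2_203730093_203731243", "chr2_203786816_203787404",
    "chr2_203830381_203830563", "chr2_203932737_203935679"
  ),
  score = c(0.0009179376, 0.0007176351, 0.0013151936,
            0.0013789783, 0.0038276615, 0.0019456916),
  stringsAsFactors = FALSE
)

# Rank enhancers by contact score, strongest first
ranked <- contact[order(contact$score, decreasing = TRUE), ]
top_enhancer <- ranked$element2[1]

# Split "chr_start_end" identifiers into coordinate columns
# (assumes exactly this three-part format)
parts <- strsplit(contact$element2, "_", fixed = TRUE)
coords <- data.frame(
  chr   = vapply(parts, `[`, character(1), 1),
  start = as.integer(vapply(parts, `[`, character(1), 2)),
  end   = as.integer(vapply(parts, `[`, character(1), 3)),
  stringsAsFactors = FALSE
)

top_enhancer
head(coords)
```

The same ordering call should also work on the object returned by the function, since the printed column types suggest it is a data.table, which accepts this indexing form.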