Enhancer_enrichment

Last updated: 2025-05-15

Checks: 7 0

Knit directory: ATAC_learning/

This reproducible R Markdown analysis was created with workflowr (version 1.7.1). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.

R Markdown file: up-to-date

Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.

Environment: empty

Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproduciblity it’s best to always run the code in an empty environment.

Seed: set.seed(20231016)

The command set.seed(20231016) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.

Session information: recorded

Great job! Recording the operating system, R version, and package versions is critical for reproducibility.

Cache: none

Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.

File paths: relative

Great job! Using relative paths to the files within your workflowr project makes it easier to run your code on other machines.

Repository version: b62ef0b

Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility.

The results in this page were generated with repository version b62ef0b. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.

Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:


Ignored files:
    Ignored:    .RData
    Ignored:    .Rhistory
    Ignored:    .Rproj.user/
    Ignored:    analysis/H3K27ac_integration_noM.Rmd
    Ignored:    data/ACresp_SNP_table.csv
    Ignored:    data/ARR_SNP_table.csv
    Ignored:    data/All_merged_peaks.tsv
    Ignored:    data/CAD_gwas_dataframe.RDS
    Ignored:    data/CTX_SNP_table.csv
    Ignored:    data/Collapsed_expressed_NG_peak_table.csv
    Ignored:    data/DEG_toplist_sep_n45.RDS
    Ignored:    data/FRiP_first_run.txt
    Ignored:    data/Final_four_data/
    Ignored:    data/Frip_1_reads.csv
    Ignored:    data/Frip_2_reads.csv
    Ignored:    data/Frip_3_reads.csv
    Ignored:    data/Frip_4_reads.csv
    Ignored:    data/Frip_5_reads.csv
    Ignored:    data/Frip_6_reads.csv
    Ignored:    data/GO_KEGG_analysis/
    Ignored:    data/HF_SNP_table.csv
    Ignored:    data/Ind1_75DA24h_dedup_peaks.csv
    Ignored:    data/Ind1_TSS_peaks.RDS
    Ignored:    data/Ind1_firstfragment_files.txt
    Ignored:    data/Ind1_fragment_files.txt
    Ignored:    data/Ind1_peaks_list.RDS
    Ignored:    data/Ind1_summary.txt
    Ignored:    data/Ind2_TSS_peaks.RDS
    Ignored:    data/Ind2_fragment_files.txt
    Ignored:    data/Ind2_peaks_list.RDS
    Ignored:    data/Ind2_summary.txt
    Ignored:    data/Ind3_TSS_peaks.RDS
    Ignored:    data/Ind3_fragment_files.txt
    Ignored:    data/Ind3_peaks_list.RDS
    Ignored:    data/Ind3_summary.txt
    Ignored:    data/Ind4_79B24h_dedup_peaks.csv
    Ignored:    data/Ind4_TSS_peaks.RDS
    Ignored:    data/Ind4_V24h_fraglength.txt
    Ignored:    data/Ind4_fragment_files.txt
    Ignored:    data/Ind4_fragment_filesN.txt
    Ignored:    data/Ind4_peaks_list.RDS
    Ignored:    data/Ind4_summary.txt
    Ignored:    data/Ind5_TSS_peaks.RDS
    Ignored:    data/Ind5_fragment_files.txt
    Ignored:    data/Ind5_fragment_filesN.txt
    Ignored:    data/Ind5_peaks_list.RDS
    Ignored:    data/Ind5_summary.txt
    Ignored:    data/Ind6_TSS_peaks.RDS
    Ignored:    data/Ind6_fragment_files.txt
    Ignored:    data/Ind6_peaks_list.RDS
    Ignored:    data/Ind6_summary.txt
    Ignored:    data/Knowles_4.RDS
    Ignored:    data/Knowles_5.RDS
    Ignored:    data/Knowles_6.RDS
    Ignored:    data/LiSiLTDNRe_TE_df.RDS
    Ignored:    data/MI_gwas.RDS
    Ignored:    data/SNP_GWAS_PEAK_MRC_id
    Ignored:    data/SNP_GWAS_PEAK_MRC_id.csv
    Ignored:    data/SNP_gene_cat_list.tsv
    Ignored:    data/SNP_supp_schneider.RDS
    Ignored:    data/TE_info/
    Ignored:    data/TFmapnames.RDS
    Ignored:    data/all_TSSE_scores.RDS
    Ignored:    data/all_four_filtered_counts.txt
    Ignored:    data/aln_run1_results.txt
    Ignored:    data/anno_ind1_DA24h.RDS
    Ignored:    data/anno_ind4_V24h.RDS
    Ignored:    data/annotated_gwas_SNPS.csv
    Ignored:    data/background_n45_he_peaks.RDS
    Ignored:    data/cardiac_muscle_FRIP.csv
    Ignored:    data/cardiomyocyte_FRIP.csv
    Ignored:    data/col_ng_peak.csv
    Ignored:    data/cormotif_full_4_run.RDS
    Ignored:    data/cormotif_full_4_run_he.RDS
    Ignored:    data/cormotif_full_6_run.RDS
    Ignored:    data/cormotif_full_6_run_he.RDS
    Ignored:    data/cormotif_probability_45_list.csv
    Ignored:    data/cormotif_probability_45_list_he.csv
    Ignored:    data/cormotif_probability_all_6_list.csv
    Ignored:    data/cormotif_probability_all_6_list_he.csv
    Ignored:    data/datasave.RDS
    Ignored:    data/embryo_heart_FRIP.csv
    Ignored:    data/enhancer_list_ENCFF126UHK.bed
    Ignored:    data/enhancerdata/
    Ignored:    data/filt_Peaks_efit2.RDS
    Ignored:    data/filt_Peaks_efit2_bl.RDS
    Ignored:    data/filt_Peaks_efit2_n45.RDS
    Ignored:    data/first_Peaksummarycounts.csv
    Ignored:    data/first_run_frag_counts.txt
    Ignored:    data/full_bedfiles/
    Ignored:    data/gene_ref.csv
    Ignored:    data/gwas_1_dataframe.RDS
    Ignored:    data/gwas_2_dataframe.RDS
    Ignored:    data/gwas_3_dataframe.RDS
    Ignored:    data/gwas_4_dataframe.RDS
    Ignored:    data/gwas_5_dataframe.RDS
    Ignored:    data/high_conf_peak_counts.csv
    Ignored:    data/high_conf_peak_counts.txt
    Ignored:    data/high_conf_peaks_bl_counts.txt
    Ignored:    data/high_conf_peaks_counts.txt
    Ignored:    data/hits_files/
    Ignored:    data/hyper_files/
    Ignored:    data/hypo_files/
    Ignored:    data/ind1_DA24hpeaks.RDS
    Ignored:    data/ind1_TSSE.RDS
    Ignored:    data/ind2_TSSE.RDS
    Ignored:    data/ind3_TSSE.RDS
    Ignored:    data/ind4_TSSE.RDS
    Ignored:    data/ind4_V24hpeaks.RDS
    Ignored:    data/ind5_TSSE.RDS
    Ignored:    data/ind6_TSSE.RDS
    Ignored:    data/initial_complete_stats_run1.txt
    Ignored:    data/left_ventricle_FRIP.csv
    Ignored:    data/median_24_lfc.RDS
    Ignored:    data/median_3_lfc.RDS
    Ignored:    data/mergedPeads.gff
    Ignored:    data/mergedPeaks.gff
    Ignored:    data/motif_list_full
    Ignored:    data/motif_list_n45
    Ignored:    data/motif_list_n45.RDS
    Ignored:    data/multiqc_fastqc_run1.txt
    Ignored:    data/multiqc_fastqc_run2.txt
    Ignored:    data/multiqc_genestat_run1.txt
    Ignored:    data/multiqc_genestat_run2.txt
    Ignored:    data/my_hc_filt_counts.RDS
    Ignored:    data/my_hc_filt_counts_n45.RDS
    Ignored:    data/n45_bedfiles/
    Ignored:    data/n45_files
    Ignored:    data/other_papers/
    Ignored:    data/peakAnnoList_1.RDS
    Ignored:    data/peakAnnoList_2.RDS
    Ignored:    data/peakAnnoList_24_full.RDS
    Ignored:    data/peakAnnoList_24_n45.RDS
    Ignored:    data/peakAnnoList_3.RDS
    Ignored:    data/peakAnnoList_3_full.RDS
    Ignored:    data/peakAnnoList_3_n45.RDS
    Ignored:    data/peakAnnoList_4.RDS
    Ignored:    data/peakAnnoList_5.RDS
    Ignored:    data/peakAnnoList_6.RDS
    Ignored:    data/peakAnnoList_Eight.RDS
    Ignored:    data/peakAnnoList_full_motif.RDS
    Ignored:    data/peakAnnoList_n45_motif.RDS
    Ignored:    data/siglist_full.RDS
    Ignored:    data/siglist_n45.RDS
    Ignored:    data/summarized_peaks_dataframe.txt
    Ignored:    data/summary_peakIDandReHeat.csv
    Ignored:    data/test.list.RDS
    Ignored:    data/testnames.txt
    Ignored:    data/toplist_6.RDS
    Ignored:    data/toplist_full.RDS
    Ignored:    data/toplist_full_DAR_6.RDS
    Ignored:    data/toplist_n45.RDS
    Ignored:    data/trimmed_seq_length.csv
    Ignored:    data/unclassified_full_set_peaks.RDS
    Ignored:    data/unclassified_n45_set_peaks.RDS
    Ignored:    data/xstreme/

Untracked files:
    Untracked:  RNA_seq_integration.Rmd
    Untracked:  analysis/.gitignore
    Untracked:  analysis/Diagnosis-tmm.Rmd
    Untracked:  analysis/Expressed_RNA_associations.Rmd
    Untracked:  analysis/LFC_corr.Rmd
    Untracked:  analysis/SVA.Rmd
    Untracked:  analysis/Tan2020.Rmd
    Untracked:  analysis/making_master_peaks_list.Rmd
    Untracked:  analysis/my_hc_filt_counts.csv
    Untracked:  code/IGV_snapshot_code.R
    Untracked:  code/LongDARlist.R
    Untracked:  code/just_for_Fun.R
    Untracked:  output/cormotif_probability_45_list.csv
    Untracked:  output/cormotif_probability_all_6_list.csv
    Untracked:  setup.RData

Unstaged changes:
    Modified:   ATAC_learning.Rproj
    Modified:   analysis/AF_HF_SNPs.Rmd
    Modified:   analysis/Cardiotox_SNPs.Rmd
    Modified:   analysis/H3K27ac_cormotif.Rmd
    Modified:   analysis/Jaspar_motif.Rmd
    Modified:   analysis/Jaspar_motif_ff.Rmd
    Modified:   analysis/RNA_seq_integration.Rmd
    Modified:   analysis/final_four_analysis.Rmd

Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.

These are the previous versions of the repository in which changes were made to the R Markdown (analysis/Enhancer_enrichment.Rmd) and HTML (docs/Enhancer_enrichment.html) files. If you’ve configured a remote Git repository (see ?wflow_git_remote), click on the hyperlinks in the table below to view the files as they were in that past version.

File	Version	Author	Date	Message
Rmd	b62ef0b	reneeisnowhere	2025-05-15	updates and verification of runs
html	5e6e462	reneeisnowhere	2025-05-07	Build site.
Rmd	2db35c7	reneeisnowhere	2025-05-07	updates to analysis

library(tidyverse)
library(kableExtra)
library(broom)
library(RColorBrewer)
library(ChIPseeker)
library("TxDb.Hsapiens.UCSC.hg38.knownGene")
library("org.Hs.eg.db")
library(rtracklayer)
library(edgeR)
library(ggfortify)
library(limma)
library(readr)
library(BiocGenerics)
library(gridExtra)
library(VennDiagram)
library(scales)
library(BiocParallel)
library(ggpubr)
library(devtools)
library(biomaRt)
library(eulerr)
library(smplot2)
library(genomation)
library(ggsignif)
library(plyranges)
library(ggrepel)
library(epitools)
library(circlize)

Collapsed_peaks <- read_delim("data/Final_four_data/collapsed_new_peaks.txt",
                              delim = "\t", 
                              escape_double = FALSE, 
                              trim_ws = TRUE)

Motif_list_gr <- readRDS("data/Final_four_data/re_analysis/Motif_list_granges.RDS")
##order specific
df_list <- plyr::llply(Motif_list_gr, as.data.frame)
### no change motif_list_gr names so they do not overwrite the dataframes
names(Motif_list_gr) <- paste0(names(Motif_list_gr), "_gr")
list2env(Motif_list_gr,envir= .GlobalEnv)

<environment: R_GlobalEnv>

list2env(df_list,envir= .GlobalEnv)

<environment: R_GlobalEnv>

mrc_lookup <- bind_rows(
  (EAR_open  %>% dplyr::select(Peakid) %>% mutate(mrc = "EAR_open")),  
  (EAR_close %>%  dplyr::select(Peakid) %>%mutate(mrc = "EAR_close")),
  (ESR_open  %>%  dplyr::select(Peakid) %>%mutate(mrc = "ESR_open")),
  (ESR_close %>%  dplyr::select(Peakid) %>%mutate(mrc = "ESR_close")),
  (ESR_opcl   %>%  dplyr::select(Peakid) %>%mutate(mrc = "ESR_opcl")),
  (ESR_clop   %>%  dplyr::select(Peakid) %>%mutate(mrc = "ESR_clop")),
  (LR_open   %>%  dplyr::select(Peakid) %>%mutate(mrc = "LR_open")),
  (LR_close  %>%  dplyr::select(Peakid) %>%mutate(mrc = "LR_close")),
  (NR        %>%  dplyr::select(Peakid) %>%mutate(mrc = "NR"))
) %>%
  distinct(Peakid, mrc) 

final_peaks <- Collapsed_peaks %>% 
  dplyr::filter(Peakid %in% mcols(all_regions_gr)$Peakid) %>%
  left_join(.,(mrc_lookup %>% 
              dplyr::select(Peakid, mrc)), by = "Peakid") %>%
  GRanges()

Enhancers and Response clusters

First: obtained a list of cis Regulatory Elements from Encode Screen [(https://screen.encodeproject.org/#)]

cREs_HLV_46F <- genomation::readBed("data/enhancerdata/ENCFF867HAD_ENCFF152PBB_ENCFF352YYH_ENCFF252IVK.7group.bed")

# cREs_HLV_53F <- genomation::readBed("data/enhancerdata/ENCFF417JSF_ENCFF651XRK_ENCFF320IPT_ENCFF440RUS.7group.bed") 
 
NR_cREs <- join_overlap_intersect(NR_gr,cREs_HLV_46F)
LR_open_cREs <- join_overlap_intersect(LR_open_gr,cREs_HLV_46F)
LR_close_cREs <- join_overlap_intersect(LR_close_gr,cREs_HLV_46F)
ESR_open_cREs <- join_overlap_intersect(ESR_open_gr,cREs_HLV_46F)
ESR_close_cREs <- join_overlap_intersect(ESR_close_gr,cREs_HLV_46F)

ESR_opcl_cREs <- join_overlap_intersect(ESR_opcl_gr, cREs_HLV_46F)
ESR_clop_cREs <- join_overlap_intersect(ESR_clop_gr, cREs_HLV_46F)
EAR_open_cREs <- join_overlap_intersect(EAR_open_gr,cREs_HLV_46F)
EAR_close_cREs <- join_overlap_intersect(EAR_close_gr,cREs_HLV_46F)
### These unique peaks are cREs that contain all types of cREs such as
# [1] "Low-DNase"                "DNase-only"               "CTCF-only,CTCF-bound"    
#  [4] "PLS,CTCF-bound"           "PLS"                      "dELS"                    
#  [7] "pELS"                     "DNase-H3K4me3"            "DNase-H3K4me3,CTCF-bound"
# [10] "dELS,CTCF-bound"          "pELS,CTCF-bound" #### NOT the exact ones I am interested in.      
 uni_EAR_open  <-  EAR_open_cREs %>% as.data.frame() %>% distinct(Peakid)
 uni_EAR_close <- EAR_close_cREs %>% as.data.frame() %>% distinct(Peakid)
 uni_ESR_open <- ESR_open_cREs%>% as.data.frame() %>% distinct(Peakid)
 uni_ESR_close <- ESR_close_cREs%>% as.data.frame() %>% distinct(Peakid)
 uni_ESR_opcl <- ESR_opcl_cREs%>% as.data.frame() %>% distinct(Peakid)
 uni_ESR_clop <- ESR_clop_cREs%>% as.data.frame() %>% distinct(Peakid)
 uni_LR_open <- LR_open_cREs%>% as.data.frame() %>% distinct(Peakid)
 uni_LR_close <- LR_close_cREs%>% as.data.frame() %>% distinct(Peakid)
 uni_NR <- NR_cREs%>% as.data.frame() %>% distinct(Peakid)


Whole_peaks <- join_overlap_intersect(final_peaks, cREs_HLV_46F)


Whole_peaks %>% 
  as.data.frame() %>% 
  left_join(.,mrc_lookup, by = c("Peakid","mrc"))%>% 
  group_by(blockCount, mrc) %>% 
  tally %>% 
  pivot_wider(., id_cols = mrc, names_from = blockCount, values_from = n) %>% 
  dplyr::select(mrc, PLS:'pELS,CTCF-bound') %>% 
  kable(., caption="Breakdown of peaks overlapping cREs") %>% 
  kable_paper("striped", full_width = TRUE) %>%
  kable_styling(full_width = FALSE, font_size = 14)

Breakdown of peaks overlapping cREs
mrc	PLS	PLS,CTCF-bound	dELS	dELS,CTCF-bound	pELS	pELS,CTCF-bound
EAR_close	13	5	264	12	64	6
EAR_open	1185	219	215	60	1205	193
ESR_close	202	31	1276	174	399	75
ESR_opcl	1	NA	16	1	7	1
ESR_open	120	25	197	39	255	47
LR_close	1251	315	2190	441	1443	312
LR_open	125	17	877	78	280	47
NR	10961	3057	6511	1685	12260	2852
NA	128	34	246	52	151	39
ESR_clop	NA	NA	17	NA	5	NA

keep_cRE_names <- c("CTCF-only,CTCF-bound" ,"PLS,CTCF-bound","PLS","dELS,CTCF-bound", "pELS","pELS,CTCF-bound","dELS")
is_cRE <- Whole_peaks %>% 
  as.data.frame() %>% 
  dplyr::filter(blockCount %in% keep_cRE_names) %>% 
  distinct(Peakid,blockCount) 

is_CTCF <- Whole_peaks %>% 
  as.data.frame() %>% 
  dplyr::filter(blockCount == "CTCF-only,CTCF-bound") %>% 
  distinct(Peakid,blockCount) 

is_dELS <- Whole_peaks %>% 
  as.data.frame() %>% 
  dplyr::filter(blockCount == "dELS,CTCF-bound"|blockCount == "dELS") %>% 
  distinct(Peakid,blockCount) 
is_pELS <- Whole_peaks %>% 
  as.data.frame() %>% 
  dplyr::filter(blockCount == "pELS,CTCF-bound"|blockCount == "pELS") %>% 
  distinct(Peakid,blockCount) 
is_PLS <- Whole_peaks %>% 
  as.data.frame() %>% 
  dplyr::filter(blockCount == "PLS,CTCF-bound"|blockCount == "PLS") %>% 
  distinct(Peakid,blockCount) 


CRE_summary <-final_peaks %>% 
  as.data.frame() %>% 
   mutate(cRE_status=if_else(Peakid %in% is_cRE$Peakid,"cRE_peak","not_cRE_peak")) %>% 
   mutate(CTCF_status=if_else(Peakid %in% is_CTCF$Peakid,"CTCF_peak","not_CTCF_peak")) %>% 
    mutate(dELS_status=if_else(Peakid %in% is_dELS$Peakid,"dELS_peak","not_dELS_peak")) %>% 
    mutate(pELS_status=if_else(Peakid %in% is_pELS$Peakid,"pELS_peak","not_pELS_peak")) %>% 
    mutate(PLS_status=if_else(Peakid %in% is_PLS$Peakid,"PLS_peak","not_PLS_peak")) %>% 
  dplyr::select(Peakid:PLS_status)

cRE_mat<- CRE_summary %>%
 dplyr::filter(mrc != "not_mrc") %>% 
  group_by(cRE_status, mrc) %>% 
  tally %>% 
 mutate(mrc=factor(mrc, levels = c("EAR_open","ESR_open","LR_open","ESR_opcl", "EAR_close","ESR_close","LR_close","ESR_clop", "NR"))) %>% 
    pivot_wider(id_cols = mrc, names_from = cRE_status,values_from = n) %>% 
  column_to_rownames("mrc") %>% 
  na.omit(.) %>% 
  as.matrix(.)
cRE_mat

          cRE_peak not_cRE_peak
EAR_close      322         2753
EAR_open      1518         3381
ESR_clop        18          696
ESR_close     1846         6088
ESR_opcl        24          179
ESR_open       489         5788
LR_close      4411        14199
LR_open       1206        24411
NR           22087        63067

CTCF_mat<- CRE_summary %>% 
 dplyr::filter(mrc != "not_mrc") %>% 
  group_by(CTCF_status, mrc) %>% 
  tally %>% 
  mutate(mrc=factor(mrc, levels = c("EAR_open","ESR_open","LR_open","ESR_opcl", "EAR_close","ESR_close","LR_close","ESR_clop", "NR"))) %>% 
  pivot_wider(id_cols = mrc, names_from = CTCF_status,values_from = n) %>% 
  column_to_rownames("mrc") %>% 
  na.omit(.) %>% 
  as.matrix(.)
CTCF_mat

          CTCF_peak not_CTCF_peak
EAR_close        53          3022
EAR_open        185          4714
ESR_close       289          7645
ESR_opcl          4           199
ESR_open         65          6212
LR_close        896         17714
LR_open         131         25486
NR             5280         79874

dELS_mat<- CRE_summary %>% 
 dplyr::filter(mrc != "not_mrc") %>% 
  group_by(dELS_status, mrc) %>% 
  tally %>% 
  mutate(mrc=factor(mrc, levels = c("EAR_open","ESR_open","LR_open","ESR_opcl", "EAR_close","ESR_close","LR_close","ESR_clop", "NR"))) %>% 
  pivot_wider(id_cols = mrc, names_from = dELS_status,values_from = n) %>% 
  column_to_rownames("mrc") %>% 
  na.omit(.) %>% 
  as.matrix(.)
dELS_mat

          dELS_peak not_dELS_peak
EAR_close       205          2870
EAR_open        208          4691
ESR_clop         14           700
ESR_close      1113          6821
ESR_opcl         14           189
ESR_open        186          6091
LR_close       1964         16646
LR_open         774         24843
NR             5686         79468

pELS_mat<- CRE_summary %>% 
 dplyr::filter(mrc != "not_mrc") %>% 
  group_by(pELS_status, mrc) %>% 
  tally %>% 
  mutate(mrc=factor(mrc, levels = c("EAR_open","ESR_open","LR_open","ESR_opcl", "EAR_close","ESR_close","LR_close","ESR_clop", "NR"))) %>% 
  pivot_wider(id_cols = mrc, names_from = pELS_status,values_from = n) %>% 
  column_to_rownames("mrc") %>% 
  na.omit(.) %>% 
  as.matrix(.)
pELS_mat

          not_pELS_peak pELS_peak
EAR_close          3020        55
EAR_open           4108       791
ESR_clop            710         4
ESR_close          7584       350
ESR_opcl            196         7
ESR_open           6090       187
LR_close          17499      1111
LR_open           25391       226
NR                77156      7998

PLS_mat<- CRE_summary %>% 
 dplyr::filter(mrc != "not_mrc") %>% 
  group_by(PLS_status, mrc) %>% 
  tally %>% 
  mutate(mrc=factor(mrc, levels = c("EAR_open","ESR_open","LR_open","ESR_opcl", "EAR_close","ESR_close","LR_close","ESR_clop", "NR"))) %>% 
  pivot_wider(id_cols = mrc, names_from = PLS_status,values_from = n) %>% 
  column_to_rownames("mrc") %>% 
  na.omit(.) %>% 
  as.matrix(.)
PLS_mat

          PLS_peak not_PLS_peak
EAR_close       15         3060
EAR_open       946         3953
ESR_close      194         7740
ESR_opcl         1          202
ESR_open       115         6162
LR_close      1174        17436
LR_open        124        25493
NR            9443        75711

odds ratio results

matrix_list <- list("All cREs"= cRE_mat,"PLS"=PLS_mat, "dELS"=dELS_mat, "pELS"=pELS_mat,"CTCF"= CTCF_mat)

results_or <- data.frame(Matrix_Name = character(),
                      Row_Compared = character(),
                      Odds_Ratio = numeric(),
                      Lower_CI = numeric(),
                      Upper_CI = numeric(),
                      P_Value = numeric(),
                      stringsAsFactors = FALSE)

# Loop through each matrix in the list
for (matrix_name in names(matrix_list)) {
  current_matrix <- matrix_list[[matrix_name]]
  n_rows <- nrow(current_matrix)
  
  # Loop through each row of the current matrix (except the last row)
  for (i in 1:(n_rows - 1)) {
    # Perform odds ratio test between row i and the last row using epitools
    test_result <- tryCatch({
      contingency_table <- rbind(current_matrix[i, ], current_matrix[n_rows, ])
      
      # Check if any row in the contingency table contains only zeros
      if (any(rowSums(contingency_table) == 0)) {
        stop("Contingency table contains empty rows.")
      }
      
      oddsratio_result <- oddsratio(contingency_table)
       # Ensure the oddsratio result has at least 2 rows
      if (nrow(oddsratio_result$measure) < 2) {
        stop("oddsratio result does not have enough data.")
      }
      
     list(oddsratio = oddsratio_result, p.value = oddsratio_result$p.value[2,"chi.square"])
      
    }, error = function(e) {
      cat("Error in odds ratio test for row", i, "in matrix", matrix_name, ":", e$message, "\n")
      return(NULL)
    })
    
    # Only store the result if test_result is valid (i.e., not NULL)
    if (!is.null(test_result)) {
      or_value <- test_result$oddsratio$measure[2, "estimate"]
      lower_ci <- test_result$oddsratio$measure[2, "lower"]
      upper_ci <- test_result$oddsratio$measure[2, "upper"]
      p_value <- test_result$oddsratio$p.value[2,"chi.square"]
      
      # Check if the values are numeric and valid (not NA)
      if (!is.na(or_value) && !is.na(lower_ci) && !is.na(upper_ci) && !is.na(p_value)) {
        # Store the results in the dataframe
        results_or <- rbind(results_or, data.frame(Matrix_Name = matrix_name,
                                             Row_Compared = rownames(current_matrix)[i],
                                             Odds_Ratio = or_value,
                                             Lower_CI = lower_ci,
                                             Upper_CI = upper_ci,
                                             P_Value = p_value))
      }
    }
  }
}

# Print the resulting dataframe
print(results_or) %>% 
  kable(., caption = "Odd ratio results and significance values of TSS and CpG enrichment compared to No response group") %>% 
  kable_paper("striped", full_width = TRUE) %>%
  kable_styling(full_width = FALSE, font_size = 14) %>% 
  scroll_box(width = "100%", height = "400px")

           Matrix_Name Row_Compared  Odds_Ratio     Lower_CI    Upper_CI
estimate      All cREs    EAR_close  0.33413089  0.296852431  0.37476753
estimate1     All cREs     EAR_open  1.28208081  1.204176799  1.36444549
estimate2     All cREs     ESR_clop  0.07449141  0.044881181  0.11526243
estimate3     All cREs    ESR_close  0.86583822  0.819919213  0.91394794
estimate4     All cREs     ESR_opcl  0.38513592  0.244965612  0.57776218
estimate5     All cREs     ESR_open  0.24129445  0.219499033  0.26468097
estimate6     All cREs     LR_close  0.88706240  0.854672238  0.92051932
estimate7     All cREs      LR_open  0.14109037  0.132835268  0.14971335
estimate8          PLS    EAR_close  0.03975658  0.022797655  0.06349348
estimate9          PLS     EAR_open  1.91893859  1.781028384  2.06546507
estimate10         PLS    ESR_close  0.20112102  0.173599679  0.23158497
estimate11         PLS     ESR_opcl  0.04547902  0.001960244  0.19763155
estimate12         PLS     ESR_open  0.14985132  0.123743139  0.17948544
estimate13         PLS     LR_close  0.53990850  0.506787691  0.57468362
estimate14         PLS      LR_open  0.03905158  0.032501508  0.04641806
estimate15        dELS    EAR_close  0.99900769  0.862333112  1.15089561
estimate16        dELS     EAR_open  0.62013489  0.536764284  0.71242916
estimate17        dELS     ESR_clop  0.28276093  0.158262160  0.46103441
estimate18        dELS    ESR_close  2.28074889  2.128141997  2.44231013
estimate19        dELS     ESR_opcl  1.04649971  0.578818829  1.73536017
estimate20        dELS     ESR_open  0.42714993  0.367018246  0.49390476
estimate21        dELS     LR_close  1.64905922  1.562134953  1.74023431
estimate22        dELS      LR_open  0.43551277  0.403167198  0.46979014
estimate23        pELS    EAR_close  5.67510713  4.389460507  7.50639156
estimate24        pELS     EAR_open  0.53827209  0.497411548  0.58317755
estimate25        pELS     ESR_clop 17.69111986  7.596472261 58.02448266
estimate26        pELS    ESR_close  2.24519918  2.015399481  2.50961461
estimate27        pELS     ESR_opcl  2.83866273  1.442613956  6.71725577
estimate28        pELS     ESR_open  3.37301551  2.920119420  3.92119045
estimate29        pELS     LR_close  1.63251469  1.530689086  1.74282701
estimate30        pELS      LR_open 11.63779496 10.215900039 13.32964519
estimate31        CTCF    EAR_close  0.26612181  0.200005260  0.34577686
estimate32        CTCF     EAR_open  0.59417924  0.509991703  0.68785377
estimate33        CTCF    ESR_close  0.57214560  0.506046669  0.64430623
estimate34        CTCF     ESR_opcl  0.31605227  0.095743813  0.74376750
estimate35        CTCF     ESR_open  0.15868945  0.122843865  0.20107860
estimate36        CTCF     LR_close  0.76528751  0.711262119  0.82250029
estimate37        CTCF      LR_open  0.07785612  0.065111148  0.09222061
                 P_Value
estimate    1.802343e-83
estimate1   5.599162e-15
estimate2   4.381627e-46
estimate3   1.927568e-07
estimate4   4.541572e-06
estimate5  3.683956e-227
estimate6   2.371758e-10
estimate7   0.000000e+00
estimate8   8.886292e-78
estimate9   1.110003e-68
estimate10 4.276917e-129
estimate11  1.529496e-06
estimate12 2.101411e-118
estimate13  1.211921e-84
estimate14  0.000000e+00
estimate15  9.814613e-01
estimate16  2.206703e-11
estimate17  4.621167e-07
estimate18 5.432381e-128
estimate19  9.005404e-01
estimate20  4.973085e-31
estimate21  4.719946e-75
estimate22 3.078418e-106
estimate23  6.585935e-47
estimate24  4.096888e-54
estimate25  6.233974e-16
estimate26  6.780993e-50
estimate27  3.712335e-03
estimate28  4.040218e-66
estimate29  1.646500e-50
estimate30  0.000000e+00
estimate31  1.391090e-24
estimate32  4.816608e-12
estimate33  3.977173e-20
estimate34  1.248910e-02
estimate35  1.403504e-63
estimate36  4.517144e-13
estimate37 2.775286e-300

Odd ratio results and significance values of TSS and CpG enrichment compared to No response group
	Matrix_Name	Row_Compared	Odds_Ratio	Lower_CI	Upper_CI	P_Value
estimate	All cREs	EAR_close	0.3341309	0.2968524	0.3747675	0.0000000
estimate1	All cREs	EAR_open	1.2820808	1.2041768	1.3644455	0.0000000
estimate2	All cREs	ESR_clop	0.0744914	0.0448812	0.1152624	0.0000000
estimate3	All cREs	ESR_close	0.8658382	0.8199192	0.9139479	0.0000002
estimate4	All cREs	ESR_opcl	0.3851359	0.2449656	0.5777622	0.0000045
estimate5	All cREs	ESR_open	0.2412944	0.2194990	0.2646810	0.0000000
estimate6	All cREs	LR_close	0.8870624	0.8546722	0.9205193	0.0000000
estimate7	All cREs	LR_open	0.1410904	0.1328353	0.1497134	0.0000000
estimate8	PLS	EAR_close	0.0397566	0.0227977	0.0634935	0.0000000
estimate9	PLS	EAR_open	1.9189386	1.7810284	2.0654651	0.0000000
estimate10	PLS	ESR_close	0.2011210	0.1735997	0.2315850	0.0000000
estimate11	PLS	ESR_opcl	0.0454790	0.0019602	0.1976316	0.0000015
estimate12	PLS	ESR_open	0.1498513	0.1237431	0.1794854	0.0000000
estimate13	PLS	LR_close	0.5399085	0.5067877	0.5746836	0.0000000
estimate14	PLS	LR_open	0.0390516	0.0325015	0.0464181	0.0000000
estimate15	dELS	EAR_close	0.9990077	0.8623331	1.1508956	0.9814613
estimate16	dELS	EAR_open	0.6201349	0.5367643	0.7124292	0.0000000
estimate17	dELS	ESR_clop	0.2827609	0.1582622	0.4610344	0.0000005
estimate18	dELS	ESR_close	2.2807489	2.1281420	2.4423101	0.0000000
estimate19	dELS	ESR_opcl	1.0464997	0.5788188	1.7353602	0.9005404
estimate20	dELS	ESR_open	0.4271499	0.3670182	0.4939048	0.0000000
estimate21	dELS	LR_close	1.6490592	1.5621350	1.7402343	0.0000000
estimate22	dELS	LR_open	0.4355128	0.4031672	0.4697901	0.0000000
estimate23	pELS	EAR_close	5.6751071	4.3894605	7.5063916	0.0000000
estimate24	pELS	EAR_open	0.5382721	0.4974115	0.5831775	0.0000000
estimate25	pELS	ESR_clop	17.6911199	7.5964723	58.0244827	0.0000000
estimate26	pELS	ESR_close	2.2451992	2.0153995	2.5096146	0.0000000
estimate27	pELS	ESR_opcl	2.8386627	1.4426140	6.7172558	0.0037123
estimate28	pELS	ESR_open	3.3730155	2.9201194	3.9211904	0.0000000
estimate29	pELS	LR_close	1.6325147	1.5306891	1.7428270	0.0000000
estimate30	pELS	LR_open	11.6377950	10.2159000	13.3296452	0.0000000
estimate31	CTCF	EAR_close	0.2661218	0.2000053	0.3457769	0.0000000
estimate32	CTCF	EAR_open	0.5941792	0.5099917	0.6878538	0.0000000
estimate33	CTCF	ESR_close	0.5721456	0.5060467	0.6443062	0.0000000
estimate34	CTCF	ESR_opcl	0.3160523	0.0957438	0.7437675	0.0124891
estimate35	CTCF	ESR_open	0.1586895	0.1228439	0.2010786	0.0000000
estimate36	CTCF	LR_close	0.7652875	0.7112621	0.8225003	0.0000000
estimate37	CTCF	LR_open	0.0778561	0.0651111	0.0922206	0.0000000

col_fun_OR = colorRamp2(c(0,1,1.5,5), c("blueviolet","white","lightgreen","green3" ))
sig_mat_OR <- results_or %>% 
  as.data.frame() %>% 
  dplyr::select( Matrix_Name,Row_Compared,P_Value) %>%
  group_by(Row_Compared) %>%
  mutate(rank_val=rank(P_Value, ties.method = "first")) %>%
  mutate(BH_correction= p.adjust(P_Value,method= "BH")) %>% 
  pivot_wider(., id_cols = Matrix_Name, names_from = Row_Compared, values_from = BH_correction) %>% 
  dplyr::select(Matrix_Name,EAR_open,ESR_open,LR_open,ESR_opcl,EAR_close,ESR_close,LR_close,ESR_clop) %>%
  column_to_rownames("Matrix_Name") %>% 
  as.matrix() 

# saveRDS(results_or,"data/Final_four_data/re_analysis/OR_results_cREs_df_1bp.RDS")

results_or %>% 
  as.data.frame() %>% 
  dplyr::select( Matrix_Name,Row_Compared,Odds_Ratio) %>% 
  pivot_wider(., id_cols = Matrix_Name, names_from = Row_Compared, values_from = Odds_Ratio) %>% 
  dplyr::select(Matrix_Name,EAR_open,ESR_open,LR_open,ESR_opcl,EAR_close,ESR_close,LR_close,ESR_clop) %>%
  column_to_rownames("Matrix_Name") %>% 
  as.matrix() %>% 
  ComplexHeatmap::Heatmap(. ,col = col_fun_OR, 
                          cluster_rows=FALSE, 
                          cluster_columns=FALSE, 
                          column_names_side = "top", 
                          column_names_rot = 45,
                          # na_col = "black",
                          cell_fun = function(j, i, x, y, width, height, fill) {if (!is.na(sig_mat_OR[i, j]) && sig_mat_OR[i, j] < 0.05  && .[i, j] > 1) {
            grid.text("*", x, y, gp = gpar(fontsize = 20))}})

Version	Author	Date
5e6e462	reneeisnowhere	2025-05-07

note, this is corrected for multiple testing across categories within each motif cluster.

sessionInfo()

R version 4.4.2 (2024-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64
Running under: Windows 11 x64 (build 26100)

Matrix products: default


locale:
[1] LC_COLLATE=English_United States.utf8 
[2] LC_CTYPE=English_United States.utf8   
[3] LC_MONETARY=English_United States.utf8
[4] LC_NUMERIC=C                          
[5] LC_TIME=English_United States.utf8    

time zone: America/Chicago
tzcode source: internal

attached base packages:
[1] grid      stats4    stats     graphics  grDevices utils     datasets 
[8] methods   base     

other attached packages:
 [1] circlize_0.4.16                         
 [2] epitools_0.5-10.1                       
 [3] ggrepel_0.9.6                           
 [4] plyranges_1.26.0                        
 [5] ggsignif_0.6.4                          
 [6] genomation_1.38.0                       
 [7] smplot2_0.2.5                           
 [8] eulerr_7.0.2                            
 [9] biomaRt_2.62.1                          
[10] devtools_2.4.5                          
[11] usethis_3.1.0                           
[12] ggpubr_0.6.0                            
[13] BiocParallel_1.40.0                     
[14] scales_1.3.0                            
[15] VennDiagram_1.7.3                       
[16] futile.logger_1.4.3                     
[17] gridExtra_2.3                           
[18] ggfortify_0.4.17                        
[19] edgeR_4.4.2                             
[20] limma_3.62.2                            
[21] rtracklayer_1.66.0                      
[22] org.Hs.eg.db_3.20.0                     
[23] TxDb.Hsapiens.UCSC.hg38.knownGene_3.20.0
[24] GenomicFeatures_1.58.0                  
[25] AnnotationDbi_1.68.0                    
[26] Biobase_2.66.0                          
[27] GenomicRanges_1.58.0                    
[28] GenomeInfoDb_1.42.3                     
[29] IRanges_2.40.1                          
[30] S4Vectors_0.44.0                        
[31] BiocGenerics_0.52.0                     
[32] ChIPseeker_1.42.1                       
[33] RColorBrewer_1.1-3                      
[34] broom_1.0.7                             
[35] kableExtra_1.4.0                        
[36] lubridate_1.9.4                         
[37] forcats_1.0.0                           
[38] stringr_1.5.1                           
[39] dplyr_1.1.4                             
[40] purrr_1.0.4                             
[41] readr_2.1.5                             
[42] tidyr_1.3.1                             
[43] tibble_3.2.1                            
[44] ggplot2_3.5.1                           
[45] tidyverse_2.0.0                         
[46] workflowr_1.7.1                         

loaded via a namespace (and not attached):
  [1] fs_1.6.5                               
  [2] matrixStats_1.5.0                      
  [3] bitops_1.0-9                           
  [4] enrichplot_1.26.6                      
  [5] doParallel_1.0.17                      
  [6] httr_1.4.7                             
  [7] profvis_0.4.0                          
  [8] tools_4.4.2                            
  [9] backports_1.5.0                        
 [10] R6_2.6.1                               
 [11] lazyeval_0.2.2                         
 [12] GetoptLong_1.0.5                       
 [13] urlchecker_1.0.1                       
 [14] withr_3.0.2                            
 [15] prettyunits_1.2.0                      
 [16] cli_3.6.4                              
 [17] formatR_1.14                           
 [18] Cairo_1.6-2                            
 [19] sass_0.4.9                             
 [20] Rsamtools_2.22.0                       
 [21] systemfonts_1.2.1                      
 [22] yulab.utils_0.2.0                      
 [23] foreign_0.8-88                         
 [24] DOSE_4.0.0                             
 [25] svglite_2.1.3                          
 [26] R.utils_2.13.0                         
 [27] sessioninfo_1.2.3                      
 [28] plotrix_3.8-4                          
 [29] BSgenome_1.74.0                        
 [30] pwr_1.3-0                              
 [31] impute_1.80.0                          
 [32] rstudioapi_0.17.1                      
 [33] RSQLite_2.3.9                          
 [34] shape_1.4.6.1                          
 [35] generics_0.1.3                         
 [36] gridGraphics_0.5-1                     
 [37] TxDb.Hsapiens.UCSC.hg19.knownGene_3.2.2
 [38] BiocIO_1.16.0                          
 [39] vroom_1.6.5                            
 [40] gtools_3.9.5                           
 [41] car_3.1-3                              
 [42] GO.db_3.20.0                           
 [43] Matrix_1.7-3                           
 [44] abind_1.4-8                            
 [45] R.methodsS3_1.8.2                      
 [46] lifecycle_1.0.4                        
 [47] whisker_0.4.1                          
 [48] yaml_2.3.10                            
 [49] carData_3.0-5                          
 [50] SummarizedExperiment_1.36.0            
 [51] gplots_3.2.0                           
 [52] qvalue_2.38.0                          
 [53] SparseArray_1.6.2                      
 [54] BiocFileCache_2.14.0                   
 [55] blob_1.2.4                             
 [56] promises_1.3.2                         
 [57] crayon_1.5.3                           
 [58] miniUI_0.1.1.1                         
 [59] ggtangle_0.0.6                         
 [60] lattice_0.22-6                         
 [61] cowplot_1.1.3                          
 [62] KEGGREST_1.46.0                        
 [63] magick_2.8.5                           
 [64] ComplexHeatmap_2.22.0                  
 [65] pillar_1.10.1                          
 [66] knitr_1.49                             
 [67] fgsea_1.32.2                           
 [68] rjson_0.2.23                           
 [69] boot_1.3-31                            
 [70] codetools_0.2-20                       
 [71] fastmatch_1.1-6                        
 [72] glue_1.8.0                             
 [73] getPass_0.2-4                          
 [74] ggfun_0.1.8                            
 [75] data.table_1.17.0                      
 [76] remotes_2.5.0                          
 [77] vctrs_0.6.5                            
 [78] png_0.1-8                              
 [79] treeio_1.30.0                          
 [80] gtable_0.3.6                           
 [81] cachem_1.1.0                           
 [82] xfun_0.51                              
 [83] S4Arrays_1.6.0                         
 [84] mime_0.12                              
 [85] iterators_1.0.14                       
 [86] statmod_1.5.0                          
 [87] ellipsis_0.3.2                         
 [88] nlme_3.1-167                           
 [89] ggtree_3.14.0                          
 [90] bit64_4.6.0-1                          
 [91] filelock_1.0.3                         
 [92] progress_1.2.3                         
 [93] rprojroot_2.0.4                        
 [94] bslib_0.9.0                            
 [95] rpart_4.1.24                           
 [96] KernSmooth_2.23-26                     
 [97] Hmisc_5.2-2                            
 [98] colorspace_2.1-1                       
 [99] DBI_1.2.3                              
[100] seqPattern_1.38.0                      
[101] nnet_7.3-20                            
[102] tidyselect_1.2.1                       
[103] processx_3.8.6                         
[104] bit_4.6.0                              
[105] compiler_4.4.2                         
[106] curl_6.2.1                             
[107] git2r_0.35.0                           
[108] httr2_1.1.1                            
[109] htmlTable_2.4.3                        
[110] xml2_1.3.7                             
[111] DelayedArray_0.32.0                    
[112] checkmate_2.3.2                        
[113] caTools_1.18.3                         
[114] callr_3.7.6                            
[115] rappdirs_0.3.3                         
[116] digest_0.6.37                          
[117] rmarkdown_2.29                         
[118] XVector_0.46.0                         
[119] base64enc_0.1-3                        
[120] htmltools_0.5.8.1                      
[121] pkgconfig_2.0.3                        
[122] MatrixGenerics_1.18.1                  
[123] dbplyr_2.5.0                           
[124] fastmap_1.2.0                          
[125] GlobalOptions_0.1.2                    
[126] rlang_1.1.5                            
[127] htmlwidgets_1.6.4                      
[128] UCSC.utils_1.2.0                       
[129] shiny_1.10.0                           
[130] farver_2.1.2                           
[131] jquerylib_0.1.4                        
[132] zoo_1.8-13                             
[133] jsonlite_1.9.1                         
[134] GOSemSim_2.32.0                        
[135] R.oo_1.27.0                            
[136] RCurl_1.98-1.16                        
[137] magrittr_2.0.3                         
[138] Formula_1.2-5                          
[139] GenomeInfoDbData_1.2.13                
[140] ggplotify_0.1.2                        
[141] patchwork_1.3.0                        
[142] munsell_0.5.1                          
[143] Rcpp_1.0.14                            
[144] ape_5.8-1                              
[145] stringi_1.8.4                          
[146] zlibbioc_1.52.0                        
[147] plyr_1.8.9                             
[148] pkgbuild_1.4.6                         
[149] parallel_4.4.2                         
[150] Biostrings_2.74.1                      
[151] splines_4.4.2                          
[152] hms_1.1.3                              
[153] locfit_1.5-9.12                        
[154] ps_1.9.0                               
[155] igraph_2.1.4                           
[156] reshape2_1.4.4                         
[157] pkgload_1.4.0                          
[158] futile.options_1.0.1                   
[159] XML_3.99-0.18                          
[160] evaluate_1.0.3                         
[161] lambda.r_1.2.4                         
[162] foreach_1.5.2                          
[163] tzdb_0.4.0                             
[164] httpuv_1.6.15                          
[165] clue_0.3-66                            
[166] gridBase_0.4-7                         
[167] xtable_1.8-4                           
[168] restfulr_0.0.15                        
[169] tidytree_0.4.6                         
[170] rstatix_0.7.2                          
[171] later_1.4.1                            
[172] viridisLite_0.4.2                      
[173] aplot_0.2.5                            
[174] memoise_2.0.1                          
[175] GenomicAlignments_1.42.0               
[176] cluster_2.1.8.1                        
[177] timechange_0.3.0

Enhancer_enrichment

Renee Matthews

2025-05-07

Enhancers and Response clusters

odds ratio results