Last updated: 2023-09-28

Checks: 7 0

Knit directory: Cardiotoxicity/

This reproducible R Markdown analysis was created with workflowr (version 1.7.1). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.

R Markdown file: up-to-date

Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.

Environment: empty

Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproduciblity it’s best to always run the code in an empty environment.

Seed: set.seed(20230109)

The command set.seed(20230109) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.

Session information: recorded

Great job! Recording the operating system, R version, and package versions is critical for reproducibility.

Cache: none

Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.

File paths: relative

Great job! Using relative paths to the files within your workflowr project makes it easier to run your code on other machines.

Repository version: 87a496b

Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility.

The results in this page were generated with repository version 87a496b. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.

Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:


Ignored files:
    Ignored:    .RData
    Ignored:    .Rhistory
    Ignored:    .Rproj.user/
    Ignored:    data/41588_2018_171_MOESM3_ESMeQTL_ST2_for paper.csv
    Ignored:    data/Arr_GWAS.txt
    Ignored:    data/Arr_geneset.RDS
    Ignored:    data/BC_cell_lines.csv
    Ignored:    data/BurridgeDOXTOX.RDS
    Ignored:    data/CADGWASgene_table.csv
    Ignored:    data/CAD_geneset.RDS
    Ignored:    data/CALIMA_Data/
    Ignored:    data/Clamp_Summary.csv
    Ignored:    data/Cormotif_24_k1-5_raw.RDS
    Ignored:    data/Counts_RNA_ERMatthews.txt
    Ignored:    data/DAgostres24.RDS
    Ignored:    data/DAtable1.csv
    Ignored:    data/DDEMresp_list.csv
    Ignored:    data/DDE_reQTL.txt
    Ignored:    data/DDEresp_list.csv
    Ignored:    data/DEG-GO/
    Ignored:    data/DEG_cormotif.RDS
    Ignored:    data/DF_Plate_Peak.csv
    Ignored:    data/DRC48hoursdata.csv
    Ignored:    data/Da24counts.txt
    Ignored:    data/Dx24counts.txt
    Ignored:    data/Dx_reQTL_specific.txt
    Ignored:    data/EPIstorelist24.RDS
    Ignored:    data/Ep24counts.txt
    Ignored:    data/Full_LD_rep.csv
    Ignored:    data/GOIsig.csv
    Ignored:    data/GOplots.R
    Ignored:    data/GTEX_setsimple.csv
    Ignored:    data/GTEX_sig24.RDS
    Ignored:    data/GTEx_gene_list.csv
    Ignored:    data/HFGWASgene_table.csv
    Ignored:    data/HF_geneset.RDS
    Ignored:    data/Heart_Left_Ventricle.v8.egenes.txt
    Ignored:    data/Heatmap_mat.RDS
    Ignored:    data/Heatmap_sig.RDS
    Ignored:    data/Hf_GWAS.txt
    Ignored:    data/K_cluster
    Ignored:    data/K_cluster_kisthree.csv
    Ignored:    data/K_cluster_kistwo.csv
    Ignored:    data/LD50_05via.csv
    Ignored:    data/LDH48hoursdata.csv
    Ignored:    data/Mt24counts.txt
    Ignored:    data/NoRespDEG_final.csv
    Ignored:    data/RINsamplelist.txt
    Ignored:    data/Seonane2019supp1.txt
    Ignored:    data/TMMnormed_x.RDS
    Ignored:    data/TOP2Bi-24hoursGO_analysis.csv
    Ignored:    data/TR24counts.txt
    Ignored:    data/TableS10.csv
    Ignored:    data/TableS11.csv
    Ignored:    data/TableS9.csv
    Ignored:    data/Top2biresp_cluster24h.csv
    Ignored:    data/Var_test_list.RDS
    Ignored:    data/Var_test_list24.RDS
    Ignored:    data/Var_test_list24alt.RDS
    Ignored:    data/Var_test_list3.RDS
    Ignored:    data/Vargenes.RDS
    Ignored:    data/Viabilitylistfull.csv
    Ignored:    data/allexpressedgenes.txt
    Ignored:    data/allfinal3hour.RDS
    Ignored:    data/allgenes.txt
    Ignored:    data/allmatrix.RDS
    Ignored:    data/allmymatrix.RDS
    Ignored:    data/annotation_data_frame.RDS
    Ignored:    data/averageviabilitytable.RDS
    Ignored:    data/avgLD50.RDS
    Ignored:    data/avg_LD50.RDS
    Ignored:    data/backGL.txt
    Ignored:    data/burr_genes.RDS
    Ignored:    data/calcium_data.RDS
    Ignored:    data/clamp_summary.RDS
    Ignored:    data/cormotif_3hk1-8.RDS
    Ignored:    data/cormotif_initalK5.RDS
    Ignored:    data/cormotif_initialK5.RDS
    Ignored:    data/cormotif_initialall.RDS
    Ignored:    data/cormotifprobs.csv
    Ignored:    data/counts24hours.RDS
    Ignored:    data/cpmcount.RDS
    Ignored:    data/cpmnorm_counts.csv
    Ignored:    data/crispr_genes.csv
    Ignored:    data/ctnnt_results.txt
    Ignored:    data/cvd_GWAS.txt
    Ignored:    data/dat_cpm.RDS
    Ignored:    data/data_outline.txt
    Ignored:    data/drug_noveh1.csv
    Ignored:    data/efit2.RDS
    Ignored:    data/efit2_final.RDS
    Ignored:    data/efit2results.RDS
    Ignored:    data/ensembl_backup.RDS
    Ignored:    data/ensgtotal.txt
    Ignored:    data/filcpm_counts.RDS
    Ignored:    data/filenameonly.txt
    Ignored:    data/filtered_cpm_counts.csv
    Ignored:    data/filtered_raw_counts.csv
    Ignored:    data/filtermatrix_x.RDS
    Ignored:    data/folder_05top/
    Ignored:    data/geneDoxonlyQTL.csv
    Ignored:    data/gene_corr_df.RDS
    Ignored:    data/gene_corr_frame.RDS
    Ignored:    data/gene_prob_tran3h.RDS
    Ignored:    data/gene_probabilityk5.RDS
    Ignored:    data/geneset_24.RDS
    Ignored:    data/gostresTop2bi_ER.RDS
    Ignored:    data/gostresTop2bi_LR
    Ignored:    data/gostresTop2bi_LR.RDS
    Ignored:    data/gostresTop2bi_TI.RDS
    Ignored:    data/gostrescoNR
    Ignored:    data/gtex/
    Ignored:    data/heartgenes.csv
    Ignored:    data/hsa_kegg_anno.RDS
    Ignored:    data/individualDRCfile.RDS
    Ignored:    data/individual_DRC48.RDS
    Ignored:    data/individual_LDH48.RDS
    Ignored:    data/indv_noveh1.csv
    Ignored:    data/kegglistDEG.RDS
    Ignored:    data/kegglistDEG24.RDS
    Ignored:    data/kegglistDEG3.RDS
    Ignored:    data/knowfig4.csv
    Ignored:    data/knowfig5.csv
    Ignored:    data/label_list.RDS
    Ignored:    data/ld50_table.csv
    Ignored:    data/mean_vardrug1.csv
    Ignored:    data/mean_varframe.csv
    Ignored:    data/mymatrix.RDS
    Ignored:    data/new_ld50avg.RDS
    Ignored:    data/nonresponse_cluster24h.csv
    Ignored:    data/norm_LDH.csv
    Ignored:    data/norm_counts.csv
    Ignored:    data/old_sets/
    Ignored:    data/organized_drugframe.csv
    Ignored:    data/plan2plot.png
    Ignored:    data/plot_intv_list.RDS
    Ignored:    data/plot_list_DRC.RDS
    Ignored:    data/qval24hr.RDS
    Ignored:    data/qval3hr.RDS
    Ignored:    data/qvalueEPItemp.RDS
    Ignored:    data/raw_counts.csv
    Ignored:    data/response_cluster24h.csv
    Ignored:    data/sigVDA24.txt
    Ignored:    data/sigVDA3.txt
    Ignored:    data/sigVDX24.txt
    Ignored:    data/sigVDX3.txt
    Ignored:    data/sigVEP24.txt
    Ignored:    data/sigVEP3.txt
    Ignored:    data/sigVMT24.txt
    Ignored:    data/sigVMT3.txt
    Ignored:    data/sigVTR24.txt
    Ignored:    data/sigVTR3.txt
    Ignored:    data/siglist.RDS
    Ignored:    data/siglist_final.RDS
    Ignored:    data/siglist_old.RDS
    Ignored:    data/slope_table.csv
    Ignored:    data/supp10_24hlist.RDS
    Ignored:    data/supp10_3hlist.RDS
    Ignored:    data/supp_normLDH48.RDS
    Ignored:    data/supp_pca_all_anno.RDS
    Ignored:    data/table3a.omar
    Ignored:    data/testlist.txt
    Ignored:    data/toplistall.RDS
    Ignored:    data/trtonly_24h_genes.RDS
    Ignored:    data/trtonly_3h_genes.RDS
    Ignored:    data/tvl24hour.txt
    Ignored:    data/tvl24hourw.txt
    Ignored:    data/venn_code.R
    Ignored:    data/viability.RDS

Untracked files:
    Untracked:  .RDataTmp
    Untracked:  .RDataTmp1
    Untracked:  .RDataTmp2
    Untracked:  .RDataTmp3
    Untracked:  3hr all.pdf
    Untracked:  Code_files_list.csv
    Untracked:  Data_files_list.csv
    Untracked:  Doxorubicin_vehicle_3_24.csv
    Untracked:  Doxtoplist.csv
    Untracked:  EPIqvalue_analysis.Rmd
    Untracked:  GWAS_list_of_interest.xlsx
    Untracked:  KEGGpathwaylist.R
    Untracked:  OmicNavigator_learn.R
    Untracked:  SigDoxtoplist.csv
    Untracked:  analysis/ciFIT.R
    Untracked:  analysis/export_to_excel.R
    Untracked:  cleanupfiles_script.R
    Untracked:  code/biomart_gene_names.R
    Untracked:  code/constantcode.R
    Untracked:  code/corMotifcustom.R
    Untracked:  code/cpm_boxplot.R
    Untracked:  code/extracting_ggplot_data.R
    Untracked:  code/movingfilesto_ppl.R
    Untracked:  code/pearson_extract_func.R
    Untracked:  code/pearson_tox_extract.R
    Untracked:  code/plot1C.fun.R
    Untracked:  code/spearman_extract_func.R
    Untracked:  code/venndiagramcolor_control.R
    Untracked:  cormotif_p.post.list_4.csv
    Untracked:  figS1024h.pdf
    Untracked:  individual-legenddark2.png
    Untracked:  installed_old.rda
    Untracked:  motif_ER.txt
    Untracked:  motif_LR.txt
    Untracked:  motif_NR.txt
    Untracked:  motif_TI.txt
    Untracked:  output/DNR_DEGlist.csv
    Untracked:  output/DNRvenn.RDS
    Untracked:  output/DOX_DEGlist.csv
    Untracked:  output/DOXvenn.RDS
    Untracked:  output/EPI_DEGlist.csv
    Untracked:  output/EPIvenn.RDS
    Untracked:  output/Figures/
    Untracked:  output/MTX_DEGlist.csv
    Untracked:  output/MTXvenn.RDS
    Untracked:  output/TRZ_DEGlist.csv
    Untracked:  output/TableS8.csv
    Untracked:  output/Volcanoplot_10
    Untracked:  output/Volcanoplot_10.RDS
    Untracked:  output/allfinal_sup10.RDS
    Untracked:  output/endocytosisgenes.csv
    Untracked:  output/gene_corr_fig9.RDS
    Untracked:  output/legend_b.RDS
    Untracked:  output/motif_ERrep.RDS
    Untracked:  output/motif_LRrep.RDS
    Untracked:  output/motif_NRrep.RDS
    Untracked:  output/motif_TI_rep.RDS
    Untracked:  output/output-old/
    Untracked:  output/rank24genes.csv
    Untracked:  output/rank3genes.csv
    Untracked:  output/reneem@ls6.tacc.utexas.edu
    Untracked:  output/sequencinginformationforsupp.csv
    Untracked:  output/sequencinginformationforsupp.prn
    Untracked:  output/sigVDA24.txt
    Untracked:  output/sigVDA3.txt
    Untracked:  output/sigVDX24.txt
    Untracked:  output/sigVDX3.txt
    Untracked:  output/sigVEP24.txt
    Untracked:  output/sigVEP3.txt
    Untracked:  output/sigVMT24.txt
    Untracked:  output/sigVMT3.txt
    Untracked:  output/sigVTR24.txt
    Untracked:  output/sigVTR3.txt
    Untracked:  output/supplementary_motif_list_GO.RDS
    Untracked:  output/toptablebydrug.RDS
    Untracked:  output/tvl24hour.txt
    Untracked:  output/x_counts.RDS
    Untracked:  reneebasecode.R

Unstaged changes:
    Modified:   analysis/GOI_plots.Rmd
    Modified:   analysis/LDH_analysis.Rmd
    Modified:   analysis/Supplementary_figures.Rmd
    Modified:   analysis/index.Rmd
    Modified:   output/daplot.RDS
    Modified:   output/dxplot.RDS
    Modified:   output/epplot.RDS
    Modified:   output/mtplot.RDS
    Modified:   output/plan2plot.png
    Modified:   output/trplot.RDS
    Modified:   output/veplot.RDS

Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.

These are the previous versions of the repository in which changes were made to the R Markdown (analysis/run_all_analysis.Rmd) and HTML (docs/run_all_analysis.html) files. If you’ve configured a remote Git repository (see ?wflow_git_remote), click on the hyperlinks in the table below to view the files as they were in that past version.

File	Version	Author	Date	Message
Rmd	87a496b	reneeisnowhere	2023-09-28	updates to figure links and CSS script
html	eb35ace	reneeisnowhere	2023-09-26	Build site.
Rmd	ccc5b66	reneeisnowhere	2023-09-26	updated ref
html	38769b9	reneeisnowhere	2023-09-25	Build site.
html	b936755	reneeisnowhere	2023-09-25	Build site.
Rmd	5be5b86	reneeisnowhere	2023-09-25	added tracked files and css
html	aeeade4	reneeisnowhere	2023-09-25	Build site.
Rmd	e93b23c	reneeisnowhere	2023-09-25	updated code, try to add css
Rmd	27e6962	reneeisnowhere	2023-09-25	updated code, took out unneeded comments, turned on
Rmd	06800c9	reneeisnowhere	2023-07-26	Commits to small changes and edits
Rmd	4eac3d2	reneeisnowhere	2023-07-03	rearraged some lines, no changes
Rmd	f4bd5e1	reneeisnowhere	2023-06-27	checking code changes overtime
Rmd	326973b	reneeisnowhere	2023-06-26	updates on Volcano plot grid
Rmd	3257e1c	reneeisnowhere	2023-06-26	Updating savefile for efit2
html	8daa38a	reneeisnowhere	2023-05-26	Build site.
Rmd	f99a753	reneeisnowhere	2023-05-26	volcano plot limits added
Rmd	8d87a52	reneeisnowhere	2023-05-17	end of day
html	bdbf1c2	reneeisnowhere	2023-04-21	Build site.
Rmd	5a99763	reneeisnowhere	2023-04-21	correct graph to log 2!
html	1f4237c	reneeisnowhere	2023-04-20	Build site.
Rmd	13e7b3d	reneeisnowhere	2023-04-20	updatedsequences graph with bar at 20,000,000
html	5ef62c9	reneeisnowhere	2023-04-17	Build site.
Rmd	8a5a1e1	reneeisnowhere	2023-04-17	update html page
html	7f62d67	reneeisnowhere	2023-04-17	Build site.
Rmd	363ddad	reneeisnowhere	2023-04-17	update with GO links
html	21fd945	reneeisnowhere	2023-04-17	Build site.
Rmd	e1c63d9	reneeisnowhere	2023-04-17	wflow_publish("analysis/run_all_analysis.Rmd")
html	8221ec3	reneeisnowhere	2023-04-16	Build site.
Rmd	9e88c22	reneeisnowhere	2023-04-16	updated run data
Rmd	6d925a2	reneeisnowhere	2023-04-16	updating cormotif with updated RNAseq counts
html	8d08bd2	reneeisnowhere	2023-04-11	Build site.
Rmd	0aaa63d	reneeisnowhere	2023-04-11	cormotif analysis update
Rmd	575fd81	reneeisnowhere	2023-04-11	updating cormotif
html	4cd8ac4	reneeisnowhere	2023-04-11	Build site.
html	08936e7	reneeisnowhere	2023-04-10	Build site.
Rmd	fa2cbeb	reneeisnowhere	2023-04-10	monday end
html	85526c5	reneeisnowhere	2023-04-10	Build site.
Rmd	1444a85	reneeisnowhere	2023-04-10	update push of new data
html	b266b76	reneeisnowhere	2023-04-10	Build site.
Rmd	d3f8cf7	reneeisnowhere	2023-04-10	update push of new data
html	f0a75e1	reneeisnowhere	2023-04-10	Build site.
Rmd	8ca4c7e	reneeisnowhere	2023-04-10	first rmd commit
Rmd	2e69969	reneeisnowhere	2023-04-10	adding data
Rmd	0f1f1da	reneeisnowhere	2023-04-10	final run analysis

This starts the documentation of the RNA-seq cardiotoxicity analysis for my cardiotoxicity data.

library(edgeR)#
library(limma)#
library(RColorBrewer)
library(gridExtra)#
library(reshape2)#
library(data.table)
library(tidyverse)
library(scales)
library(biomaRt)#
library(cowplot)#
library(ggrepel)#
library(corrplot)
library(Hmisc)
library(ggpubr)

pca_plot <-
  function(df,
           col_var = NULL,
           shape_var = NULL,
           title = "") {
    ggplot(df) + geom_point(aes_string(
      x = "PC1",
      y = "PC2",
      color = col_var,
      shape = shape_var
    ),
    size = 5) +
      labs(title = title, x = "PC 1", y = "PC2") +
      scale_color_manual(values = c(
        "#8B006D",
        "#DF707E",
        "#F1B72B",
        "#3386DD",
        "#707031",
        "#41B333"
      ))
  }

pca_var_plot <- function(pca) {
  # x: class == prcomp
  pca.var <- pca$sdev ^ 2
  pca.prop <- pca.var / sum(pca.var)
  var.plot <-
    qplot(PC, prop, data = data.frame(PC = 1:length(pca.prop),
                                      prop = pca.prop)) +
    labs(title = 'Variance contributed by each PC',
         x = 'PC', y = 'Proportion of variance')
}

calc_pca <- function(x) {
  # Performs principal components analysis with prcomp
  # x: a sample-by-gene numeric matrix
  prcomp(x, scale. = TRUE, retx = TRUE)
}

get_regr_pval <- function(mod) {
  # Returns the p-value for the Fstatistic of a linear model
  # mod: class lm
  stopifnot(class(mod) == "lm")
  fstat <- summary(mod)$fstatistic
  pval <- 1 - pf(fstat[1], fstat[2], fstat[3])
  return(pval)
}

plot_versus_pc <- function(df, pc_num, fac) {
  # df: data.frame
  # pc_num: numeric, specific PC for plotting
  # fac: column name of df for plotting against PC
  pc_char <- paste0("PC", pc_num)
  # Calculate F-statistic p-value for linear model
  pval <- get_regr_pval(lm(df[, pc_char] ~ df[, fac]))
  if (is.numeric(df[, f])) {
    ggplot(df, aes_string(x = f, y = pc_char)) + geom_point() +
      geom_smooth(method = "lm") + labs(title = sprintf("p-val: %.2f", pval))
  } else {
    ggplot(df, aes_string(x = f, y = pc_char)) + geom_boxplot() +
      labs(title = sprintf("p-val: %.2f", pval))
  }
}
x_axis_labels = function(labels, every_nth = 1, ...) {
  axis(side = 1,
       at = seq_along(labels),
       labels = F)
  text(
    x = (seq_along(labels))[seq_len(every_nth) == 1],
    y = par("usr")[3] - 0.075 * (par("usr")[4] - par("usr")[3]),
    labels = labels[seq_len(every_nth) == 1],
    xpd = TRUE,
    ...
  )
}

Initial RNA-seq quality checks

drug_palc <- c("#8B006D","#DF707E","#F1B72B", "#3386DD","#707031","#41B333")
seq_info %>% 
  filter(type=="Total_reads") %>% 
  ggplot(., aes (x =samplenames, y=count, fill = drug, group_by=indv))+
  geom_col()+
 geom_hline(aes(yintercept=20000000))+
 scale_fill_manual(values=drug_palc)+
  ggtitle(expression("Total number of reads by sample"))+
  xlab("")+
  ylab(expression("RNA -sequencing reads"))+
  theme_bw()+
  theme(plot.title = element_text(size = rel(2), hjust = 0.5),
        axis.title = element_text(size = 15, color = "black"),
        axis.ticks = element_line(linewidth = 1.5),
        axis.line = element_line(linewidth = 1.5),
        axis.text.y = element_text(size =10, color = "black", angle = 0, hjust = 0.8, vjust = 0.5),
        axis.text.x = element_text(size =10, color = "black", angle = 90, hjust = 1, vjust = 0.2),
        #strip.text.x = element_text(size = 15, color = "black", face = "bold"),
        strip.text.y = element_text(color = "white"))

Version	Author	Date
aeeade4	reneeisnowhere	2023-09-25
1f4237c	reneeisnowhere	2023-04-20
8221ec3	reneeisnowhere	2023-04-16

seq_info %>% 
  filter(type=="Total_reads") %>% 
  ggplot(., aes (x =drug, y=count, fill = drug))+
  geom_boxplot()+
 scale_fill_manual(values=drug_palc)+
  ggtitle(expression("Total number of reads by treatment"))+
  xlab("")+
  ylab(expression("RNA -sequencing reads"))+
  theme_bw()+
  
  theme(plot.title = element_text(size = rel(2), hjust = 0.5),
        axis.title = element_text(size = 15, color = "black"),
        axis.ticks = element_line(linewidth = 1.5),
        axis.line = element_line(linewidth = 1.5),
        axis.text.y = element_text(size =10, color = "black", angle = 0, hjust = 0.8, vjust = 0.5),
        axis.text.x = element_text(size =10, color = "black", angle = 90, hjust = 1, vjust = 0.2),
        #strip.text.x = element_text(size = 15, color = "black", face = "bold"),
        strip.text.y = element_text(color = "white"))

Version	Author	Date
aeeade4	reneeisnowhere	2023-09-25
1f4237c	reneeisnowhere	2023-04-20
8221ec3	reneeisnowhere	2023-04-16

seq_info %>% 
  filter(type=="Total_reads") %>% 
  ggplot(., aes (x =as.factor(indv), y=count))+
  geom_boxplot(aes(fill=as.factor(indv)))+
 scale_fill_brewer(palette = "Dark2", name = "Individual")+
  ggtitle(expression("Total number of reads by individual"))+
  xlab("")+
  ylab(expression("RNA -sequencing reads"))+
  theme_bw()+

  theme(plot.title = element_text(size = rel(2), hjust = 0.5),
        axis.title = element_text(size = 15, color = "black"),
        axis.ticks = element_line(linewidth = 1.5),
        axis.line = element_line(linewidth = 1.5),
        axis.text.y = element_text(size =10, color = "black", angle = 0, hjust = 0.8, vjust = 0.5),
        axis.text.x = element_text(size =10, color = "black", angle = 0, hjust = 1),
        #strip.text.x = element_text(size = 15, color = "black", face = "bold"),
        strip.text.y = element_text(color = "white"))

Version	Author	Date
1f4237c	reneeisnowhere	2023-04-20
8221ec3	reneeisnowhere	2023-04-16

seq_info %>% 
  filter(type=="Mapped_reads") %>% 
  ggplot(., aes (x =samplenames, y=count, fill = drug, group_by=indv))+
  geom_col()+
  geom_hline(aes(yintercept=20000000))+
 scale_fill_manual(values=drug_palc)+
  ggtitle(expression("Total number of mapped reads by sample"))+
  xlab("")+
  ylab(expression("RNA -sequencing reads"))+
  theme_bw()+
  theme(plot.title = element_text(size = rel(2), hjust = 0.5),
        axis.title = element_text(size = 15, color = "black"),
        axis.ticks = element_line(linewidth = 1.5),
        axis.line = element_line(linewidth = 1.5),
        axis.text.y = element_text(size =10, color = "black", angle = 0, hjust = 0.8, vjust = 0.5),
        axis.text.x = element_text(size =10, color = "black", angle = 90, hjust = 1, vjust = 0.2),
        #strip.text.x = element_text(size = 15, color = "black", face = "bold"),
        strip.text.y = element_text(color = "white"))

Version	Author	Date
aeeade4	reneeisnowhere	2023-09-25
1f4237c	reneeisnowhere	2023-04-20
8221ec3	reneeisnowhere	2023-04-16

seq_info %>% 
  filter(type=="Mapped_reads") %>% 
  ggplot(., aes (x =as.factor(indv), y=count))+
  geom_boxplot(aes(fill=as.factor(indv)))+
 scale_fill_brewer(palette = "Dark2", name = "Individual")+
  ggtitle(expression("Total mapped reads by individual"))+
  xlab("")+
  ylab(expression("Number of reads"))+
  theme_bw()+

  theme(plot.title = element_text(size = rel(2), hjust = 0.5),
        axis.title = element_text(size = 15, color = "black"),
        axis.ticks = element_line(linewidth = 1.5),
        axis.line = element_line(linewidth = 1.5),
        axis.text.y = element_text(size =10, color = "black", angle = 0, hjust = 0.8, vjust = 0.5),
        axis.text.x = element_text(size =10, color = "black", angle = 0, hjust = 0.5),
        #strip.text.x = element_text(size = 15, color = "black", face = "bold"),
        strip.text.y = element_text(color = "white"))

Version	Author	Date
aeeade4	reneeisnowhere	2023-09-25
1f4237c	reneeisnowhere	2023-04-20
8221ec3	reneeisnowhere	2023-04-16

seq_info %>% 
  filter(type=="Mapped_reads") %>% 
  ggplot(., aes (x =drug, y=count))+
  geom_boxplot(aes(fill=drug))+
 scale_fill_manual(values=drug_palc)+
  ggtitle(expression("Total mapped reads by treatment"))+
  xlab("")+
  ylab(expression("Number of reads"))+
  theme_bw()+

  theme(plot.title = element_text(size = rel(2), hjust = 0.5),
        axis.title = element_text(size = 15, color = "black"),
        axis.ticks = element_line(linewidth = 1.5),
        axis.line = element_line(linewidth = 1.5),
        axis.text.y = element_text(size =10, color = "black", angle = 0, hjust = 0.8, vjust = 0.5),
        axis.text.x = element_text(size =10, color = "black", angle = 0, hjust = 0.5),
        #strip.text.x = element_text(size = 15, color = "black", face = "bold"),
        strip.text.y = element_text(color = "white"))

Version	Author	Date
aeeade4	reneeisnowhere	2023-09-25
1f4237c	reneeisnowhere	2023-04-20

Initial histograms from count matrix

cpm <- cpm(mymatrix)
lcpm <- cpm(mymatrix, log=TRUE)  ### for determining the basic cutoffs
dim(cpm)

[1] 28395    72

row_means <- rowMeans(lcpm)
x <- mymatrix[row_means > 0,]
dim(x)

[1] 14084    72

filcpm_counts <- cpm(x$counts, log = TRUE)


label <- (interaction(drug, indv, time))
colnames(filcpm_counts) <- label

hist(lcpm,  main = "Histogram of total counts (unfiltered)", 
     xlab =expression("Log"[2]*" counts-per-million"), col =4 )

Version	Author	Date
bdbf1c2	reneeisnowhere	2023-04-21
8221ec3	reneeisnowhere	2023-04-16
f0a75e1	reneeisnowhere	2023-04-10

hist(cpm(x$counts, log = TRUE), main = "Histogram of filtered counts using rowMeans > 0 method", 
     xlab =expression("Log"[2]*" counts-per-million"), col =2 )

Version	Author	Date
bdbf1c2	reneeisnowhere	2023-04-21
8221ec3	reneeisnowhere	2023-04-16
f0a75e1	reneeisnowhere	2023-04-10

boxplot(lcpm, main = "Boxplots of log cpm per sample",xaxt = "n", xlab= "")
x_axis_labels(labels = label, every_nth = 1, adj=0.7, srt =90, cex =0.4)

Version	Author	Date
aeeade4	reneeisnowhere	2023-09-25
8221ec3	reneeisnowhere	2023-04-16
f0a75e1	reneeisnowhere	2023-04-10

boxplot(filcpm_counts, main ="boxplots of log cpm per sample filtered",xaxt = "n", xlab="")
x_axis_labels(labels = label, every_nth = 1, adj=0.7, srt =90, cex =0.4)

Version	Author	Date
aeeade4	reneeisnowhere	2023-09-25
8221ec3	reneeisnowhere	2023-04-16
f0a75e1	reneeisnowhere	2023-04-10

PCA by treatment and as a whole

smlabel <- (interaction(drug, time))
x <- calcNormFactors(x, method = "TMM")
normalized_lib_size <- x$samples$lib.size * x$samples$norm.factors
dat_cpm <- cpm(x$counts, lib.size = normalized_lib_size, log=TRUE)

colnames(dat_cpm) <- label
# saveRDS(dat_cpm,"data/dat_cpm.RDS")
anno$time <- ordered(time, levels =c("3h", "24h"))
anno$group <- group
cpm_per_sample <- cbind(anno, t(dat_cpm)) 
treattype <-c("VEH","DNR","DOX","EPI","MTX","TRZ")

for (b in treattype) { 
  dat <- dat_cpm[,anno$drug %in% c("none", b)] 
  cat("\n\n### ", b, "\n\n") 
  # PCA s
  pca <- calc_pca(t(dat))$x
  pca <- data.frame(anno[anno$drug %in% c("none", b), c("drug", "time","indv")], pca)
  pca <- droplevels(pca) 
  for (cat_var in c("drug", "time", "indv")) {
    assign(paste0(cat_var, "_pca"), arrangeGrob(pca_plot(pca, cat_var)))
  } 
  drug_time_pca <- pca_plot(pca, "drug", "time")
 
  grid.arrange(drug_time_pca, indv_pca,  nrow =2, top =paste(b)) 
  }



###  VEH

Version	Author	Date
aeeade4	reneeisnowhere	2023-09-25
8221ec3	reneeisnowhere	2023-04-16
f0a75e1	reneeisnowhere	2023-04-10



###  DNR

Version	Author	Date
aeeade4	reneeisnowhere	2023-09-25
8221ec3	reneeisnowhere	2023-04-16
f0a75e1	reneeisnowhere	2023-04-10



###  DOX

Version	Author	Date
aeeade4	reneeisnowhere	2023-09-25
8221ec3	reneeisnowhere	2023-04-16
f0a75e1	reneeisnowhere	2023-04-10



###  EPI

Version	Author	Date
aeeade4	reneeisnowhere	2023-09-25
8221ec3	reneeisnowhere	2023-04-16
f0a75e1	reneeisnowhere	2023-04-10



###  MTX

Version	Author	Date
aeeade4	reneeisnowhere	2023-09-25
8221ec3	reneeisnowhere	2023-04-16
f0a75e1	reneeisnowhere	2023-04-10



###  TRZ

Version	Author	Date
aeeade4	reneeisnowhere	2023-09-25
8221ec3	reneeisnowhere	2023-04-16
f0a75e1	reneeisnowhere	2023-04-10

# write.csv(x$counts, "data/cpmnorm_counts.csv")

pca_all <- calc_pca(t(dat_cpm))
pca_all_anno <- data.frame(anno, pca_all$x)
varpals <- (pca_var_plot(pca_all))
print(varpals)

Version	Author	Date
8221ec3	reneeisnowhere	2023-04-16
f0a75e1	reneeisnowhere	2023-04-10

head(pca_all_anno)[,1:12]

         samplenames indv drug time RIN group       PC1      PC2       PC3
DNR.1.3h MCW_RM_R_11    1  DNR   3h 9.3     1 -18.33154 61.71013 44.039139
DOX.1.3h MCW_RM_R_12    1  DOX   3h 9.8     2 -12.36280 73.97678 24.576395
EPI.1.3h MCW_RM_R_13    1  EPI   3h 9.8     3 -11.16205 66.48794 33.025628
MTX.1.3h MCW_RM_R_14    1  MTX   3h  10     4 -10.19948 73.48343 19.016766
TRZ.1.3h MCW_RM_R_15    1  TRZ   3h 9.6     5 -12.17619 80.01454  2.640624
VEH.1.3h MCW_RM_R_16    1  VEH   3h 9.9     6 -14.98226 76.62199 12.706808
                PC4        PC5       PC6
DNR.1.3h  -4.547031  24.642107 -35.03245
DOX.1.3h  -8.626528 -19.908580 -18.97447
EPI.1.3h  -9.349549  18.083569 -43.06551
MTX.1.3h -14.639651  -9.065324 -24.29908
TRZ.1.3h -17.019296 -34.253925 -11.77881
VEH.1.3h  -4.173412 -39.846595 -17.16213

drug_palc <- c("#8B006D","#DF707E","#F1B72B", "#3386DD","#707031","#41B333")

pca_all_anno %>%
  ggplot(.,aes(x = PC1, y = PC2, col=drug, shape=time))+
  geom_point(size= 5)+
  scale_color_manual(values=drug_palc)+
   ggrepel::geom_text_repel(label=indv)+
  #scale_shape_manual(name = "Time",values= c("3h"=0,"24h"=1))+
  ggtitle(expression("PCA of log"[2]*"(cpm)"))+
  theme_bw()+
  guides(col="none", size =4)+
  labs(y = "PC 2 (15.76%)", x ="PC 1 (29.06%)")+
  theme(plot.title=element_text(size= 14,hjust = 0.5),
        axis.title = element_text(size = 12, color = "black"))

Version	Author	Date
eb35ace	reneeisnowhere	2023-09-26
38769b9	reneeisnowhere	2023-09-25
b936755	reneeisnowhere	2023-09-25
aeeade4	reneeisnowhere	2023-09-25
8daa38a	reneeisnowhere	2023-05-26
bdbf1c2	reneeisnowhere	2023-04-21
1f4237c	reneeisnowhere	2023-04-20
5ef62c9	reneeisnowhere	2023-04-17
7f62d67	reneeisnowhere	2023-04-17
21fd945	reneeisnowhere	2023-04-17
8221ec3	reneeisnowhere	2023-04-16
8d08bd2	reneeisnowhere	2023-04-11
4cd8ac4	reneeisnowhere	2023-04-11
08936e7	reneeisnowhere	2023-04-10
85526c5	reneeisnowhere	2023-04-10
b266b76	reneeisnowhere	2023-04-10
f0a75e1	reneeisnowhere	2023-04-10

pca_all_anno %>%
  ggplot(.,aes(x = PC2, y = PC3, col=drug, shape=time))+
  geom_point(size= 5)+
  scale_color_manual(values=drug_palc)+
  ggrepel::geom_text_repel(label=indv)+
  ggtitle(expression("PCA of log"[2]*"(cpm)"))+
  theme_bw()+
  guides( size =4)+
  labs(x = "PC 2 (15.76%)", y ="PC 3 (8.3%)")+
  theme(plot.title=element_text(size= 14,hjust = 0.5),
        axis.title = element_text(size = 12, color = "black"))

Version	Author	Date
eb35ace	reneeisnowhere	2023-09-26
38769b9	reneeisnowhere	2023-09-25
b936755	reneeisnowhere	2023-09-25
aeeade4	reneeisnowhere	2023-09-25
8daa38a	reneeisnowhere	2023-05-26
bdbf1c2	reneeisnowhere	2023-04-21
1f4237c	reneeisnowhere	2023-04-20
5ef62c9	reneeisnowhere	2023-04-17
7f62d67	reneeisnowhere	2023-04-17
21fd945	reneeisnowhere	2023-04-17
8221ec3	reneeisnowhere	2023-04-16
8d08bd2	reneeisnowhere	2023-04-11
4cd8ac4	reneeisnowhere	2023-04-11
08936e7	reneeisnowhere	2023-04-10
85526c5	reneeisnowhere	2023-04-10
b266b76	reneeisnowhere	2023-04-10
f0a75e1	reneeisnowhere	2023-04-10

prdat_cpm <- calc_pca(t(dat_cpm)) 
storepca <- summary(prdat_cpm)
head(storepca$importance[,1:7])

                            PC1      PC2      PC3      PC4      PC5      PC6
Standard deviation     63.98853 47.11608 34.21502 32.58775 28.22245 23.90977
Proportion of Variance  0.29072  0.15762  0.08312  0.07540  0.05655  0.04059
Cumulative Proportion   0.29072  0.44834  0.53146  0.60687  0.66342  0.70401
                            PC7
Standard deviation     21.56133
Proportion of Variance  0.03301
Cumulative Proportion   0.73702

Typical genes expressed in iPSC-CMS

genecardiccheck <- c("MYH7", "TNNT2","MYH6","ACTN2","BMP3","TNNI3","RYR2","CACNA1C","KCNQ1", "HCN1", "ADRB1", "ADRB2")
#ensembl <- useMart("ensembl", dataset="hsapiens_gene_ensembl")
#saveRDS(ensembl, "data/ensembl_backup.RDS")
#ensemble <- readRDS("data/ensembl_backup.RDS")
#my_chr <- c(1:22, 'M', 'X', 'Y')  ## creates a filter for each database

#my_attributes <- c('entrezgene_id', 'ensembl_gene_id', 'hgnc_symbol')

#heartgenes <- getBM(attributes=my_attributes,filters ='hgnc_symbol',
                # values = genecardiccheck, mart = ensembl)
#write.csv(heartgenes, "data/heartgenes.csv")
heartgenes <-read.csv("data/heartgenes.csv")

fungraph <- as.data.frame(filcpm_counts[rownames(filcpm_counts) %in% heartgenes$entrezgene_id,])


fungraph %>% 
  rownames_to_column("entrezgene_id") %>% 
  pivot_longer(-entrezgene_id, names_to = "samples",values_to = "counts") %>% 
  mutate(gene = case_match(entrezgene_id,"88"~"ACTN2","153"~"ADRB1",
  "154"~"ADRB2","651"~"BMP3","775"~"CACNA1C", "100874369"~"CACNA1C","348980"~"HCN1",
                           "3784"~"KCNQ1", "4624"~"MYH6","4625"~"MYH7","6262"~"RYR2",
                           "7137"~"TNNI3","7139"~"TNNT2",.default = entrezgene_id)) %>% 
  ggplot(., aes(x=reorder(gene,counts,decreasing=TRUE), y=counts))+
  geom_boxplot()+
  ggtitle(expression("Expression of typical cardiac tissue genes"))+
  xlab("")+
  ylab(expression("log"[2]~"cpm"))+
  theme_bw()+
  theme(plot.title = element_text(size = rel(2), hjust = 0.5),
        axis.title = element_text(size = 15, color = "black"),
        axis.ticks = element_line(linewidth = 1.5),
        axis.line = element_line(linewidth = 1.5),
        axis.text = element_text(size =10, color = "black", angle = 0),
        strip.text.y = element_text(color = "white"))

Version	Author	Date
aeeade4	reneeisnowhere	2023-09-25
8221ec3	reneeisnowhere	2023-04-16
b266b76	reneeisnowhere	2023-04-10

correlation heatmap of counts matrix

colnames(filcpm_counts) <- label
# saveRDS(filcpm_counts,"data/filcpm_counts.RDS")
mcort <- cor(filcpm_counts)
pheatmap::pheatmap(mcort , cluster_rows = TRUE, cluster_cols = TRUE)

Version	Author	Date
aeeade4	reneeisnowhere	2023-09-25
8221ec3	reneeisnowhere	2023-04-16
85526c5	reneeisnowhere	2023-04-10

now to get the counts set for DEG!!

group1 <- interaction(drug,time)
mm <- model.matrix(~0 + group1)
colnames(mm) <- c("A3", "X3", "E3","M3","T3", "V3","A24", "X24", "E24","M24","T24", "V24")
y <- voom(x, mm,plot =TRUE)

corfit <- duplicateCorrelation(y, mm, block = indv)

v <- voom(x, mm, block = indv, correlation = corfit$consensus)

fit <- lmFit(v, mm, block = indv, correlation = corfit$consensus)

cm <- makeContrasts(
  V.DA = V3 - A3,
  V.DX = V3 - X3,
  V.EP = V3 - E3,
  V.MT = V3 - M3,
  V.TR = V3 - T3,
  V.DA24 = V24-A24,
  V.DX24= V24-X24,
  V.EP24= V24-E24,
  V.MT24= V24-M24,
  V.TR24= V24-T24,
  levels = mm)

vfit <- lmFit(y, mm)

vfit<- contrasts.fit(vfit, contrasts=cm)

efit2 <- eBayes(vfit)
# saveRDS(efit2,"data/efit2_final.RDS")

Pairwise DEG analysis

summary

sum <- summary(decideTests(efit2))

sum

        V.DA  V.DX  V.EP  V.MT  V.TR V.DA24 V.DX24 V.EP24 V.MT24 V.TR24
Down     109     3    30    24     0   3540   3336   3105    428      0
NotSig 13552 14065 13874 14009 14084   7067   7439   7756  12969  14084
Up       423    16   180    51     0   3477   3309   3223    687      0

Volcano plots from pairwise gene analysis

siglist_final <- readRDS("data/siglist_final.RDS")
list2env(siglist_final,envir = .GlobalEnv)

<environment: R_GlobalEnv>

volcanosig <- function(df, psig.lvl,topg) {
    df <- df %>% 
    mutate(threshold = ifelse(adj.P.Val > psig.lvl, "A", ifelse(adj.P.Val <= psig.lvl & logFC<=0,"B","C")))
    
  ggplot(df, aes(x=logFC, y=-log10(adj.P.Val))) + 
    geom_point(aes(color=threshold))+
    xlab(expression("Log"[2]*" FC"))+
    ylim(0,30)+
    ylab(expression("-log"[10]*"P Value"))+
    scale_color_manual(values = c("black", "red","blue"))+
    theme_cowplot()+
    theme(legend.position = "none",
          plot.title = element_text(size = rel(0.8), hjust = 0.5),
          axis.title = element_text(size = rel(0.8))) 
}

v1 <- volcanosig(V.DA.top, 0.01,0)+ ggtitle("Daunorubicin\n 3 hour")
v2 <- volcanosig(V.DA24.top, 0.01,0)+ ggtitle("Daunorubicin\n 24 hour")
v3 <- volcanosig(V.DX.top, 0.01,0)+ ggtitle("Doxorubicin\n 3 hour")+ylab("")
v4 <- volcanosig(V.DX24.top, 0.01,0)+ ggtitle("Doxorubicin\n 24 hour")+ylab("")
v5 <- volcanosig(V.EP.top, 0.01,0)+ ggtitle("Epirubicin\n 3 hour")+ylab("")
v6 <- volcanosig(V.EP24.top, 0.01,0)+ ggtitle("Epirubicin\n 24 hour")+ylab("")
v7 <- volcanosig(V.MT.top, 0.01,0)+ ggtitle("Mitoxatrone\n 3 hour")+ylab("")
v8 <- volcanosig(V.MT24.top, 0.01,0)+ ggtitle("Mitoxatrone\n 24 hour")+ylab("")
v9 <- volcanosig(V.TR.top, 0.01,0)+ ggtitle("Trastuzumab\n 3 hour")+ylab("")
v10 <- volcanosig(V.TR24.top, 0.01,0)+ ggtitle("Trastuzumab\n 24 hour")+ylab("")

Volcanoplots <- plot_grid(v1,v3,v5,v7,v9,v2,v4,v6,v8,v10, nrow = 2, ncol = 5)
Volcanoplots

Version	Author	Date
aeeade4	reneeisnowhere	2023-09-25
8daa38a	reneeisnowhere	2023-05-26
21fd945	reneeisnowhere	2023-04-17

# saveRDS(Volcanoplots,"output/Volcanoplot_10.RDS")

sessionInfo()

R version 4.3.1 (2023-06-16 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19045)

Matrix products: default


locale:
[1] LC_COLLATE=English_United States.utf8 
[2] LC_CTYPE=English_United States.utf8   
[3] LC_MONETARY=English_United States.utf8
[4] LC_NUMERIC=C                          
[5] LC_TIME=English_United States.utf8    

time zone: America/Chicago
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] ggpubr_0.6.0       Hmisc_5.1-1        corrplot_0.92      ggrepel_0.9.3     
 [5] cowplot_1.1.1      biomaRt_2.56.1     scales_1.2.1       lubridate_1.9.2   
 [9] forcats_1.0.0      stringr_1.5.0      dplyr_1.1.3        purrr_1.0.2       
[13] readr_2.1.4        tidyr_1.3.0        tibble_3.2.1       ggplot2_3.4.3     
[17] tidyverse_2.0.0    data.table_1.14.8  reshape2_1.4.4     gridExtra_2.3     
[21] RColorBrewer_1.1-3 edgeR_3.42.4       limma_3.56.2       workflowr_1.7.1   

loaded via a namespace (and not attached):
  [1] DBI_1.1.3               bitops_1.0-7            rlang_1.1.1            
  [4] magrittr_2.0.3          git2r_0.32.0            compiler_4.3.1         
  [7] RSQLite_2.3.1           getPass_0.2-2           png_0.1-8              
 [10] callr_3.7.3             vctrs_0.6.3             pkgconfig_2.0.3        
 [13] crayon_1.5.2            fastmap_1.1.1           backports_1.4.1        
 [16] dbplyr_2.3.3            XVector_0.40.0          labeling_0.4.3         
 [19] utf8_1.2.3              promises_1.2.1          rmarkdown_2.24         
 [22] tzdb_0.4.0              ps_1.7.5                bit_4.0.5              
 [25] xfun_0.40               zlibbioc_1.46.0         cachem_1.0.8           
 [28] GenomeInfoDb_1.36.1     jsonlite_1.8.7          progress_1.2.2         
 [31] blob_1.2.4              later_1.3.1             broom_1.0.5            
 [34] cluster_2.1.4           prettyunits_1.1.1       R6_2.5.1               
 [37] bslib_0.5.1             stringi_1.7.12          car_3.1-2              
 [40] rpart_4.1.19            jquerylib_0.1.4         Rcpp_1.0.11            
 [43] knitr_1.44              base64enc_0.1-3         IRanges_2.34.1         
 [46] nnet_7.3-19             httpuv_1.6.11           timechange_0.2.0       
 [49] tidyselect_1.2.0        abind_1.4-5             rstudioapi_0.15.0      
 [52] yaml_2.3.7              curl_5.0.2              processx_3.8.2         
 [55] lattice_0.21-8          plyr_1.8.8              Biobase_2.60.0         
 [58] withr_2.5.0             KEGGREST_1.40.0         evaluate_0.21          
 [61] foreign_0.8-85          BiocFileCache_2.8.0     xml2_1.3.5             
 [64] Biostrings_2.68.1       pillar_1.9.0            filelock_1.0.2         
 [67] carData_3.0-5           whisker_0.4.1           checkmate_2.2.0        
 [70] stats4_4.3.1            generics_0.1.3          rprojroot_2.0.3        
 [73] RCurl_1.98-1.12         S4Vectors_0.38.1        hms_1.1.3              
 [76] munsell_0.5.0           glue_1.6.2              pheatmap_1.0.12        
 [79] tools_4.3.1             ggsignif_0.6.4          locfit_1.5-9.8         
 [82] fs_1.6.3                XML_3.99-0.14           grid_4.3.1             
 [85] AnnotationDbi_1.62.2    colorspace_2.1-0        GenomeInfoDbData_1.2.10
 [88] htmlTable_2.4.1         Formula_1.2-5           cli_3.6.1              
 [91] rappdirs_0.3.3          fansi_1.0.4             gtable_0.3.4           
 [94] rstatix_0.7.2           sass_0.4.7              digest_0.6.33          
 [97] BiocGenerics_0.46.0     farver_2.1.1            htmlwidgets_1.6.2      
[100] memoise_2.0.1           htmltools_0.5.6         lifecycle_1.0.3        
[103] httr_1.4.7              bit64_4.0.5

RNA-seq analysis, Pairwise

ERM

20230414