Last updated: 2021-02-19

Checks: 6 passed, 1 warning

Knit directory: Human_Development_ATACseq_bulk/

This reproducible R Markdown analysis was created with workflowr (version 1.6.2). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.


Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.

Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproducibility it’s best to always run the code in an empty environment.

The command set.seed(20210216) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.
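As a small illustration of why the seed matters, the sketch below reuses the seed reported above and draws a random subsample twice. The vector `peaks` and the sample size are made up for this example and are not part of the analysis.

```r
# Toy stand-in for a set of peak indices (hypothetical, not from the analysis)
peaks <- 1:1000

# Resetting the same seed before each draw gives the same subsample
set.seed(20210216)
first_draw <- sample(peaks, 5)

set.seed(20210216)
second_draw <- sample(peaks, 5)

identical(first_draw, second_draw)
```

Without the second call to set.seed(), the two draws would generally differ.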

Great job! Recording the operating system, R version, and package versions is critical for reproducibility.

Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.

Using absolute paths to the files within your workflowr project makes it difficult for you and others to run your code on a different machine. Change the absolute path(s) below to the suggested relative path(s) to make your code more reproducible.

Absolute path: /group/card2/Evangelyn_Sim/Transcriptome_chromatin_human/Sequencing_ATAC_RNA/GITHUB/Human_Development_ATACseq_bulk/output/humanATAC_peaks_cov2_rmBL.bed.saf.pe.q30.mx.all.fix_filt.csv
Relative path: output/humanATAC_peaks_cov2_rmBL.bed.saf.pe.q30.mx.all.fix_filt.csv

Absolute path: /group/card2/Evangelyn_Sim/Transcriptome_chromatin_human/Sequencing_ATAC_RNA/GITHUB/Human_Development_ATACseq_bulk/output/ATACseq_samplesheet.txt
Relative path: output/ATACseq_samplesheet.txt

Absolute path: /group/card2/Evangelyn_Sim/Transcriptome_chromatin_human/Sequencing_ATAC_RNA/GITHUB/Human_Development_ATACseq_bulk/output/logCPM_humanATAC_peaks_cov2_rmBL.bed.saf.pe.q30.mx.all.fix_filt.csv
Relative path: output/logCPM_humanATAC_peaks_cov2_rmBL.bed.saf.pe.q30.mx.all.fix_filt.csv
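A minimal sketch of how the suggested relative paths could be built, assuming the code is knit from the project root (the knit directory reported above). The variable names `counts_file`, `samples_file` and `logcpm_file` are illustrative and do not appear in the original analysis.

```r
# Relative paths as suggested above; they resolve against the knit directory
# (the project root), so no machine-specific prefix is needed.
counts_file  <- file.path("output", "humanATAC_peaks_cov2_rmBL.bed.saf.pe.q30.mx.all.fix_filt.csv")
samples_file <- file.path("output", "ATACseq_samplesheet.txt")

# The log-CPM output name is the input name with a "logCPM_" prefix
logcpm_file  <- file.path("output", paste0("logCPM_", basename(counts_file)))

# The reads would then look like this (not run here, the files are not bundled):
# rm1  <- read.csv(counts_file, row.names = 1)
# info <- read.delim(samples_file, header = TRUE, sep = "\t")
```

Because file.path() joins components with a forward slash on every platform, the same code should work unchanged on another machine as long as the output/ folder sits under the project root.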

Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility.

The results in this page were generated with repository version 294d830. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.

Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:


Ignored files:
    Ignored:    .Rhistory
    Ignored:    .Rproj.user/

Untracked files:
    Untracked:  *.noXYMT.bed.tidy.bed
    Untracked:  *xls.bg.bed
    Untracked:  *xls.dn.bed
    Untracked:  *xls.up.bed
    Untracked:  Development_noXY.jn.rnk
    Untracked:  FetalvsYoung_noXY.jn.rnk
    Untracked:  Homo_sapiens.GRCh38.96.fulllength.saf
    Untracked:  YoungvsAdult_noXY.jn.rnk
    Untracked:  analysis/*.dn.bed.homeranno.txt
    Untracked:  analysis/*.up.bed.homeranno.txt
    Untracked:  analysis/00.WorkFlowR_setting.R
    Untracked:  code/EnDrich.R
    Untracked:  code/EnDrichProc_Development_noXY.R
    Untracked:  code/EnDrichProc_FetalvsYoung_noXY.R
    Untracked:  code/EnDrichProc_YoungvsAdult_noXY.R
    Untracked:  header.sam
    Untracked:  humanATAC*bed.saf
    Untracked:  humanATAC*bed.saf.pe.q30.mx
    Untracked:  humanATAC*bed.saf.pe.q30.mx.all
    Untracked:  humanATAC*bed.saf.pe.q30.mx.all.fix
    Untracked:  humanATAC*bed.saf.pe.q30.mx.chr
    Untracked:  humanATAC*bed.saf.pe.q30.mx.fix
    Untracked:  humanATAC*bed.saf.pe.q30.mx.hum.fix
    Untracked:  output/20190801_ATAC_samplesheet.txt
    Untracked:  output/ATACseq_samplesheet.txt
    Untracked:  output/atac_hum_tss_pe_mapk30_q30.mx.all_unfiltered.csv
    Untracked:  output/atac_hum_tss_pe_mapk30_q30.mx.chr
    Untracked:  output/atac_hum_tss_pe_mapk30_q30.mx.hum.fix_filt.csv
    Untracked:  output/humanATAC_peaks_cov2_rmBL.bed.saf.pe.q30.mx.MvsF.fix_filt.csv
    Untracked:  output/humanATAC_peaks_cov2_rmBL.bed.saf.pe.q30.mx.all.fix_filt.csv
    Untracked:  output/humanATAC_peaks_cov2_rmBL.bed.saf.pe.q30.mx.all_unfiltered.csv
    Untracked:  output/humanATAC_peaks_cov2_rmBL.bed.saf.pe.q30.mx.hum.fix_filt.csv
    Untracked:  output/logCPM_humanATAC_peaks_cov2_rmBL.bed.saf.pe.q30.mx.all.fix_filt.csv

Unstaged changes:
    Modified:   analysis/about.Rmd
    Modified:   analysis/index.Rmd
    Modified:   analysis/license.Rmd

Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.


These are the previous versions of the repository in which changes were made to the R Markdown (analysis/04.QC_and_CPM_Peaks.Rmd) and HTML (docs/04.QC_and_CPM_Peaks.html) files. If you’ve configured a remote Git repository (see ?wflow_git_remote), click on the hyperlinks in the table below to view the files as they were in that past version.

File Version Author Date Message
Rmd 294d830 evangelynsim 2021-02-19 wflow_publish(c(“analysis/01.Generate_reference_genome.Rmd”,

Introduction

In the GEO submission, 4 processed files (peaks) were uploaded.

  1. humanATAC_peaks_cov2_rmBL.bed.saf.pe.q30.mx.all_unfiltered.csv
  2. humanATAC_peaks_cov2_rmBL.bed.saf.pe.q30.mx.all.fix_filt.csv
  3. humanATAC_peaks_cov2_rmBL.bed.saf.pe.q30.mx.hum.fix_filt.csv
  4. humanATAC_peaks_cov2_rmBL.bed.saf.pe.q30.mx.MvsF.fix_filt.csv

They are stored in the output/ folder. This page uses the second file (humanATAC_peaks_cov2_rmBL.bed.saf.pe.q30.mx.all.fix_filt.csv) to compute counts per million and to draw the QC plots below.

Used libraries and functions

library(edgeR)
Loading required package: limma
library(limma)
library(Glimma)
library(gplots)

Attaching package: 'gplots'
The following object is masked from 'package:stats':

    lowess

Counts per million reads

rm1 <- read.csv("/group/card2/Evangelyn_Sim/Transcriptome_chromatin_human/Sequencing_ATAC_RNA/GITHUB/Human_Development_ATACseq_bulk/output/humanATAC_peaks_cov2_rmBL.bed.saf.pe.q30.mx.all.fix_filt.csv", row.names = 1)

info = read.delim("/group/card2/Evangelyn_Sim/Transcriptome_chromatin_human/Sequencing_ATAC_RNA/GITHUB/Human_Development_ATACseq_bulk/output/ATACseq_samplesheet.txt", header = TRUE, sep = "\t")

# Reorder the count columns to follow the sample order in the samplesheet
m = match(info$ID, names(rm1))
rm1 = rm1[, m]

mycpm = cpm(rm1)

summary(mycpm)
     Fetal1             Fetal2             Fetal3             Young1        
 Min.   :   0.000   Min.   :   0.000   Min.   :   0.000   Min.   :   0.000  
 1st Qu.:   3.148   1st Qu.:   2.700   1st Qu.:   3.038   1st Qu.:   2.512  
 Median :   5.438   Median :   4.801   Median :   5.298   Median :   4.742  
 Mean   :  10.230   Mean   :  10.230   Mean   :  10.230   Mean   :  10.230  
 3rd Qu.:  11.735   3rd Qu.:  11.202   3rd Qu.:  11.585   3rd Qu.:  11.556  
 Max.   :2027.691   Max.   :2324.248   Max.   :1723.871   Max.   :1113.559  
     Young2             Young3             Young4             Adult1       
 Min.   :   0.000   Min.   :   0.000   Min.   :   0.000   Min.   :  0.000  
 1st Qu.:   3.235   1st Qu.:   2.798   1st Qu.:   2.379   1st Qu.:  2.075  
 Median :   5.475   Median :   4.959   Median :   4.559   Median :  4.018  
 Mean   :  10.230   Mean   :  10.230   Mean   :  10.230   Mean   : 10.230  
 3rd Qu.:  11.780   3rd Qu.:  11.508   3rd Qu.:  11.365   3rd Qu.: 10.508  
 Max.   :2363.268   Max.   :2413.537   Max.   :1304.316   Max.   :995.980  
     Adult2             Adult3             Adult4            Adult5        
 Min.   :   0.000   Min.   :   0.000   Min.   :  0.000   Min.   :   0.000  
 1st Qu.:   2.575   1st Qu.:   3.766   1st Qu.:  2.536   1st Qu.:   2.686  
 Median :   4.875   Median :   6.162   Median :  4.590   Median :   4.667  
 Mean   :  10.230   Mean   :  10.230   Mean   : 10.230   Mean   :  10.230  
 3rd Qu.:  11.619   3rd Qu.:  12.051   3rd Qu.: 11.017   3rd Qu.:  10.963  
 Max.   :1665.484   Max.   :4331.669   Max.   :968.910   Max.   :1521.759  
     Adult6             Adult7            Adult8             Adult9        
 Min.   :   0.000   Min.   :  0.000   Min.   :   0.000   Min.   :   0.000  
 1st Qu.:   2.126   1st Qu.:  2.006   1st Qu.:   2.509   1st Qu.:   2.030  
 Median :   4.285   Median :  4.043   Median :   4.460   Median :   4.228  
 Mean   :  10.230   Mean   : 10.230   Mean   :  10.230   Mean   :  10.230  
 3rd Qu.:  10.953   3rd Qu.: 10.707   3rd Qu.:  10.872   3rd Qu.:  11.154  
 Max.   :1282.877   Max.   :834.229   Max.   :1254.172   Max.   :1053.526  
    Adult10            Adult11            Adult12            Adult13        
 Min.   :   0.000   Min.   :   0.000   Min.   :   0.000   Min.   :   0.000  
 1st Qu.:   2.337   1st Qu.:   2.640   1st Qu.:   2.332   1st Qu.:   2.963  
 Median :   4.347   Median :   4.675   Median :   4.296   Median :   5.024  
 Mean   :  10.230   Mean   :  10.230   Mean   :  10.230   Mean   :  10.230  
 3rd Qu.:  10.704   3rd Qu.:  11.037   3rd Qu.:  10.616   3rd Qu.:  11.142  
 Max.   :1948.646   Max.   :1283.103   Max.   :2013.372   Max.   :2154.891  
    hiPSCCM1           hiPSCCM2           hiPSCCM3           hiPSCCM4       
 Min.   :   0.000   Min.   :   0.000   Min.   :   0.000   Min.   :   0.000  
 1st Qu.:   4.511   1st Qu.:   4.616   1st Qu.:   3.846   1st Qu.:   4.666  
 Median :   6.626   Median :   6.653   Median :   6.025   Median :   6.820  
 Mean   :  10.230   Mean   :  10.230   Mean   :  10.230   Mean   :  10.230  
 3rd Qu.:  11.843   3rd Qu.:  11.676   3rd Qu.:  11.409   3rd Qu.:  11.666  
 Max.   :7495.477   Max.   :9096.587   Max.   :5379.331   Max.   :9059.116  
    hiPSCCM5           hiPSCCM6           hiPSCCM7       
 Min.   :   0.000   Min.   :   0.000   Min.   :   0.000  
 1st Qu.:   4.689   1st Qu.:   4.010   1st Qu.:   4.427  
 Median :   7.034   Median :   6.626   Median :   6.788  
 Mean   :  10.230   Mean   :  10.230   Mean   :  10.230  
 3rd Qu.:  11.904   3rd Qu.:  11.683   3rd Qu.:  11.804  
 Max.   :6804.441   Max.   :5309.122   Max.   :5240.105  
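Every sample in the summary above has the same mean (10.230). This is expected: when cpm() is given a plain count matrix it uses the column sums as library sizes, so each column of CPM values sums to one million and its mean equals one million divided by the number of peaks. The sketch below reproduces that arithmetic in base R on a made-up 4 x 3 count matrix; it illustrates the formula and is not a replacement for edgeR's cpm().

```r
# Toy count matrix: 4 peaks x 3 samples (made-up numbers)
counts <- matrix(c(10, 0, 30, 60,
                   5, 5, 20, 70,
                   100, 200, 300, 400),
                 nrow = 4,
                 dimnames = list(paste0("peak", 1:4), c("S1", "S2", "S3")))

lib_size   <- colSums(counts)                 # total counts per sample
manual_cpm <- t(t(counts) / lib_size) * 1e6   # counts per million, per sample

colSums(manual_cpm)    # every column sums to 1e6
colMeans(manual_cpm)   # so every column mean is 1e6 / nrow(counts)
```

The medians and quartiles, by contrast, do differ between samples, which is why they are the more informative rows of the summary.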
x <- DGEList(rm1)

names(x)
[1] "counts"  "samples"
logcountsx = cpm(x, log = TRUE)
write.csv(logcountsx, file = "/group/card2/Evangelyn_Sim/Transcriptome_chromatin_human/Sequencing_ATAC_RNA/GITHUB/Human_Development_ATACseq_bulk/output/logCPM_humanATAC_peaks_cov2_rmBL.bed.saf.pe.q30.mx.all.fix_filt.csv")

barplot(x$samples$lib.size, names=colnames(x), las=2, col = c("turquoise1","maroon1","bisque1","purple")[info$Group], main = "Library size")

boxplot(logcountsx, xlab="", ylab="Log2 counts per million", las=2, col = c("turquoise1","maroon1","bisque1","purple")[info$Group])
abline(h=median(logcountsx), col="navy")
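The colour lookup c(...)[info$Group] in the two plots relies on two things: the count columns must be in samplesheet order (handled earlier with match()), and info$Group must be a factor so that its integer codes index the colour vector. Under R 3.6.1, as recorded in the session information, read.delim() converts strings to factors by default; from R 4.0.0 onwards it keeps them as character, and indexing an unnamed vector by character returns NA. The sketch below shows one way to make both steps explicit. The toy samplesheet, the group labels and the colour-to-group assignment are assumptions for illustration, since the real samplesheet is not shown, and the assignment may differ from the one in the original plots.

```r
# Toy samplesheet and count table (hypothetical values)
info <- data.frame(ID    = c("Fetal1", "Young1", "Adult1", "hiPSCCM1"),
                   Group = c("Fetal", "Young", "Adult", "hiPSCCM"),
                   stringsAsFactors = FALSE)
counts <- data.frame(Adult1 = 1:3, hiPSCCM1 = 4:6, Fetal1 = 7:9, Young1 = 10:12)

# Reorder columns to samplesheet order and fail early if an ID is missing
m <- match(info$ID, names(counts))
stopifnot(!anyNA(m))
counts <- counts[, m]

# Named palette: lookup by group label works for factors and character vectors
palette4 <- c(Fetal = "turquoise1", Young = "maroon1",
              Adult = "bisque1", hiPSCCM = "purple")
sample_cols <- unname(palette4[as.character(info$Group)])
```

The resulting sample_cols vector could then be passed as the col argument of barplot() or boxplot() in place of the positional lookup.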


sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS:   /hpc/software/installed/R/3.6.1/lib64/R/lib/libRblas.so
LAPACK: /hpc/software/installed/R/3.6.1/lib64/R/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] gplots_3.1.0    Glimma_1.12.0   edgeR_3.26.8    limma_3.40.6   
[5] workflowr_1.6.2

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.5         pillar_1.4.6       compiler_3.6.1     later_1.1.0.1     
 [5] git2r_0.27.1       highr_0.8          bitops_1.0-6       tools_3.6.1       
 [9] digest_0.6.27      jsonlite_1.7.0     evaluate_0.14      lifecycle_0.2.0   
[13] tibble_3.0.3       lattice_0.20-41    pkgconfig_2.0.3    rlang_0.4.7       
[17] rstudioapi_0.11    yaml_2.2.1         xfun_0.18          stringr_1.4.0     
[21] knitr_1.30         caTools_1.18.0     gtools_3.8.2       fs_1.5.0          
[25] vctrs_0.3.2        locfit_1.5-9.4     rprojroot_1.3-2    grid_3.6.1        
[29] glue_1.4.2         R6_2.5.0           rmarkdown_2.5      magrittr_1.5      
[33] whisker_0.4        backports_1.1.10   promises_1.1.1     ellipsis_0.3.1    
[37] htmltools_0.5.0    httpuv_1.5.4       KernSmooth_2.23-17 stringi_1.5.3     
[41] crayon_1.3.4