Merge pull request #404 from JoseEspinosa/updates
Switch from macs2 to macs3
JoseEspinosa authored Jul 23, 2024
2 parents 637deb5 + 7238956 commit 9fe5767
Showing 29 changed files with 684 additions and 149 deletions.
3 changes: 3 additions & 0 deletions CHANGELOG.md
@@ -27,6 +27,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- [[#385](https://github.com/nf-core/chipseq/issues/385)] - Fix `--save_unaligned` description in schema.
- [[PR #392](https://github.com/nf-core/chipseq/pull/392)] - Adding line numbers to warnings/errors messages in `bin/check_samplesheet.py`.
- [[#396](https://github.com/nf-core/chipseq/issues/396)] - Check that samplesheet samples IDs do only have alphanumeric characters, dots, dashes or underscores.
- [[#378](https://github.com/nf-core/chipseq/issues/378)] - Switch from macs2 to macs3.

### Software dependencies

@@ -35,6 +36,8 @@ Note, since the pipeline is now using Nextflow DSL2, each process will be run wi
| Dependency | Old version | New version |
| ---------- | ----------- | ----------- |
| `chromap` | 0.2.1 | 0.2.4 |
| `macs2` | 2.2.7.1 | |
| `macs3` | | 3.0.1 |
| `multiqc` | 1.13 | 1.14 |
| `picard` | 2.27.4 | 3.0.0 |
| `samtools` | 1.15.1 | 1.17 |
2 changes: 1 addition & 1 deletion CITATIONS.md
@@ -46,7 +46,7 @@

> Heinz S, Benner C, Spann N, Bertolino E, Lin YC, Laslo P, Cheng JX, Murre C, Singh H, Glass CK. Simple combinations of lineage-determining transcription factors prime cis-regulatory elements required for macrophage and B cell identities. Mol Cell. 2010 May 28;38(4):576-89. doi: 10.1016/j.molcel.2010.05.004. PubMed PMID: 20513432; PubMed Central PMCID: PMC2898526.
- [MACS2](https://www.ncbi.nlm.nih.gov/pubmed/18798982/)
- [MACS3](https://www.ncbi.nlm.nih.gov/pubmed/18798982/)

> Zhang Y, Liu T, Meyer CA, Eeckhoute J, Johnson DS, Bernstein BE, Nusbaum C, Myers RM, Brown M, Li W, Liu XS. Model-based analysis of ChIP-Seq (MACS). Genome Biol. 2008;9(9):R137. doi: 10.1186/gb-2008-9-9-r137. Epub 2008 Sep 17. PubMed PMID: 18798982; PubMed Central PMCID: PMC2592715.
2 changes: 1 addition & 1 deletion README.md
@@ -59,7 +59,7 @@ You can find numerous talks on the [nf-core events page](https://nf-co.re/events
5. Generate gene-body meta-profile from bigWig files ([`deepTools`](https://deeptools.readthedocs.io/en/develop/content/tools/plotProfile.html))
6. Calculate genome-wide IP enrichment relative to control ([`deepTools`](https://deeptools.readthedocs.io/en/develop/content/tools/plotFingerprint.html))
7. Calculate strand cross-correlation peak and ChIP-seq quality measures including NSC and RSC ([`phantompeakqualtools`](https://github.com/kundajelab/phantompeakqualtools))
8. Call broad/narrow peaks ([`MACS2`](https://github.com/macs3-project/MACS))
8. Call broad/narrow peaks ([`MACS3`](https://github.com/macs3-project/MACS))
9. Annotate peaks relative to gene features ([`HOMER`](http://homer.ucsd.edu/homer/download.html))
10. Create consensus peakset across all samples and create tabular file to aid in the filtering of the data ([`BEDTools`](https://github.com/arq5x/bedtools2/))
11. Count reads in consensus peaks ([`featureCounts`](http://bioinf.wehi.edu.au/featureCounts/))
4 changes: 2 additions & 2 deletions assets/multiqc/frip_score_header.txt
@@ -1,7 +1,7 @@
#id: 'frip_score'
#section_name: 'MERGED LIB: MACS2 FRiP score'
#section_name: 'MERGED LIB: MACS3 FRiP score'
#description: "is generated by calculating the fraction of all mapped reads that fall
# into the MACS2 called peak regions. A read must overlap a peak by at least 20% to be counted.
# into the MACS3 called peak regions. A read must overlap a peak by at least 20% to be counted.
# See <a href='https://www.encodeproject.org/data-standards/terms/' target='_blank'>FRiP score</a>."
#plot_type: 'bargraph'
#anchor: 'frip_score'
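The FRiP description in this header can be illustrated with a minimal Python sketch. It is not the pipeline's implementation (the config in this commit passes `-bed -c -f 0.20` to the `FRIP_SCORE` process); the function names, the half-open interval convention, and the choice to measure the 20% overlap relative to the read length are assumptions made for illustration.

```python
from typing import List, Tuple

# (chromosome, start, end) with a half-open end coordinate: an assumption of this sketch.
Interval = Tuple[str, int, int]


def overlap_fraction(read: Interval, peak: Interval) -> float:
    """Return the fraction of the read's length that is covered by the peak."""
    if read[0] != peak[0]:
        return 0.0
    overlap = min(read[2], peak[2]) - max(read[1], peak[1])
    read_length = read[2] - read[1]
    if overlap <= 0 or read_length <= 0:
        return 0.0
    return overlap / read_length


def frip_score(reads: List[Interval], peaks: List[Interval], min_fraction: float = 0.20) -> float:
    """Fraction of mapped reads overlapping any peak by at least `min_fraction`."""
    if not reads:
        return 0.0
    reads_in_peaks = sum(
        1
        for read in reads
        if any(overlap_fraction(read, peak) >= min_fraction for peak in peaks)
    )
    return reads_in_peaks / len(reads)


if __name__ == "__main__":
    example_reads = [("chr1", 0, 100), ("chr1", 90, 190), ("chr2", 0, 100), ("chr1", 500, 600)]
    example_peaks = [("chr1", 50, 120)]
    print(frip_score(example_reads, example_peaks))
```

The quadratic scan over reads and peaks is only suitable for toy inputs; real data would need an interval index or a tool such as BEDTools.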
4 changes: 2 additions & 2 deletions assets/multiqc/peak_count_header.txt
@@ -1,7 +1,7 @@
#id: 'peak_count'
#section_name: 'MERGED LIB: MACS2 peak count'
#section_name: 'MERGED LIB: MACS3 peak count'
#description: "is calculated from total number of peaks called by
# <a href='https://github.com/taoliu/MACS' target='_blank'>MACS2</a>"
# <a href='https://github.com/taoliu/MACS' target='_blank'>MACS3</a>"
#plot_type: 'bargraph'
#anchor: 'peak_count'
#pconfig:
2 changes: 1 addition & 1 deletion assets/multiqc_config.yml
@@ -72,7 +72,7 @@ module_order:
anchor: "mlib_featurecounts"
info: "This section of the report shows featureCounts results for the number of reads assigned to merged library consensus peaks."
path_filters:
- "./macs2/featurecounts/*.summary"
- "./macs3/featurecounts/*.summary"

report_section_order:
peak_count:
19 changes: 17 additions & 2 deletions assets/schema_input.json
@@ -33,15 +33,30 @@
}
]
},
"replicate": {
"type": "integer",
"errorMessage": "Replicate id not an integer!",
"meta": ["replicate"]
},
"antibody": {
"type": "string",
"pattern": "^\\S+$",
"errorMessage": "Antibody entry cannot contain spaces"
"errorMessage": "Antibody entry cannot contain spaces",
"dependentRequired": ["control"],
"meta": ["antibody"]
},
"control": {
"type": "string",
"pattern": "^\\S+$",
"errorMessage": "Control entry cannot contain spaces"
"errorMessage": "Control entry cannot contain spaces",
"dependentRequired": ["antibody", "control_replicate"],
"meta": ["control"]
},
"control_replicate": {
"type": "string",
"pattern": "^\\S+$",
"errorMessage": "Control entry cannot contain spaces",
"meta": ["control_replicate"]
}
},
"required": ["sample", "fastq_1"]
Expand Down
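The `dependentRequired` entries added to `assets/schema_input.json` express column dependencies: a row that sets `antibody` must also set `control`, and a row that sets `control` must also set `antibody` and `control_replicate`. The following Python sketch spells out that rule on a single samplesheet row. It is a hand-written illustration, not the validator the pipeline uses; the function name and the representation of a row as a dictionary of strings are assumptions.

```python
from typing import Dict, List

# Dependencies copied from the "dependentRequired" entries in the diff above.
DEPENDENT_REQUIRED: Dict[str, List[str]] = {
    "antibody": ["control"],
    "control": ["antibody", "control_replicate"],
}


def missing_dependencies(row: Dict[str, str]) -> List[str]:
    """Return one message per dependency that a samplesheet row fails to satisfy.

    Empty strings and absent keys are both treated as "not set" (an assumption
    of this sketch).
    """
    errors = []
    for field, required_fields in DEPENDENT_REQUIRED.items():
        if not row.get(field):
            continue  # the rule only applies when the field itself is set
        for dependency in required_fields:
            if not row.get(dependency):
                errors.append(f"'{field}' is set but '{dependency}' is missing")
    return errors


if __name__ == "__main__":
    row = {"sample": "SAMPLE", "fastq_1": "reads.fastq.gz", "antibody": "H3K27ac"}
    for message in missing_dependencies(row):
        print(message)
```

A row with neither `antibody` nor `control` passes unchanged, since both rules only apply when the corresponding column is filled in.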
10 changes: 5 additions & 5 deletions bin/macs2_merged_expand.py → bin/macs3_merged_expand.py
@@ -17,15 +17,15 @@
############################################

Description = "Add sample boolean files and aggregate columns from merged MACS narrow or broad peak file."
Epilog = """Example usage: python macs2_merged_expand.py <MERGED_INTERVAL_FILE> <SAMPLE_NAME_LIST> <OUTFILE> --is_narrow_peak --min_replicates 1"""
Epilog = """Example usage: python macs3_merged_expand.py <MERGED_INTERVAL_FILE> <SAMPLE_NAME_LIST> <OUTFILE> --is_narrow_peak --min_replicates 1"""

argParser = argparse.ArgumentParser(description=Description, epilog=Epilog)

## REQUIRED PARAMETERS
argParser.add_argument("MERGED_INTERVAL_FILE", help="Merged MACS2 interval file created using linux sort and mergeBed.")
argParser.add_argument("MERGED_INTERVAL_FILE", help="Merged MACS3 interval file created using linux sort and mergeBed.")
argParser.add_argument(
"SAMPLE_NAME_LIST",
help="Comma-separated list of sample names as named in individual MACS2 broadPeak/narrowPeak output file e.g. SAMPLE_R1 for SAMPLE_R1_peak_1.",
help="Comma-separated list of sample names as named in individual MACS3 broadPeak/narrowPeak output file e.g. SAMPLE_R1 for SAMPLE_R1_peak_1.",
)
argParser.add_argument("OUTFILE", help="Full path to output directory.")

@@ -76,7 +76,7 @@ def makedir(path):
## sort -k1,1 -k2,2n <MACS_NARROWPEAK_FILE_LIST> | mergeBed -c 2,3,4,5,6,7,8,9,10 -o collapse,collapse,collapse,collapse,collapse,collapse,collapse,collapse,collapse > merged_peaks.txt


def macs2_merged_expand(MergedIntervalTxtFile, SampleNameList, OutFile, isNarrow=False, minReplicates=1):
def macs3_merged_expand(MergedIntervalTxtFile, SampleNameList, OutFile, isNarrow=False, minReplicates=1):
makedir(os.path.dirname(OutFile))

combFreqDict = {}
@@ -208,7 +208,7 @@ def macs2_merged_expand(MergedIntervalTxtFile, SampleNameList, OutFile, isNarrow
############################################
############################################

macs2_merged_expand(
macs3_merged_expand(
MergedIntervalTxtFile=args.MERGED_INTERVAL_FILE,
SampleNameList=args.SAMPLE_NAME_LIST.split(","),
OutFile=args.OUTFILE,
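The help text for `SAMPLE_NAME_LIST` ties sample names to peak names (`SAMPLE_R1` for `SAMPLE_R1_peak_1`), and the merged interval file stores peak names as comma-collapsed columns. As a rough sketch of how per-sample boolean flags could be derived from one collapsed name column, consider the Python below. It is not the body of `macs3_merged_expand`, most of which is not shown in this diff; the helper names and the assumption that every peak name ends in `_peak_<number>` are illustrative only.

```python
import re
from typing import Dict, List

# Assumes peak names end in "_peak_<number>", as in the help text example.
_PEAK_SUFFIX = re.compile(r"_peak_\d+$")


def sample_from_peak_name(peak_name: str) -> str:
    """Strip the trailing "_peak_<number>" to recover the sample name."""
    return _PEAK_SUFFIX.sub("", peak_name)


def sample_flags(collapsed_peak_names: str, sample_names: List[str]) -> Dict[str, bool]:
    """Map each sample to whether it contributed a peak to a merged interval.

    `collapsed_peak_names` is one comma-separated cell, as produced by
    collapsing the peak name column when merging intervals.
    """
    present = {sample_from_peak_name(name) for name in collapsed_peak_names.split(",")}
    return {sample: sample in present for sample in sample_names}


if __name__ == "__main__":
    samples = "SAMPLE_R1,SAMPLE_R2,OTHER_R1".split(",")
    print(sample_flags("SAMPLE_R1_peak_3,OTHER_R1_peak_7", samples))
```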
2 changes: 1 addition & 1 deletion bin/plot_macs2_qc.r → bin/plot_macs3_qc.r
@@ -20,7 +20,7 @@ library(scales)
option_list <- list(make_option(c("-i", "--peak_files"), type="character", default=NULL, help="Comma-separated list of peak files.", metavar="path"),
make_option(c("-s", "--sample_ids"), type="character", default=NULL, help="Comma-separated list of sample ids associated with peak files. Must be unique and in same order as peaks files input.", metavar="string"),
make_option(c("-o", "--outdir"), type="character", default='./', help="Output directory", metavar="path"),
make_option(c("-p", "--outprefix"), type="character", default='macs2_peakqc', help="Output prefix", metavar="string"))
make_option(c("-p", "--outprefix"), type="character", default='macs3_peakqc', help="Output prefix", metavar="string"))

opt_parser <- OptionParser(option_list=option_list)
opt <- parse_args(opt_parser)
34 changes: 17 additions & 17 deletions conf/modules.config
@@ -540,7 +540,7 @@ if (!params.skip_plot_fingerprint) {
}

process {
withName: 'MACS2_CALLPEAK' {
withName: 'MACS3_CALLPEAK' {
ext.args = [
'--keep-dup all',
params.narrow_peak ? '' : "--broad --broad-cutoff ${params.broad_cutoff}",
@@ -550,7 +550,7 @@ process {
params.aligner == "chromap" ? "--format BAM" : '' //TODO check if not needed anymore with new chromap versions
].join(' ').trim()
publishDir = [
path: { "${params.outdir}/${params.aligner}/merged_library/macs2/${params.narrow_peak ? 'narrow_peak' : 'broad_peak'}" },
path: { "${params.outdir}/${params.aligner}/merged_library/macs3/${params.narrow_peak ? 'narrow_peak' : 'broad_peak'}" },
mode: params.publish_dir_mode,
saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
]
@@ -559,14 +559,14 @@ process {
withName: 'FRIP_SCORE' {
ext.args = '-bed -c -f 0.20'
publishDir = [
path: { "${params.outdir}/${params.aligner}/merged_library/macs2/${params.narrow_peak ? 'narrow_peak' : 'broad_peak'}/qc" },
path: { "${params.outdir}/${params.aligner}/merged_library/macs3/${params.narrow_peak ? 'narrow_peak' : 'broad_peak'}/qc" },
enabled: false
]
}

withName: 'MULTIQC_CUSTOM_PEAKS' {
publishDir = [
path: { "${params.outdir}/${params.aligner}/merged_library/macs2/${params.narrow_peak ? 'narrow_peak' : 'broad_peak'}/qc" },
path: { "${params.outdir}/${params.aligner}/merged_library/macs3/${params.narrow_peak ? 'narrow_peak' : 'broad_peak'}/qc" },
mode: params.publish_dir_mode,
saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
]
@@ -575,11 +575,11 @@ process {

if (!params.skip_peak_annotation) {
process {
withName: '.*:BAM_PEAKS_CALL_QC_ANNOTATE_MACS2_HOMER:HOMER_ANNOTATEPEAKS' {
withName: '.*:BAM_PEAKS_CALL_QC_ANNOTATE_MACS3_HOMER:HOMER_ANNOTATEPEAKS' {
ext.args = '-gid'
ext.prefix = { "${meta.id}_peaks" }
publishDir = [
path: { "${params.outdir}/${params.aligner}/merged_library/macs2/${params.narrow_peak ? 'narrow_peak' : 'broad_peak'}" },
path: { "${params.outdir}/${params.aligner}/merged_library/macs3/${params.narrow_peak ? 'narrow_peak' : 'broad_peak'}" },
mode: params.publish_dir_mode,
saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
]
@@ -588,20 +588,20 @@ if (!params.skip_peak_annotation) {

if (!params.skip_peak_qc) {
process {
withName: 'PLOT_MACS2_QC' {
ext.args = '-o ./ -p macs2_peak'
withName: 'PLOT_MACS3_QC' {
ext.args = '-o ./ -p macs3_peak'
publishDir = [
path: { "${params.outdir}/${params.aligner}/merged_library/macs2/${params.narrow_peak ? 'narrow_peak' : 'broad_peak'}/qc" },
path: { "${params.outdir}/${params.aligner}/merged_library/macs3/${params.narrow_peak ? 'narrow_peak' : 'broad_peak'}/qc" },
mode: params.publish_dir_mode,
saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
]
}

withName: 'PLOT_HOMER_ANNOTATEPEAKS' {
ext.args = '-o ./'
ext.prefix = 'macs2_annotatePeaks'
ext.prefix = 'macs3_annotatePeaks'
publishDir = [
path: { "${params.outdir}/${params.aligner}/merged_library/macs2/${params.narrow_peak ? 'narrow_peak' : 'broad_peak'}/qc" },
path: { "${params.outdir}/${params.aligner}/merged_library/macs3/${params.narrow_peak ? 'narrow_peak' : 'broad_peak'}/qc" },
mode: params.publish_dir_mode,
saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
]
@@ -612,11 +612,11 @@ if (!params.skip_peak_annotation) {

if (!params.skip_consensus_peaks) {
process {
withName: 'MACS2_CONSENSUS' {
withName: 'MACS3_CONSENSUS' {
ext.when = { meta.multiple_groups || meta.replicates_exist }
ext.prefix = { "${meta.id}.consensus_peaks" }
publishDir = [
path: { "${params.outdir}/${params.aligner}/merged_library/macs2/${params.narrow_peak ? 'narrow_peak' : 'broad_peak'}/consensus/${meta.id}" },
path: { "${params.outdir}/${params.aligner}/merged_library/macs3/${params.narrow_peak ? 'narrow_peak' : 'broad_peak'}/consensus/${meta.id}" },
mode: params.publish_dir_mode,
saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
]
@@ -626,7 +626,7 @@ if (!params.skip_consensus_peaks) {
ext.args = '-F SAF -O --fracOverlap 0.2'
ext.prefix = { "${meta.id}.consensus_peaks" }
publishDir = [
path: { "${params.outdir}/${params.aligner}/merged_library/macs2/${params.narrow_peak ? 'narrow_peak' : 'broad_peak'}/consensus/${meta.id}" },
path: { "${params.outdir}/${params.aligner}/merged_library/macs3/${params.narrow_peak ? 'narrow_peak' : 'broad_peak'}/consensus/${meta.id}" },
mode: params.publish_dir_mode,
saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
]
@@ -639,7 +639,7 @@ if (!params.skip_consensus_peaks) {
ext.args = '-gid'
ext.prefix = { "${meta.id}.consensus_peaks" }
publishDir = [
path: { "${params.outdir}/${params.aligner}/merged_library/macs2/${params.narrow_peak ? 'narrow_peak' : 'broad_peak'}/consensus/${meta.id}" },
path: { "${params.outdir}/${params.aligner}/merged_library/macs3/${params.narrow_peak ? 'narrow_peak' : 'broad_peak'}/consensus/${meta.id}" },
mode: params.publish_dir_mode,
saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
]
@@ -648,7 +648,7 @@ if (!params.skip_consensus_peaks) {
withName: 'ANNOTATE_BOOLEAN_PEAKS' {
ext.prefix = { "${meta.id}.consensus_peaks" }
publishDir = [
path: { "${params.outdir}/${params.aligner}/merged_library/macs2/${params.narrow_peak ? 'narrow_peak' : 'broad_peak'}/consensus/${meta.id}" },
path: { "${params.outdir}/${params.aligner}/merged_library/macs3/${params.narrow_peak ? 'narrow_peak' : 'broad_peak'}/consensus/${meta.id}" },
mode: params.publish_dir_mode,
saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
]
@@ -668,7 +668,7 @@ if (!params.skip_consensus_peaks) {
].join(' ').trim()
ext.prefix = { "${meta.id}.consensus_peaks" }
publishDir = [
path: { "${params.outdir}/${params.aligner}/merged_library/macs2/${params.narrow_peak ? 'narrow_peak' : 'broad_peak'}/consensus/${meta.id}/deseq2" },
path: { "${params.outdir}/${params.aligner}/merged_library/macs3/${params.narrow_peak ? 'narrow_peak' : 'broad_peak'}/consensus/${meta.id}/deseq2" },
mode: params.publish_dir_mode,
saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
]
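Every `publishDir` path touched in `conf/modules.config` follows the same pattern, so the practical effect of this commit for users is that results move from a `macs2` to a `macs3` directory. The Python sketch below mirrors that path composition for anyone updating downstream scripts. The function name and the `subdir` argument are hypothetical; only the path layout is taken from the diff.

```python
def macs3_publish_dir(outdir: str, aligner: str, narrow_peak: bool, subdir: str = "") -> str:
    """Compose the output directory used by the MACS3-related publishDir entries.

    Mirrors "${params.outdir}/${params.aligner}/merged_library/macs3/<peak_type>"
    from conf/modules.config; `subdir` stands in for suffixes such as "qc".
    """
    peak_type = "narrow_peak" if narrow_peak else "broad_peak"
    base = f"{outdir}/{aligner}/merged_library/macs3/{peak_type}"
    return f"{base}/{subdir}" if subdir else base


if __name__ == "__main__":
    print(macs3_publish_dir("results", "chromap", narrow_peak=True))
    print(macs3_publish_dir("results", "chromap", narrow_peak=False, subdir="qc"))
```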