Commit

Merge branch 'develop' into prerelease
SeanGolez committed Oct 3, 2024
2 parents ec1ae70 + b0db6f1 commit 64d9cce
Showing 164 changed files with 718,360 additions and 266,530 deletions.
1 change: 1 addition & 0 deletions .github/workflows/cppCI.yml
Original file line number Diff line number Diff line change
@@ -41,3 +41,4 @@ jobs:
run: |
  cd build
  ./pepsirf_test
  cp ../test/test_demux_output.tsv ../
18 changes: 18 additions & 0 deletions CHANGELOG.md
@@ -6,6 +6,24 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## [Unreleased]

## [1.7.0] - 2024-10-03

- #254, added the ability to run PepSIRF as a Docker image and added a page for instructions
- #197, resolved CMake not locating OpenMP on MacOS. Tutorial for fix added to installation page.
- #236, extended the "-i" option in Subjoin to accept a regex pattern instead of the name of a file containing sample/peptide names. Sample/peptide names from the score matrix file are kept only if they contain a match for the regex pattern.
- #234, added "--unmapped-reads-output" option to Demux, which writes all reads that have not been mapped to a sample/peptide to the specified filename.
- #233, changed Deconv "-t" option to accept a tab-delimited file with one column of TaxIDs and one column giving the score threshold to use for each TaxID. The original functionality still holds: if a number is provided with the option, each TaxID will use that score threshold.
- #227, Demux outputs additional information about the total number of samples, the number of samples containing a given number of replicates, and the number of samples starting with "Sblk_". The replicate information will be written to the file provided with the option "--replicate_info".
- #223, Added "--exclude" option to Subjoin that changes the output data file to contain all of the input samples/peptides except the ones specified by the user.
- #221, Demux automatically truncates sequences in the library which are longer than the length provided through the "--seq" option. If a sequence is found to be shorter than the specified length, an error is thrown.
- #218, Added "--custom_id_name_map_info" option to Deconv which accepts a filename, the key column header, and the value column header in the file to use to link TaxIDs to taxon names. This option should be used instead of "--id_name_map" if the user wishes to define a tab-delimited ID name map.
- #210, Fixes crash in Link when a species does not have an associated ID. A single warning is logged which informs the user some species have not been considered and where to find a list of those species which should be reviewed.
- #152, Automated tests have been completed to cover all recently added features and fixed issues in PepSIRF.
- #131, Provides more information in Enrich's failed enrichment output. Sample replicates which do not meet either threshold are identified in the output and are marked as either not meeting the minimum or maximum threshold.
- #56, Alters behavior of Demux when run in reference-independent mode. In ref-independent mode, index toggling is turned off; therefore, if an exact match at the given index is not found, the read is discarded.
- #2, Adds a system to handle logging PepSIRF's progress when running. A default file name is automatically generated with the module name, current time and date. The '--logfile' option allows the user to provide a custom name for the log file.
- #36, Standardizes the order in which tied species are listed in Deconv output. If species names are provided, then the tied species are sorted alphabetically by their names; otherwise, they are sorted by their species ID.

## [1.6.0]
- #169, Added an option for FASTQ-level outputs to be generated by demux. This is done with the flag "-q" followed by a directory path where files will be generated
- #178, in the case of a sample not having enriched peptides, enrich will now add a space to the empty file. This allows for better compatibility with deconv through Qiime2.
3 changes: 3 additions & 0 deletions CMakeLists.txt
@@ -3,6 +3,8 @@ project(PepSIRF LANGUAGES CXX)

set(CMAKE_EXPORT_COMPILE_COMMANDS ON)

cmake_policy(SET CMP0074 NEW)

set(Boost_USE_STATIC_LIBS OFF)
set(Boost_USE_MULTITHREADED ON)
set(Boost_USE_STATIC_RUNTIME OFF)
@@ -78,6 +80,7 @@ add_executable(pepsirf src/main.cpp)

# Library common to all modules
add_library(pepsirf_common STATIC src/modules/parsers/options_parser.cpp src/modules/core/options.cpp
src/modules/core/logger.cpp
src/modules/core/sequence.cpp
src/modules/core/file_io.cpp
src/modules/parsers/fastq_parser.cpp src/modules/parsers/fasta_parser.cpp
29 changes: 29 additions & 0 deletions CONTRIBUTING.md
@@ -82,6 +82,17 @@ After the branch that held the feature has been merged, it may be deleted.
After merging the branch, a note should be added to the [CHANGELOG.md](CHANGELOG.md) file under the
```Unreleased``` section giving a brief overview of the changes that were made.

# Merging into prerelease
Prerelease is a branch for features that have been tested by developers but have
not yet been tested sufficiently by users. When a new release is feature-complete but needs
more testing before an official release, it may be merged into prerelease with an updated
version number of "Unreleased". No official release is created at this point.

```
git checkout prerelease
git merge develop --no-ff
```

# Updating version number, creating a release
After one or more features have been implemented as described above, an official
version may be released. The version number of the new release should be created as
@@ -111,8 +122,26 @@ git merge develop --no-ff
git tag -a PEPSIRF_VERSION
git push
git push --follow-tags
git checkout prerelease
git merge master
git checkout develop
git merge master
```

If prerelease has features that must be merged into develop, follow these steps:
```
git checkout master
git merge prerelease --no-ff
git tag -a PEPSIRF_VERSION
git push
git push --follow-tags
git checkout prerelease
git merge master
git checkout develop   # *
git merge master       # *
```

*If develop has advanced beyond the point of prerelease, do not include the lines marked with an asterisk.


The new version of PepSIRF has been officially released!
25 changes: 25 additions & 0 deletions Dockerfile
@@ -0,0 +1,25 @@
# Get the base Ubuntu image from Docker Hub
FROM ubuntu:latest

# Specify the working directory
WORKDIR /app

# Install dependencies on the base image
RUN apt-get -y update && apt-get install -y \
g++ \
libboost-all-dev \
cmake

# Copy the current folder, which contains the C++ source code, to the Docker image under /app
COPY . /app

RUN rm -rf build

# Make the build script executable and run it to compile PepSIRF
RUN chmod +x build.sh && ./build.sh

# Make sure the working directory is /app so that output files go there
# (a `RUN cd` does not persist beyond its own layer)
WORKDIR /app

# Run the output program from the previous step
ENTRYPOINT ["./build/pepsirf"]
2 changes: 1 addition & 1 deletion README.md
@@ -4,7 +4,7 @@

## GPL-3.0-or-later

### Current Version: v1.5.1
### Current Version: v1.6.0

Visit our [GitHub Pages website](https://ladnerlab.github.io/PepSIRF/)

6 changes: 6 additions & 0 deletions docker-compose.yml
@@ -0,0 +1,6 @@
services:
  pepsirf:
    image: pepsirf
    build:
      context: .
      dockerfile: Dockerfile
2 changes: 1 addition & 1 deletion docs/1-index.md
@@ -7,7 +7,7 @@ permalink: /
<img src="./assets/images/PepSIRF_logo_BW.png" alt="" width="1024">


### Current Version: v1.4.0
### Current Version: v1.6.0

### Please cite:
[https://arxiv.org/abs/2007.05050](https://arxiv.org/abs/2007.05050)
15 changes: 15 additions & 0 deletions docs/2-installation.md
@@ -24,6 +24,21 @@ chmod +x build.sh
```
Both of these options will create a ```pepsirf``` executable in the build directory.

### If OpenMP is not found on MacOS
First, ensure it is installed:
```
brew install libomp
```
Then, open ```~/.zshrc``` and add ```export OpenMP_ROOT=$(brew --prefix)/opt/libomp``` manually, or run:
```
echo "export OpenMP_ROOT=$(brew --prefix)/opt/libomp" >> ~/.zshrc
```
Finally, run:
```
source ~/.zshrc
```
Then recompile PepSIRF.

### Running Tests
```
mkdir build
30 changes: 30 additions & 0 deletions docs/3-docker.md
@@ -0,0 +1,30 @@
---
layout: default
title: Docker
permalink: /docker/
---

### Building the container

```
docker-compose up --build
```

**Expected Error**:

```Error: Invalid module name entered```

This only means that the pepsirf command was run without arguments. You can ignore it, since you will provide arguments when running the image.

### Running the PepSIRF image

```
docker run --mount type=bind,src=<path/to/local/directory>,target=/app/<new_directory> \
pepsirf [ --help | module_name <module_args*> ]
```

**Note:**
- Replace <path/to/local/directory> with the actual path to your local directory.
- Replace <new_directory> with the desired name for the directory in the container.
- If you are specifying a file (for reading or writing) from the source directory, use /app/<new_directory>/<file>
- Make sure that Docker is granted permission for file sharing with the local directory.
File renamed without changes.
97 changes: 94 additions & 3 deletions docs/5-changelog.md
@@ -6,11 +6,102 @@ permalink: /changelog/

# Changelog

## Unreleased

## 1.7.0 | 2024-10-03

<strong>Docker: added new feature (Issue #254).</strong> Added the ability to run PepSIRF as a Docker image and added a page for instructions.

<strong>CMakelists: bug fix (Issue #197).</strong> Resolved CMake not locating OpenMP on MacOS. Tutorial for fix added to installation page.

<strong>Subjoin: added new feature (Issue #236).</strong> Extended the "-i" option in Subjoin to accept a regex pattern instead of the name of a file containing sample/peptide names. Sample/peptide names from the score matrix file are kept only if they contain a match for the regex pattern.
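
The filtering rule described for "-i" can be illustrated with a short Python sketch. This is not PepSIRF's C++ implementation: `filter_names` and the sample names are hypothetical, and the regex dialect PepSIRF accepts may differ from Python's.

```python
import re

def filter_names(names, pattern):
    """Keep only the names that contain a match for the regex pattern."""
    regex = re.compile(pattern)
    # re.search matches anywhere in the name, giving "contains" semantics.
    return [name for name in names if regex.search(name)]

# Hypothetical sample names taken from a score matrix header.
samples = ["Sblk_1", "Sblk_2", "PatientA_1", "PatientB_1"]
blanks = filter_names(samples, r"^Sblk_\d+$")
patients = filter_names(samples, "Patient")
```

Anchoring the pattern with `^` and `$` turns the "contains" check into a whole-name match when that is what is wanted.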

<strong>Demux: added new feature (Issue #234).</strong> Added "--unmapped-reads-output" option to Demux, which writes all reads that have not been mapped to a sample/peptide to the specified filename.

<strong>Deconv: added new feature (Issue #233).</strong> Changed Deconv "-t" option to accept a tab-delimited file with one column of TaxIDs and one column giving the score threshold to use for each TaxID. The original functionality still holds: if a number is provided with the option, each TaxID will use that score threshold.
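
As a rough sketch of the two modes of "-t", the Python below resolves either a single number or per-TaxID lines into a threshold map. The function names are hypothetical, and the file layout (two tab-separated columns, with any non-numeric row treated as a header) is an assumption rather than PepSIRF's documented format.

```python
def thresholds_from_lines(lines):
    """Parse 'TaxID<TAB>threshold' lines into a dict."""
    thresholds = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        tax_id, score = fields
        try:
            thresholds[tax_id] = float(score)
        except ValueError:
            continue  # assumption: a non-numeric score marks a header row
    return thresholds

def resolve_thresholds(arg, tax_ids):
    """Return {tax_id: threshold} from a number or a tab-delimited file path."""
    try:
        value = float(arg)  # original behavior: one number for every TaxID
    except ValueError:
        with open(arg) as handle:
            return thresholds_from_lines(handle)
    return {tax_id: value for tax_id in tax_ids}
```

For example, `resolve_thresholds("40", ["11676", "10298"])` assigns 40.0 to both IDs, while a file path falls through to the per-TaxID parser.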

<strong>Demux: added new feature (Issue #227).</strong> Demux outputs additional information about the total number of samples, the number of samples containing a given number of replicates, and the number of samples starting with "Sblk_". The replicate information will be written to the file provided with the option "--replicate_info".

<strong>Subjoin: added new feature (Issue #223).</strong> Added "--exclude" option to Subjoin that changes the output data file to contain all of the input samples/peptides except the ones specified by the user.

<strong>Demux: added new feature (Issue #221).</strong> Demux automatically truncates sequences in the library which are longer than the length provided through the "--seq" option. If a sequence is found to be shorter than the specified length, an error is thrown.
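
A minimal Python sketch of the truncate-or-fail rule follows. `fit_to_length` is a hypothetical helper, and keeping the leading characters when truncating is an assumption; the entry above does not say which end Demux trims.

```python
def fit_to_length(name, sequence, length):
    """Truncate a library sequence to `length`, or fail if it is too short."""
    if len(sequence) < length:
        raise ValueError(
            f"{name}: sequence length {len(sequence)} is shorter than {length}"
        )
    # Assumption: the first `length` characters are the ones that are kept.
    return sequence[:length]

trimmed = fit_to_length("pep1", "ACGTAC", 4)
```

Raising on short sequences, rather than padding them, mirrors the entry's statement that an error is thrown.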

<strong>Deconv: added new feature (Issue #218).</strong> Added "--custom_id_name_map_info" option to Deconv which accepts a filename, the key column header, and the value column header in the file to use to link TaxIDs to taxon names. This option should be used instead of "--id_name_map" if the user wishes to define a tab-delimited ID name map.

<strong>Link: bug fix (Issue #210).</strong> Fixes crash in Link when a species does not have an associated ID. A single warning is logged which informs the user some species have not been considered and where to find a list of those species which should be reviewed.

<strong>Test: added new feature (Issue #152).</strong> Automated tests have been completed to cover all recently added features and fixed issues in PepSIRF.

<strong>Enrich: added new feature (Issue #131).</strong> Provides more information in Enrich's failed enrichment output. Sample replicates which do not meet either threshold are identified in the output and are marked as either not meeting the minimum or maximum threshold.

<strong>Demux: added new feature (Issue #56).</strong> Alters behavior of Demux when run in reference-independent mode. In ref-independent mode, index toggling is turned off; therefore, if an exact match at the given index is not found, the read is discarded.

<strong>Logger: added new feature (Issue #2).</strong> Adds a system to handle logging PepSIRF's progress when running. A default file name is automatically generated with the module name, current time and date. The '--logfile' option allows the user to provide a custom name for the log file.
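
The default-versus-custom naming rule might look roughly like the Python sketch below. The helper names, the `.log` extension, and the timestamp layout are all assumptions; the entry only states that the default name combines the module name with the current time and date.

```python
from datetime import datetime

def default_logfile_name(module, now=None):
    """Build a default log name from the module name and a timestamp."""
    now = now or datetime.now()
    # Assumed layout: <module>_<YYYY-MM-DD>_<HH-MM-SS>.log
    return f"{module}_{now.strftime('%Y-%m-%d_%H-%M-%S')}.log"

def resolve_logfile(module, logfile=None, now=None):
    """Prefer a user-supplied name, else fall back to the default."""
    return logfile if logfile else default_logfile_name(module, now)

example = default_logfile_name("demux", datetime(2024, 10, 3, 9, 5, 0))
```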

<strong>Deconv: added new feature (Issue #36).</strong> Standardizes the order in which tied species are listed in Deconv output. If species names are provided, then the tied species are sorted alphabetically by their names; otherwise, they are sorted by their species ID.
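
The tie-ordering rule can be sketched in a few lines of Python. `order_tied_species` is a hypothetical helper; it assumes integer species IDs and, when names are supplied, that every tied ID has a name. The IDs and names below are illustrative only.

```python
def order_tied_species(species_ids, id_to_name=None):
    """Order tied species by name when names exist, otherwise by ID."""
    if id_to_name:
        # Names were provided: sort alphabetically by species name.
        return sorted(species_ids, key=lambda sid: id_to_name[sid])
    # No names: fall back to sorting by species ID.
    return sorted(species_ids)

tied = [11676, 10298, 10376]
names = {11676: "HIV-1", 10298: "HSV-1", 10376: "EBV"}
by_name = order_tied_species(tied, names)
by_id = order_tied_species(tied)
```

Either way the output no longer depends on the order in which the tied species were encountered, which is the point of the change.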


## 1.6.0 | 2023-04-04

Version 1.6.0 adds several new features.

## New Features:

<strong>Demux: added new feature (Issue #169).</strong> Added an option for FASTQ-level outputs to be generated by demux. This is done with the flag "-q" followed by a directory path where files will be generated.

<strong>Enrich: added new feature (Issue #178).</strong> In the case of a sample not having enriched peptides, enrich will now add a space to the empty file. This allows for better compatibility with deconv through Qiime2.

<strong>Enrich: added new feature (Issue #137).</strong> Added an option for enrich to drop replicates with low raw read counts. This is done with the flag "-l" or "--low_raw_reads". If this functionality is invoked, dropped replicates will not be considered in the enrichment process, and the dropped replicates will be reported in the enrichment failure reasons file under "Removed Replicates": each line will contain the replicates removed from a sample.

<strong>Enrich: added new feature (Issue #131).</strong> Enrich now reports which replicates caused a raw read count threshold failure; and identifies if a replicate failed the maximum or minimum threshold.

<strong>Deconv: added new feature (Issue #161).</strong> Added a flag to deconv that allows the user to specify what string is expected at the end of each file containing enriched peptides (set to "\_enriched.txt" by default). If a file name does not end in the specified string, deconv skips over that file.

<strong>Info: added new feature (Issue #149).</strong> Added feature to info that generates a matrix of average counts given replicates. Two new flags must be included in order to use this feature: --rep_names and --get_avgs. --rep_names requires an input file with the names of the replicates that the user wants to generate a matrix of average counts for. --get_avgs requires an output file name where the matrix will be stored.


## 1.5.1 | 2022-09-10

Version 1.5.1 fixes a bug and adds a feature.

### New Features:

<strong>Enrich: added new feature (Issue #154).</strong> Altered behavior of enrich to produce blank sample file output for samples that failed enrichment.

### Bug Fixes:

<strong>Demux: bug fix (Issue #168).</strong> Fixed a bug introduced in release 1.5, where amino acid level output was overwritten with peptide level output. This no longer occurs.


## 1.5.0 | 2022-06-02

Version 1.5.0 adds multiple features and removes OMP support for Clang compilation.

### New Features:

<strong>Demux: added new feature (Issue #35).</strong> If samplenames or index name sets have duplicates in samplelist file, then those duplicates will be output to the terminal.

<strong>Demux: added new feature (Issue #57).</strong> Demux now has an additional option for providing a tab-delimited file with 5 ordered columns: 1) index name, which should correspond to a header name in the sample sheet, 2) read name, which should be either "r1" or "r2" to specify whether the index is in "--input_r1" or "--input_r2", 3) index start location (0-based, inclusive), 4) index length and 5) number of mismatches to allow. Note: the last three columns correspond to the info currently provided on the command line with "--f_index" and "--r_index" (or "--index1" and "--index2", with recent changes). With this feature, the demux module can now analyze an arbitrary number of indexes to be found in r1 or r2 input sequences.

<strong>Demux: added new feature (Issue #57).</strong> Demux output diagnostics may now provide more index matches for flexibility with demux changes in #57.

<strong>Demux: added new feature (Issue #138).</strong> Demux now automatically removes reference duplicates when running in a reference dependent mode.

<strong>Zscore: added new feature (Issue #105).</strong> A check has been added that verifies the bins provided to the Z score module. It is no longer possible to run the Z score module with the wrong set of bins.

<strong>CMakelists: recognized issue with clang (Issue #162).</strong> Removed threading support on MacOS.

### Bug Fixes:

<strong>Demux: added bug fix (Issue #156).</strong> Solved memory race condition in demux created during development of this release.

<strong>Demux: added bug fix (Issue #163).</strong> Solved memory race condition in demux that created incorrect counts.


## 1.4.0 | 2021-07-09

Version 1.4.0 adds multiple features and one bug fix for s_enrich, p_enrich, and link. CMakelists has been updated and a new module ‘enrich’ has been introduced.


### New Features:

<strong>Module added: enrich (Issue #114).</strong> The p_enrich module was altered to allow for flexibility in the number of replicates for each sample and renamed ‘enrich’. This new module can now provide the functionality of both s_enrich and p_enrich, and therefore, these two modules will no longer be available. Additionally, this module is able to handle >2 replicates.
@@ -19,11 +110,11 @@ Version 1.4.0 adds multiple features and one bug fix for s_enrich, p_enrich, and

<strong>CMakelists: Big Sur support (Issue #117).</strong> ‘-Xpreprocessor’ has been added to the command setting CMake C++ flags in order to support compilation on Mac OS Big Sur.


### Bug Fixes:

<strong>Link: Issue #116.</strong> A vague and system-dependent error occurred when --protein_file sequence names were not found in the --meta file. Modifications have been made to properly handle this situation and provide a clear and consistent error message.


## 1.3.7 | 2021-06-28
Version 1.3.7 adds one feature and one bug fix to norm.

@@ -33,10 +124,10 @@ Version 1.3.7 adds one feature and one bug fix to norm.

<strong>Norm: Issue #104.</strong> The norm module help message for option (--peptide_score, -p) has been updated.


## 1.3.6 | 2021-06-09
Version 1.3.6 adds several features and fixes several issues in demux, zscore, and subjoin.


### New Features:

<strong>Demux: new warning (Issue #96).</strong> The module now includes a warning for the user when index names from the (--samplelist, -s) file are not included in the index fasta file (--index, -i).
8 changes: 4 additions & 4 deletions docs/Gemfile.lock
@@ -14,7 +14,7 @@ GEM
execjs
coffee-script-source (1.11.1)
colorator (1.1.0)
commonmarker (0.23.4)
commonmarker (0.23.9)
concurrent-ruby (1.1.10)
dnsruby (1.61.9)
simpleidn (~> 0.1)
@@ -225,14 +225,14 @@ GEM
rb-fsevent (~> 0.10, >= 0.10.3)
rb-inotify (~> 0.9, >= 0.9.10)
mercenary (0.3.6)
mini_portile2 (2.8.0)
mini_portile2 (2.8.1)
minima (2.5.1)
jekyll (>= 3.5, < 5.0)
jekyll-feed (~> 0.9)
jekyll-seo-tag (~> 2.1)
minitest (5.15.0)
multipart-post (2.1.1)
nokogiri (1.13.10)
nokogiri (1.14.3)
mini_portile2 (~> 2.8.0)
racc (~> 1.4)
octokit (4.22.0)
@@ -241,7 +241,7 @@ GEM
pathutil (0.16.2)
forwardable-extended (~> 2.6)
public_suffix (4.0.7)
racc (1.6.1)
racc (1.6.2)
rb-fsevent (0.11.1)
rb-inotify (0.10.1)
ffi (~> 1.0)