MSFragger is an ultrafast database search tool for peptide identification in mass spectrometry-based proteomics. It has demonstrated excellent performance across a wide range of datasets and applications. Its speed makes it particularly suitable for the analysis of large datasets (including timsTOF data), for enzyme-unconstrained searches, and for ‘open’ database searches (with the precursor mass tolerance set to hundreds of Daltons) that identify modified peptides.
MSFragger is implemented in the cross-platform Java programming language and is available as a standalone JAR file. It is compatible with standard open file formats for mass spectrometry data (mzXML/mzML). It writes output in either tabular or pepXML format, making it fully compatible with downstream data analysis pipelines such as the Trans-Proteomic Pipeline and Philosopher.
If you have never downloaded MSFragger before, please complete steps 1-3. To upgrade to the most recent version from a previously downloaded version (JAR file), skip to step 3.
1. Complete the license agreement form.
2. Download the initial release of the MSFragger software using the instructions received by email.
3. Once you have obtained the MSFragger software, the latest version can be downloaded (under the same license terms as the original version) from the Upgrade site.
On Windows, the easiest way to run MSFragger is through the FragPipe GUI (Graphical User Interface).
FragPipe includes additional tools such as Philosopher (for downstream analysis with PeptideProphet and ProteinProphet), label-free quantification, FDR filtering, and report generation (at the PSM, ion, peptide, and protein levels). It also includes the DIA-Umpire SE module for DIA data and a SpectraST-based module for building spectral libraries.
To run MSFragger from the command line:
java -Xmx20g -jar <path to msfragger.jar file> <path to fragger.params file> <path to mzML/mzXML/MGF files>
-Xmx20g specifies the maximum memory assigned to the Java virtual machine. In this example, the maximum is 20 GB. Change this value to suit the memory available on your computer.
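When the search is launched from a script, the same command can be assembled as an argument list so that the memory limit and the input files are easy to vary. The following is a minimal sketch, not part of MSFragger: the function name build_msfragger_command and the file names in the example are hypothetical placeholders, and only the argument order shown in the command above is taken from this page.

```python
import shlex


def build_msfragger_command(jar_path, params_path, spectrum_files, max_memory_gb=20):
    """Return the MSFragger command as a list of arguments.

    Hypothetical helper: it only mirrors the documented layout
    java -Xmx<N>g -jar <jar> <params> <spectrum files...>.
    """
    if max_memory_gb < 1:
        raise ValueError("max_memory_gb must be at least 1")
    if not spectrum_files:
        raise ValueError("at least one spectrum file is required")
    return [
        "java",
        f"-Xmx{max_memory_gb}g",  # maximum JVM heap size, in gigabytes
        "-jar",
        str(jar_path),
        str(params_path),
        *[str(path) for path in spectrum_files],
    ]


if __name__ == "__main__":
    # Placeholder paths; substitute your own JAR, parameter, and data files.
    command = build_msfragger_command(
        "MSFragger.jar", "fragger.params", ["sample1.mzML", "sample2.mzML"]
    )
    print(shlex.join(command))
```

The returned list can be passed directly to subprocess.run(command, check=True), which avoids shell quoting problems with paths that contain spaces.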
Detailed command-line options can be displayed with:
java -jar <path to msfragger.jar file>
For more information on how to run MSFragger and the Philosopher tools from the command line, see the Tutorial.
When searching very large sequence databases, performing nonspecific searches, and/or specifying many variable modifications, it may be necessary to use the database splitting option in FragPipe. This option requires a Python installation. To use database splitting from the command line, download the Python script and run MSFragger with the following command:
python3 <path to msfragger_pep_split.py file> <num> "java -Xmx20g -jar" <path to msfragger.jar file> <path to fragger.params file> <path to mzML/mzXML/MGF files>
Replace <num> with the number of slices (e.g. 4) to split the database into, and adjust the maximum allowed memory as described above.
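Note that in the command above the quoted string "java -Xmx20g -jar" is passed to the script as a single argument. A minimal sketch of assembling that call programmatically is given below. The function name build_split_command and the example file names are hypothetical; the argument order follows the command shown on this page, and nothing is assumed about how msfragger_pep_split.py works internally.

```python
import shlex


def build_split_command(split_script, num_slices, jar_path, params_path,
                        spectrum_files, max_memory_gb=20):
    """Return the database-splitting command as a list of arguments.

    Hypothetical helper mirroring the documented layout:
    python3 <script> <num> "java -Xmx<N>g -jar" <jar> <params> <files...>
    """
    if num_slices < 1:
        raise ValueError("num_slices must be at least 1")
    if not spectrum_files:
        raise ValueError("at least one spectrum file is required")
    # The Java invocation prefix stays one argument, as in the quoted form.
    java_prefix = f"java -Xmx{max_memory_gb}g -jar"
    return [
        "python3",
        str(split_script),
        str(num_slices),
        java_prefix,
        str(jar_path),
        str(params_path),
        *[str(path) for path in spectrum_files],
    ]


if __name__ == "__main__":
    # Placeholder paths; substitute your own script, JAR, and data files.
    command = build_split_command(
        "msfragger_pep_split.py", 4, "MSFragger.jar", "fragger.params",
        ["sample1.mzML"],
    )
    print(shlex.join(command))
```

Keeping the arguments in a list preserves the quoted Java prefix as one argument when the list is handed to subprocess.run.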
The latest version of MSFragger was released on 2019-05-30. Check here for the full list of MSFragger versions and changes.
For documentation on MSFragger itself (hardware requirements, search parameters, etc.), see the MSFragger Documentation Wiki page.
Please post all questions and bug reports regarding MSFragger itself on the MSFragger GitHub page or, if more appropriate, on the FragPipe page or the Philosopher page.
If you would like to propose a new collaboration that can take advantage of MSFragger and related tools, please contact us directly.
Kong AT, Leprevost FV, Avtonomov DM, Mellacheruvu D, Nesvizhskii AI. MSFragger: ultrafast and comprehensive peptide identification in mass spectrometry-based proteomics. Nature Methods 14:513–520 (2017). Manuscript.
For other tools developed by the Nesvizhskii lab, visit our website: www.nesvilab.org