EncyclopeDIA / Home

Data-independent acquisition (DIA) mass spectrometry is a powerful technique that is improving the reproducibility and throughput of proteomics studies. EncyclopeDIA is a library search engine comprising several algorithms for DIA data analysis, and it can search for peptides using either DDA-based spectrum libraries or DIA-based chromatogram libraries. Check out our manuscript describing EncyclopeDIA in Nature Communications (Searle et al, 2018) for more information. EncyclopeDIA contains Walnut, an implementation of the PECAN (Ting et al, 2017) scoring system, to enable chromatogram library generation from FASTA protein sequence databases when spectrum libraries are unavailable. EncyclopeDIA also supports Prosit, a deep learning tool for generating peptide fragmentation spectra, as described in our new methods paper (Searle et al, 2020). Finally, EncyclopeDIA contains Thesaurus for localizing and quantifying PTMs in DIA experiments (Searle et al, 2019) and Scribe for analyzing DDA data with libraries.

How do I get EncyclopeDIA?

EncyclopeDIA is open source under the Apache 2 licence, which means you can do what you like with the software, as long as you include the required notices. You can download the latest stable version or look at the manual.

Note: EncyclopeDIA and associated tools currently require Java 16 or lower. We recommend using Java 8 for long-term support and stability.
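
As a rough illustration of the version note above, the following sketch checks a Java version string against the "Java 16 or lower" limit. It is not part of EncyclopeDIA; the function names and the `MAX_SUPPORTED_MAJOR` constant are hypothetical and exist only for this example. It relies on the standard Java convention that releases before Java 9 report themselves as "1.x" (so Java 8 is "1.8.0_...").

```python
import re

# Assumption for this sketch: the upper bound comes from the note above
# ("Java 16 or lower"); it is not read from EncyclopeDIA itself.
MAX_SUPPORTED_MAJOR = 16


def java_major_version(version: str) -> int:
    """Return the major Java version from a version string.

    Handles the legacy "1.x" scheme (e.g. "1.8.0_292" is Java 8) and the
    scheme used since Java 9 (e.g. "16.0.1" is Java 16).
    """
    parts = re.split(r"[._\-+]", version.strip())
    first = int(parts[0])
    if first == 1 and len(parts) > 1:
        # Legacy scheme: the second component is the major version.
        return int(parts[1])
    return first


def is_supported(version: str) -> bool:
    """True if the version falls within the limit stated in the note."""
    return java_major_version(version) <= MAX_SUPPORTED_MAJOR
```

The version string itself can be read from the quoted value that `java -version` prints, for example "1.8.0_292" or "16.0.1".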

Stable version | Release date | Major changes (see changelog for full details)
encyclopedia-2.12.30 | 2022-12-30 | Added Scribe (for analyzing DDA data) to the main build. Also added a quantification speedup for large datasets and several improvements to utility functions. Check the changelog for details.
encyclopedia-1.12.31 | 2021-12-31 | Removed the Log4J dependency; added saving/reusing of Percolator weights, RT harmonization when combining libraries, and an official Docker image (searlelab/encyclopedia:1.12.31).
encyclopedia-1.2.2 | 2021-02-02 | Improved peptide quantification and scoring. Also added library generation support for MS2PIP and OpenSWATH, and library export for MSP.
encyclopedia-0.9.5 | 2020-06-05 | Added Thesaurus to the main build. Also added options for Percolator, PTMs, and the Window Scheme Wizard, plus support for command line conversion.
encyclopedia-0.9.0 | 2019-06-27 | Support for Prosit libraries (see this new paper!). Also MaxQuant, Spectronaut, and Progenesis.
encyclopedia-0.8.1 | 2019-01-13 | New Window Scheme Wizard.
encyclopedia-0.8.0 | 2019-01-04 | Improved file converters, new visualizations, update notifications.
encyclopedia-0.6.14 | 2018-03-02 | Original upload used in Searle et al, 2018.

How do I start collecting DIA?

If you're interested in how to set up your instrument to collect DIA data, you can check out a DIA quick start document or the recommended starting settings for DIA on Thermo instruments. Our new "tutorial", Pino et al, 2020, is a great place to gain intuition about why these methods and settings make sense. This content is consolidated in our ASMS 2020 talk, which has full lecture notes and an accompanying video on YouTube.

How do I get or make libraries?

Walnut can be used for searching DIA files without a library. However, we recommend using libraries for maximum sensitivity. If you're working with human samples (including tissue or cells), you can download an EncyclopeDIA-compatible version of the Pan-Human library created from the Rosenberger et al, 2014 dataset or a Thesaurus-compatible version of the Phosphopedia library created from the Lawrence et al, 2016 dataset. Alternatively, if you're working with non-human samples, you can download a pre-generated Prosit library or read our tutorial on developing new Prosit libraries using only a FASTA database.

Finally, EncyclopeDIA is designed to enable the chromatogram library workflow. This DDA-free strategy builds sample-specific libraries using gas-phase fractionation (GPF) DIA. GPF-DIA improves detection rates by injecting the same sample six times, with each injection covering a different 100 m/z precursor range (400-500 m/z, 500-600 m/z, etc.). This allows each run to use PRM-quality DIA windows (2 m/z or less) at the same instrument duty cycle. While offline fractionation such as SCX or high-pH RP requires an additional liquid chromatography step, GPF occurs completely within the mass spectrometer, making it very easy to perform. The 6x GPF-DIA files can be searched with Walnut or a Prosit-based library to produce sample-specific chromatogram libraries, which are in turn used to search normal (wide-window) DIA files.
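
The arithmetic behind the GPF-DIA scheme can be sketched as follows. This is an illustration only, not EncyclopeDIA code or an instrument method. It assumes that the six injections tile 400-1000 m/z in 100 m/z steps (the text gives only the first two ranges followed by "etc.") and that each injection uses simple non-overlapping 2 m/z windows. All function names are hypothetical.

```python
def gpf_injection_ranges(start_mz=400.0, width_mz=100.0, n_injections=6):
    """Precursor m/z range covered by each gas-phase fraction.

    Assumption: consecutive 100 m/z ranges starting at 400 m/z.
    """
    return [
        (start_mz + i * width_mz, start_mz + (i + 1) * width_mz)
        for i in range(n_injections)
    ]


def isolation_windows(lower_mz, upper_mz, window_mz=2.0):
    """Non-overlapping isolation windows tiling one injection's range."""
    n_windows = round((upper_mz - lower_mz) / window_mz)
    return [
        (lower_mz + k * window_mz, lower_mz + (k + 1) * window_mz)
        for k in range(n_windows)
    ]


if __name__ == "__main__":
    ranges = gpf_injection_ranges()
    for lower, upper in ranges:
        windows = isolation_windows(lower, upper)
        print(f"{lower:.0f}-{upper:.0f} m/z: {len(windows)} windows")

    # Same number of windows per cycle, but one injection covering the
    # whole range: each window must be much wider.
    windows_per_cycle = len(isolation_windows(*ranges[0]))
    total_span = ranges[-1][1] - ranges[0][0]
    print(f"single-injection window width: {total_span / windows_per_cycle} m/z")
```

Under these assumptions, each injection cycles through 50 narrow windows. Covering the full 600 m/z span in a single injection with the same 50 windows per cycle would require 12 m/z windows, which is why the narrow-window GPF runs are used to build the library and the wide-window runs are searched against it.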

How do I use EncyclopeDIA?

EncyclopeDIA is a cross-platform Java application that has been tested on Windows, Macintosh, and Linux. Specifically, we have tested it under Windows 8 and 10, Mac OS X 10.11-15, and RedHat Linux 4.4. The EncyclopeDIA GUI can be opened by double-clicking on the EncyclopeDIA .JAR (e.g. EncyclopeDIA-0.9.5-executable.jar). EncyclopeDIA can also be used at the command line, and more detailed instructions for this are available in the FAQs. Finally, official Docker images are available at DockerHub (searlelab/encyclopedia).
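
Double-clicking an executable JAR is generally equivalent to running it with `java -jar`. The sketch below builds that command for scripted launches. It uses only standard JVM launcher options (`-jar`, and `-Xmx` to cap the heap) and deliberately passes no EncyclopeDIA-specific arguments, which are documented in the FAQs. The helper names and the heap size in the usage note are assumptions made for illustration.

```python
import shutil
import subprocess


def build_launch_command(jar_path, max_heap_gb=None):
    """Build the 'java -jar' command that opens an executable JAR.

    max_heap_gb is optional; when given, it is passed as the standard
    JVM -Xmx option, which must come before -jar.
    """
    cmd = ["java"]
    if max_heap_gb is not None:
        cmd.append(f"-Xmx{int(max_heap_gb)}g")
    cmd += ["-jar", str(jar_path)]
    return cmd


def launch(jar_path, max_heap_gb=None):
    """Start the JAR in a child process and return the Popen handle."""
    if shutil.which("java") is None:
        raise RuntimeError("No 'java' executable found on PATH")
    return subprocess.Popen(build_launch_command(jar_path, max_heap_gb))
```

For example, `build_launch_command("EncyclopeDIA-0.9.5-executable.jar", max_heap_gb=12)` produces the command `java -Xmx12g -jar EncyclopeDIA-0.9.5-executable.jar`; the 12 GB figure is a placeholder, not a recommendation from the project.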

EncyclopeDIA requires 64-bit Java 1.8. If you don't already have it, you can download it using the "Windows x64 Offline", "Mac OS X x64 .DMG", or "Linux x64" link on the Java SE Runtime Environment 8 downloads website, depending on your operating system. Download and installation should take only a few minutes.

Check here for FAQs.

Check here to report Issues.

Check here for file format guidelines.

Can I get commercial support for EncyclopeDIA?

EncyclopeDIA is a fully free and open source software package, and the Searle Lab is committed to promoting, supporting, and extending free and open source tools. We have tried to make EncyclopeDIA and the associated DIA documentation as user-friendly as possible. However, if the open source support for EncyclopeDIA is insufficient for your needs, Scaffold DIA is a commercial product from Proteome Software that uses the EncyclopeDIA library under the hood. As a disclaimer, Brian Searle is a stockholder of Proteome Software. The Searle Lab only produces open source software and does not contribute commercial code to Scaffold DIA or any other commercial product.

Who do I talk to?

This is a joint project of the Searle Lab (Department of Biomedical Informatics, The Ohio State University) and the MacCoss Lab (Department of Genome Sciences, University of Washington). For more information, please contact Brian Searle (brian dot searle at osumc dot edu).

Contribution guidelines

Code contributions are welcome, and thank you to all code contributors! Any contribution must follow the coding style of the project, be presented with tests, and stand up to code review before it will be accepted. Please read our code of conduct guidelines before participating in our community.
