QIIME 2 will revolutionize microbiome bioinformatics

By Greg Caporaso, PhD

It’s official: QIIME, the primary microbiome bioinformatics platform used by the American Gut project, is now NSF funded. This is a very exciting step for the QIIME development team, and we’re already hard at work building the platform that we expect to revolutionize microbiome bioinformatics.

From a user perspective there are a few key features to look forward to in QIIME 2. First, we’ll be completely rewriting our visualization framework to support interactive visualizations that will greatly simplify exploratory analysis and allow for exporting publication-quality graphics. Users will no longer have to run multiple QIIME scripts to sort, filter, and group data into the desired figure: simply point and click to produce beautiful and informative visualizations! We’ll also be providing support for many updated analytic tools, including ANCOM for identifying differentially abundant OTUs, diverse sequence count normalization techniques (including those discussed in McMurdie et al., 2014; those are now available in QIIME 1, but will be central to QIIME 2), and new quality control and OTU assignment tools, including vsearch, DADA2 and swarm. These tools will all be made available as QIIME 2 plugins, and we will provide detailed plugin developer documentation so that it’s straightforward for other bioinformatics developers to make their methods accessible in QIIME. For our users, this plugin-based approach means quicker access to the latest tools: you won’t have to wait for a new QIIME release to get access to the latest functionality, as new plugins can be easily added to an existing QIIME deployment.

One of the most exciting new features from my perspective, and one of the reasons why I think QIIME 2 will revolutionize microbiome bioinformatics, is that QIIME 2 is completely interface-agnostic. We provide a software development kit (SDK), which developers can use to build interfaces around QIIME. QIIME 1 was tightly coupled to its command line interface. This worked well for power users, but supporting graphical interfaces for QIIME 1 is very challenging. As a result, existing graphical interfaces for QIIME 1, such as Qiita and BaseSpace, generally provide access to only limited functionality. The QIIME 2 SDK will make it straightforward to develop diverse, fully featured interfaces, including graphical interfaces (for end users) and command line interfaces (for power users). We’ll provide some of these, but we’re also very excited to see new interfaces that the community develops. Some additional exciting features include a semantic type system, which will help guide users to relevant analyses (and help them avoid invalid analyses), and decentralized provenance tracking, which will help users keep track of where their data came from and which ultimately will be used to automate the generation of “Methods section” flowcharts. Taken together, these features will make QIIME 2 accessible to anyone, and will improve the quality and reporting of microbiome data.
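To make the ideas of a semantic type system and provenance tracking a little more concrete, here is a minimal toy sketch in Python. It is not QIIME 2's API: every name in it (Artifact, Action, Registry, to_relative, and the "CountTable" and "RelativeTable" type labels) is hypothetical and chosen only for illustration. The sketch assumes that each piece of data carries a type label and a history, that each analysis declares the type it accepts and the type it produces, and that an interface can ask which analyses are valid for a given piece of data.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Artifact:
    """Hypothetical container: data, a semantic type label, and its history."""
    semantic_type: str
    data: object
    provenance: Tuple[str, ...] = ()


@dataclass(frozen=True)
class Action:
    """Hypothetical analysis step that declares its input and output types."""
    name: str
    input_type: str
    output_type: str
    function: Callable[[object], object]


class Registry:
    """Hypothetical plugin registry that any interface (CLI, GUI) could query."""

    def __init__(self) -> None:
        self._actions: Dict[str, Action] = {}

    def register(self, action: Action) -> None:
        self._actions[action.name] = action

    def valid_actions(self, artifact: Artifact) -> List[str]:
        # Only offer analyses whose declared input type matches the data.
        return sorted(
            a.name for a in self._actions.values()
            if a.input_type == artifact.semantic_type
        )

    def run(self, name: str, artifact: Artifact) -> Artifact:
        action = self._actions[name]
        if action.input_type != artifact.semantic_type:
            raise TypeError(
                f"{name} expects {action.input_type}, "
                f"got {artifact.semantic_type}"
            )
        result = action.function(artifact.data)
        # Append this step to the history carried along with the data.
        return Artifact(action.output_type, result,
                        artifact.provenance + (name,))


def to_relative(counts: Dict[str, int]) -> Dict[str, float]:
    """Convert raw counts to relative abundances."""
    total = sum(counts.values())
    return {key: value / total for key, value in counts.items()}


registry = Registry()
registry.register(
    Action("to_relative", "CountTable", "RelativeTable", to_relative)
)

counts = Artifact("CountTable", {"otu1": 3, "otu2": 1})
relative = registry.run("to_relative", counts)
```

In this toy version, running to_relative a second time on its own output raises a TypeError, because the output is labeled "RelativeTable" rather than "CountTable". That is the kind of invalid analysis a type system can catch before it runs, and the provenance tuple is the sort of record from which a methods flowchart could later be drawn.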

As promised in my recent QIIME Blog post, Toward QIIME 2, we now have an experimental QIIME 2 web interface available as a public prototype. We’re currently working on finalizing some components of the framework, developing a command line interface and a web interface, and interfacing QIIME 2 with Qiita to support meta-analysis of microbiome and multi-omics data sets. Once these pieces are in place, we’ll be coordinating with a large team of collaborators to develop plugins. We’re still on track to have an alpha release out this summer, which we’ll present at SciPy 2016 and in the ISME 16 Bioinformatics workshop (the official announcement for that will go live next week).

So what does all of this mean for American Gut participants? First, it means that your results will get to you faster, as the bioinformatics processing will be more straightforward. It also means that the data you receive will be based on the most recent methods, and will be presented to you using the latest-generation visualization techniques, making the data easier to interpret. And finally, for users who are so inclined, QIIME 2 will provide a very straightforward way for you to access your own raw data (and others’ de-identified raw data) and perform your own custom analyses.

Finally, thanks to the QIIME developers, the National Science Foundation, and our active user community. All of this is only possible because of you! To stay up to date with news on QIIME 2, follow us on Twitter. Happy QIIME-ing!

Greg Caporaso is an Assistant Professor at Northern Arizona University, one of the lead developers of QIIME, and principal investigator on the 2016 NSF QIIME 2 grant mentioned here.