I pooped and got a number. Despite my hopes and dreams, that number was not 42, and my poop is unfortunately not the answer to the question of the meaning of life. The number I’m referring to is my diversity score. You might have seen one: many companies and services that assess gut microbiome composition indicate how “diverse” you are relative to other people. But what do those numbers mean? Well, it turns out that there is no single interpretation of “diversity”; in fact, there are quite a few different ways to assess it. In this blog post, we’re going to discuss a few of these concepts within the field of microbial ecology, or rather, poopology given the present context. To frame the post, I’m going to discuss one of my own sample sets from within the American Gut, and I’ll do a custom analysis on these samples (which you can find here). I’ll discuss this analysis over two or more blog posts. The samples I’ll be analyzing correspond to a “Thai chili” intervention in which I collected samples for a few days on a normal diet, then ate a TON of Thai chilis on an empty stomach, and then collected samples for another few days. The motivation: to explore the burning. That, and capsaicin is a natural antimicrobial, so I wanted to know if a community shift would result. It was an N=1 trial since, for some reason, no one else wanted to participate…
Before we go any further, let’s define a few terms. When microbiome scientists talk about diversity, we’re usually referring to measures of alpha- or beta-diversity, which are concepts that originate in macrobial (i.e., classical) ecology. Alpha-diversity refers to the amount of diversity within a sample, whereas beta-diversity refers to how much the composition differs between samples (there is also gamma-diversity, which refers to the total diversity across all samples, but this doesn’t come up much in microbial ecology). Sometimes, researchers will also talk about taxonomic diversity, which is in effect a summarization of the data. In this blog post, we’re going to focus on alpha-diversity and touch briefly on taxonomic diversity.
So what is alpha-diversity? For the purposes of demonstration, let’s imagine a forest in which we’ve assessed the number of different types of plants, as well as the count of those plants, in geographically distinct sampling locations. Now, let’s define a simplistic alpha-diversity metric: given a sampling location, under our metric we’ll state that the diversity of the location is the number of unique types of plants present. If we computed this metric over multiple sites, we might then be able to say that site X was more diverse than site Y. But, so what? Well, let’s say we hypothesized that sites near rivers were more diverse than mountain tops; if we had a large number of samples, we could see if the data support the hypothesis by asking whether the diversities were statistically significantly different between sites at rivers and sites at mountain tops. While this seems simplistic, basic science often involves incremental steps that lead toward more specific hypotheses, such as why one type of site may be more diverse than another.
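To make that simplistic metric concrete, here is a minimal sketch in Python. The site names and plant lists are invented purely for illustration, and `observed_richness` is just a hypothetical helper name, not something from a particular library.

```python
from collections import Counter


def observed_richness(observations):
    """Count the distinct plant types recorded at one sampling site."""
    counts = Counter(observations)
    return sum(1 for n in counts.values() if n > 0)


# Hypothetical survey data, made up for this example.
river_site = ["alder", "alder", "alder", "willow", "fern", "moss", "sedge"]
mountain_site = ["pine", "pine", "juniper", "juniper", "lichen", "lichen"]

print(observed_richness(river_site))     # 5
print(observed_richness(mountain_site))  # 3
```

Under this metric the river site comes out as "more diverse" simply because more types were seen there; how many of each type were seen plays no role at all.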
But does our metric make sense? Does diversity simply break down to the number of types of organisms present? Generally, no, and here’s one reason why. Let’s assume that in our hypothetical study we observe the river samples to be more diverse under our metric. But what if all of the river sites tend to be dominated by alder trees? Or, in ecologist speak, perhaps the river sites have a high “richness” but low “evenness.” On the mountain tops in our hypothetical study, we might observe fewer different types of plants but, of the plants we do observe, we may observe them at fairly similar abundances. In this situation, the “richness” might be lower compared to the river samples, but the “evenness” is higher. The metric that we defined for alpha-diversity only accounted for species “richness.” Had our metric emphasized evenness instead, we might have stated that the mountain tops are more diverse!
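Here is a rough sketch of that richness versus evenness tension. The counts are invented, and as a stand-in for "evenness" I'm using the fraction of individuals that belong to the single most abundant type (often called Berger-Parker dominance); that is my choice for illustration, and a high value means low evenness.

```python
def richness(counts):
    """Number of plant types with at least one individual observed."""
    return sum(1 for n in counts.values() if n > 0)


def dominance(counts):
    """Fraction of all individuals belonging to the most abundant type.

    Values near 1 mean one type dominates (low evenness).
    """
    total = sum(counts.values())
    return max(counts.values()) / total


# Hypothetical counts: the river site is rich but dominated by alder,
# while the mountain top has fewer types at similar abundances.
river = {"alder": 96, "willow": 1, "fern": 1, "moss": 1, "sedge": 1}
mountain = {"pine": 34, "juniper": 33, "lichen": 33}

for name, site in [("river", river), ("mountain", mountain)]:
    print(name, richness(site), round(dominance(site), 2))
```

The river site wins on richness (5 types versus 3), yet 96% of its plants are alder, whereas no type exceeds about a third of the plants on the mountain top.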
So what is the right way to account for diversity within a sample? There isn’t one. There are many different ways to calculate a diversity score. Some metrics are generally more useful than others, and some metrics are more appropriate in certain circumstances. For this blog post, I’m going to discuss two different types of alpha-diversity (you can see a list of many common ones here). Specifically, I’m going to take a look at Shannon Diversity and Faith’s Phylogenetic Diversity (PD) as these are quite common to see in the microbiome field.
Shannon Diversity is awesome because it is based on Information Theory, which Claude Shannon put forth in the 1940s to reason about sending signals over noisy communication channels. In brief, Shannon Diversity assesses the “entropy” of a given sample, where a higher entropy indicates a more diverse sample. The calculation takes into account both the richness and evenness of the sample.
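For the curious, the entropy in question is H = -Σ p_i log(p_i), where p_i is the relative abundance of the i-th type of organism in the sample. Below is a minimal sketch with invented counts. I'm assuming log base 2 here (so the units are bits); different tools use different bases, so scores are only comparable when the base matches.

```python
import math


def shannon_entropy(counts):
    """Shannon entropy, in bits, of a list of per-type counts."""
    total = sum(counts)
    h = 0.0
    for n in counts:
        if n > 0:  # types with zero counts contribute nothing
            p = n / total
            h -= p * math.log2(p)
    return h


# Hypothetical samples, made up for illustration.
even_four = [25, 25, 25, 25]     # 4 types, perfectly even
skewed_four = [97, 1, 1, 1]      # 4 types, one dominates
even_eight = [10] * 8            # 8 types, perfectly even

print(shannon_entropy(even_four))    # 2.0
print(shannon_entropy(skewed_four))  # well below 2.0
print(shannon_entropy(even_eight))   # 3.0
```

The two four-type samples have identical richness, but the skewed one scores much lower, which is the evenness effect. Doubling the number of evenly abundant types raises the score, which is the richness effect.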
Faith’s PD, on the other hand, accounts for the amount of phylogenetic branch length represented by a sample. A phylogeny is a hypothesis about the evolutionary history of the organisms in a sample (e.g., how related your bacteria are), and the branch length is a proxy for the amount of evolutionary distance. The intuition is that a sample which spans a broader portion of the tree of life represents more evolution, and probably has a more diverse repertoire of genes. Faith’s PD does not take into account the evenness of the sample.
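As a sketch of the idea, here is Faith's PD on a tiny, made-up rooted tree. The tree, its branch lengths, and the `faith_pd` helper are all hypothetical; real analyses use a reference phylogeny with thousands of tips and a dedicated library. The score is the total length of every branch that lies on a path from an observed organism up to the root, with shared branches counted once.

```python
# Toy rooted tree: each node maps to (parent, length of the branch to that parent).
# The root itself has no entry. All names and lengths are invented.
TREE = {
    "A": ("n1", 1.0),
    "B": ("n1", 1.0),
    "C": ("n2", 2.0),
    "D": ("n2", 2.0),
    "n1": ("root", 3.0),
    "n2": ("root", 0.5),
}


def faith_pd(observed_tips, tree):
    """Sum the lengths of all branches connecting the observed tips to the root."""
    seen = set()   # nodes whose branch to the parent is already counted
    total = 0.0
    for tip in observed_tips:
        node = tip
        # Walk toward the root, stopping once we hit an already-counted path.
        while node in tree and node not in seen:
            seen.add(node)
            parent, length = tree[node]
            total += length
            node = parent
    return total


print(faith_pd(["A", "B"], TREE))  # 5.0 (close relatives share the long n1 branch)
print(faith_pd(["A", "C"], TREE))  # 6.5 (distant relatives span more of the tree)
```

Both samples contain exactly two organisms, so their richness is identical, but the pair drawn from different parts of the tree covers more branch length. Notice also that the function only takes a list of which tips are present; abundances never enter the calculation, which is why evenness has no effect.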
Now, back to my poop. In order to gauge how diverse my gut is, I’m going to first get an idea of the distribution of alpha-diversities seen in the rest of the American Gut. The specific samples of mine that we’ll be exploring come from a Thai chili experiment I did back in 2013. The study design was simple: eat a regular diet and collect samples daily for 3 days, then eat a large quantity of Thai chilis on an empty stomach, and, post intervention, collect samples daily for another 3 days while on a normal diet. To improve comparability to the rest of the American Gut population, I’m going to restrict the comparison of my diversity to only those individuals who appear to be healthy and who are approximately like me (e.g., in age, BMI, and health).
Figure 1. Alpha diversity histograms under Shannon Entropy and Faith’s PD. The points shown correspond to samples collected prior to and following the chili pepper intervention, where white indicates an early sample and red indicates a later sample. The green line is the median of the background histogram.
And while we’re at it, let’s see how the taxonomy of my microbiome compares to everyone else:
Figure 2. A phylum-level taxonomy summary showing the relative abundances of the top 5 phyla. “Others” refers to the average relative abundance across everyone other than me.
Fascinatingly, you can see that my diversity changes dramatically from day to day! And there isn’t even a clear trend following my “intervention” which, I guarantee, resulted in a marked physiological effect. It turns out that our microbiomes are quite dynamic (an excellent video on its plasticity can be found here), and this plasticity is likely impacted by a variety of factors including lifestyle, diet, sampling biases (it isn’t too difficult to imagine that the lush cove of a corn kernel is different from the exposed surface of a raspberry seed), and many other things. So what’s the intuition here? Shannon Entropy is factoring in both the “richness” and “evenness” of my sample, so it seems I probably have a similar number of organisms at comparable relative abundances to the average person similar to me. PD is only looking at the relationships present in the “richness”, and so it seems that the extent of the tree of life in my gut also fluctuates quite a bit. A big red flag of caution: alpha-diversity does not take into account the similarity of the organisms between samples, so while my diversity may be “normal”, it does not indicate whether I’m low or high in the same organisms as the population.
We can also see that my taxonomic profile changes quite a bit. Interestingly, I seem to have a very low relative abundance of Verrucomicrobia. One of the more notable “Verrucos” is Akkermansia, which lives in your mucosal layer. A reduced level of Akkermansia is associated with individuals who have inflammatory conditions; however, these data are insufficient to show that I actually am diminished in Akkermansia. Furthermore, it is still unknown whether being low or high for that organism is a determinant of health (but please see this awesome study in mice). As with the alpha-diversity summaries, you can see that our perspective on the microbiome changes quite a bit from day to day. Despite these changes, you tend to resemble yourself over time relative to others. But, more on that in part 2 next week!
Daniel McDonald is the former project manager for the American Gut, and is now a bioinformatics scientist at the Institute for Systems Biology focusing on the Wellness 100k Project.
NOTE: The author of this post, Daniel McDonald, is intentionally identifying which samples are his. All participant samples are always de-identified, although participants have the right to identify their sample(s). This decision is solely at the discretion of and under control of the participant.