Statistical Measures for Species Diversity

Introduction

In this module we are going to start looking at statistical techniques that can help us to explain species diversity across different environments. This will involve:

Unpacking the different components of diversity analysis: richness, abundance, dominance and evenness
Reviewing some of the common indices used to bring together these measurements - the Shannon and Simpson indices
Highlighting important caveats and limitations around these measures of diversity, and introducing some strategies for where to go next after an initial analysis made through these measures.

The terminology used is often a little abstract and can feel unconnected to other areas of statistics. The caveats around the usage of these measures are also extremely important to recognise when trying to interpret measures of diversity.

I talk through this in the video here:

Data Used

The data is taken from the 2024 RSPB "Big Garden Birdwatch" - a citizen science project encouraging anyone in the country to record and take photos of which birds they observed in the area around their homes on a particular day. This data is summarised at county level and provides the average number per household for each of 80 different species that were recorded.

The data is made up of reports provided by 610,000 households in the UK, and a total of 9.5 million birds were reported by those households.

We do have to be a bit careful with this data as a result - since there may be some potential bias due to the way in which it was collected. In each county there would have been different numbers of people reporting, and the numbers reporting may not have been proportional to the population sizes, or evenly distributed. There would also be questions over whether households were able to accurately identify each bird, particularly for those less common species.

Components of Diversity

Within any data which has counted observed frequencies of different species there are two easy ways in which we can start to assess the diversity within each observed area:

how many different species were reported
how many individuals were reported in total, summing across all of the different species

These two concepts, while simple, are fundamental to statistical ecology and are referred to as the "Species Richness", the total number of different species within the area of interest, and the "Total Abundance", the total number of individuals, of any species, present in the given area.

The Abundance would be the total number of individuals of a given species within a given area of interest. So in this data we may be thinking of this in terms of, for example:

"the abundance of Skylarks within Wiltshire"
"the total abundance of all birds within Wiltshire"
"the species richness of birds within Wiltshire"

Richness

For the data we are exploring it is easy to get the assessment of the species richness by identifying how many of the 80 species were observed:

Dumfries and Galloway had the largest number of different bird species identified, 52, and the Isles of Scilly had the fewest.

Abundance

The abundance, and total abundance, are a little more challenging! We are dealing with very large geographic areas (counties), with non-randomly selected sampling units (households who chose to participate), a three-day observation period and a count of species which by definition do have a tendency to move around quite a lot (birds).

So it would be impractical to try to calculate any sort of estimate for the total number of birds per county based on the data collected. But what we can estimate from the data is a "per-household" estimate of the abundance of the different bird species. This does not change how any of the statistical measures are calculated, only how they are interpreted. When dealing with these sorts of diversity analyses it is often necessary or advisable to consider making adjustments like this to enable a better understanding of the site diversity.

Having a "per-household" estimate of species abundance is in itself not so common outside of citizen science initiatives like this; instead it is a convenient measure to give a proxy for the overall abundance. But making adjustments of abundance and diversity per area would be extremely common. This is also something we could consider with this dataset.

Another common adjustment would be to go beyond the "abundance", of simply counting the species, and instead consider the biomass of our species. This would be particularly important when considering an ecosystem containing species of very different sizes:

For example:
In Woodland A we observed 6 foxes, 20 rabbits and 100 ants
In Woodland B we observed 1 fox, 1 rabbit and 250 ants

The 'total abundance' in Woodland B is about double that in Woodland A; but the "total biomass" in Woodland A is around 9 times larger than that of Woodland B.
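The woodland comparison can be reproduced with a short Python sketch. The counts come from the example above; the body masses (6 kg per fox, 1.5 kg per rabbit, 3 mg per ant) are rough assumptions chosen only for illustration and are not part of the original example.

```python
# Counts from the woodland example above.
counts = {
    "Woodland A": {"fox": 6, "rabbit": 20, "ant": 100},
    "Woodland B": {"fox": 1, "rabbit": 1, "ant": 250},
}

# ASSUMPTION: rough body masses in kilograms, chosen only for illustration.
mass_kg = {"fox": 6.0, "rabbit": 1.5, "ant": 0.000003}


def total_abundance(site_counts):
    """Total number of individuals across all species."""
    return sum(site_counts.values())


def total_biomass(site_counts, masses):
    """Total mass across all species: count multiplied by mass per individual."""
    return sum(n * masses[species] for species, n in site_counts.items())


abundance = {site: total_abundance(c) for site, c in counts.items()}
biomass = {site: total_biomass(c, mass_kg) for site, c in counts.items()}

abundance_ratio = abundance["Woodland B"] / abundance["Woodland A"]
biomass_ratio = biomass["Woodland A"] / biomass["Woodland B"]

print(abundance)
print(f"Abundance ratio (B / A): {abundance_ratio:.1f}")
print(f"Biomass ratio (A / B): {biomass_ratio:.1f}")
```

With these assumed masses the ants contribute almost nothing to the biomass, so the ranking of the two woodlands flips depending on which quantity is used.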

Given the more limited range of biomass between different bird species, that sort of adjustment is unlikely to make too much of an impact here.

Let's compare how the total abundance (per household) and the species richness compare to each other:

We can see there is, perhaps surprisingly, very little correlation between the two measures across our counties. We already noted that Dumfries and Galloway had the most different species of any county, and it also had a high average number of birds observed per household. But there are many counties with only a few fewer species identified that have a much lower average abundance - Dorset, Somerset and Edinburgh for example. And there are also counties with low numbers of different species that have even higher average numbers of birds identified.

Hovering over those three points you may notice a common pattern: they relate to the Isles of Scilly, the Shetland Islands and Eilean Siar (formerly known as the Western Isles).

Dominance

One aspect linked directly to the relative abundance of species that can be useful to help our understanding of diversity is "dominance". The "Berger-Parker" dominance score is one which is very simple to calculate - it is simply the proportion of the total abundance that comes from the single most dominant species.
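As a minimal sketch of the calculation, the function below returns the Berger-Parker score for a set of abundances, and a second helper extends the same idea to the share held by the top few species. The species names and values are hypothetical and are not taken from the birdwatch data.

```python
def berger_parker(abundances):
    """Proportion of the total abundance contributed by the most abundant species."""
    total = sum(abundances.values())
    if total <= 0:
        raise ValueError("total abundance must be positive")
    return max(abundances.values()) / total


def top_k_share(abundances, k):
    """Share of the total abundance contributed by the k most abundant species."""
    total = sum(abundances.values())
    if total <= 0:
        raise ValueError("total abundance must be positive")
    top = sorted(abundances.values(), reverse=True)[:k]
    return sum(top) / total


# HYPOTHETICAL per-household averages, invented for illustration only.
example_site = {"species_a": 4.0, "species_b": 3.0, "species_c": 2.0, "species_d": 1.0}

print(berger_parker(example_site))   # 4 out of a total of 10
print(top_k_share(example_site, 2))  # 4 + 3 out of a total of 10
```

Because both functions divide by the total, they give the same answer whether the inputs are raw counts or per-household averages.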

Let's take a look at two of our more extreme counties, Cornwall and the Shetland Islands:

The dominance of the most common species in Cornwall, the House Sparrow, is fairly low - it represents 16% of the total population of birds observed. In the Shetland Islands almost 50% of all birds observed were starlings.

This simple measure of dominance immediately gives us some insights into the diversity of the two counties - but it is also a very limiting statistic. It will, by definition, only focus on the most extreme species.

In fact, with the Shetland Islands, were we to consider the two most dominant species then we would see an even more extreme difference in this metric. The top two species account for 75% of the total birds observed in the Shetland Islands. To generalise this idea beyond the single most extreme species, without making an arbitrary choice about how many "top" species to include in our measure of dominance, we turn to the concept of "evenness".

Evenness

The final component which contributes towards the measures of diversity - evenness - tells us how evenly distributed each species is within an environment.

For example: in both County A and County B we observed 10 different species, and an average total of 20 birds.
In County A - all 10 bird species were observed an average of 2 times per household
In County B - one bird species was observed an average of 19.1 times per household, and the other 9 were observed an average of 0.1 times per household.

County A has a much higher evenness than County B; despite both having the same species richness and the same total abundance.

When considering evenness and comparing it across different environments we are no longer thinking about the "abundance" of each species in an environment, but about the "relative abundance" of each species. That is, what percentage of the total individuals in a given environment belong to a given species.

A common measure of evenness is known as "Pielou's J" - where a value of 1 would indicate the species were all observed with equal frequency, and the smaller the value the less even the distribution. We will come back to the formula for this in the next section, but first let's see how this measure relates to the species richness and total abundance:

From these plots the first thing that catches our attention is the four very different looking counties, with low richness, low evenness and a high average count of birds per household. From this, without looking at the data, we can infer that in these counties there is likely to be a very small number of bird species observed a very large number of times per household on average.

Ignoring the four island counties we can see a slight negative correlation between evenness and richness - those counties with more different species of birds tended to have less even spread in the relative abundance of each bird.

Our relationship between evenness and abundance is much less strong; in fact outside of our island counties there seems to be almost no correlation between the two.

Let's pick out some of the more extreme counties, to see how this looks in relation to the actual data itself:

The relative abundance within the Shetland Islands is dominated by just two species.
Both Dumfries and Galloway and the Isle of Man have fairly evenly spread distributions - although house sparrows are common there are many other species contributing substantially to the total. But Dumfries and Galloway has more different species, hence a higher richness, and most of the species observed in Dumfries and Galloway were rare, which is why it scored slightly lower on evenness than the Isle of Man.

Diversity Indices

Diversity indices are a way of summarising down the trade-off between evenness and richness of species within a given area to a single number, for ease of comparison and use in further analysis. This is known as "alpha diversity".

There are two commonly used indices, the Shannon-Wiener index and the Simpson index. Both look to provide a trade-off between the richness and the evenness. Part of the reason why they are so commonly used is because the formulae to calculate them are relatively simple:
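The formulae themselves are not reproduced on this page, so the standard definitions are sketched here. This assumes that $p_i$ is the relative abundance of species $i$, that $S$ is the number of species observed, and that natural logarithms are used. The Simpson index is written in its "one minus" form, which is the form that matches the interpretation and the maximum value described in the Simpson Index section.

```latex
H' = -\sum_{i=1}^{S} p_i \ln p_i
\qquad
D = 1 - \sum_{i=1}^{S} p_i^{2}
```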

In most situations they will be extremely highly correlated, and we can see that is the case here:

But there are important differences between them, and they are not guaranteed to show the same results.

Simpson Index

Simpson's index can be interpreted as the probability that any two individuals randomly selected from within the population will come from different species. If there is just one species present in a population then by definition this will be 0. As the number of species increases, the maximum possible score for the Simpson index becomes closer to 1. And as the evenness of the distribution between species increases, the score will tend closer to the maximum potential score for the index given the number of species observed. For example, if just 3 different species are observed the maximum possible score of the Simpson index is 0.667 (two thirds), which would be achieved when all three species have equal relative abundance.
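The behaviour described above can be checked with a small sketch. It assumes the "one minus the sum of squared relative abundances" form of the index, which corresponds to drawing the two individuals with replacement; the example counts are made up.

```python
def simpson_index(abundances):
    """Simpson index as 1 minus the sum of squared relative abundances."""
    total = sum(abundances)
    if total <= 0:
        raise ValueError("total abundance must be positive")
    return 1.0 - sum((a / total) ** 2 for a in abundances)


def max_simpson(n_species):
    """Largest possible value for a given number of species (all equally abundant)."""
    return 1.0 - 1.0 / n_species


print(simpson_index([10]))        # one species only
print(simpson_index([5, 5, 5]))   # three equally abundant species
print(simpson_index([13, 1, 1]))  # three species, one of them dominant
print(max_simpson(3), max_simpson(80))
```

The single-species case gives 0, the even three-species case reaches the two-thirds maximum, and the uneven case falls well below it even though the richness is the same.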

Shannon-Wiener Index

The larger the value of the Shannon-Wiener index, the higher the diversity that exists within the population. Although it is difficult to give the number any direct meaning or interpretation that is particularly useful (although some people do try) the general interpretation would be that a value of less than 2 indicates low diversity; more than 2.5 indicates moderate diversity, and a value of more than 3 indicates high diversity.

Theoretically there is no fixed upper limit on the Shannon index - the largest possible value for a given number of species is the natural log of that number, so the ceiling rises as more species are identified. In practice it is extremely unlikely to find a situation where it will be less than 1 or more than 4.
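A short sketch makes the bounds concrete. It assumes natural logarithms and uses made-up, equally abundant communities of 3, 10 and 80 species (80 being the number of species in the birdwatch summary) to show that the maximum grows with the number of species.

```python
import math


def shannon_index(abundances):
    """Shannon index using natural logs; species with zero abundance are skipped."""
    total = sum(abundances)
    if total <= 0:
        raise ValueError("total abundance must be positive")
    proportions = [a / total for a in abundances if a > 0]
    return sum(-p * math.log(p) for p in proportions)


print(shannon_index([7]))  # a single species gives 0

for n_species in (3, 10, 80):
    equal = [1] * n_species  # every species equally abundant
    print(n_species, round(shannon_index(equal), 3), round(math.log(n_species), 3))
```

For each community the index equals the log of the number of species, which is the largest value that number of species can produce.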

A general interpretation of the value of the index is that it is equal to the log of the richness multiplied by the evenness.

And as such the value of evenness we discussed in the previous section, Pielou's J, is derived directly on this basis. Pielou's J is calculated by taking the value of the Shannon index and dividing it by the log of the number of species observed.
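The calculation can be sketched as follows, again assuming natural logarithms. The two communities reuse the County A and County B idea from the evenness section: ten species and a total of 20 birds, spread either evenly or with one dominant species.

```python
import math


def shannon_index(abundances):
    """Shannon index using natural logs; species with zero abundance are skipped."""
    total = sum(abundances)
    if total <= 0:
        raise ValueError("total abundance must be positive")
    proportions = [a / total for a in abundances if a > 0]
    return sum(-p * math.log(p) for p in proportions)


def pielou_j(abundances):
    """Evenness: Shannon index divided by the natural log of the species richness."""
    richness = sum(1 for a in abundances if a > 0)
    if richness < 2:
        raise ValueError("evenness is undefined for fewer than two species")
    return shannon_index(abundances) / math.log(richness)


county_a = [2.0] * 10          # ten species, equally abundant, 20 birds in total
county_b = [19.1] + [0.1] * 9  # one dominant species, 20 birds in total

j_a = pielou_j(county_a)
j_b = pielou_j(county_b)

print(round(j_a, 3))
print(round(j_b, 3))
```

Both counties have the same richness and the same total abundance, so the large gap between the two evenness scores comes entirely from how the birds are shared out between species.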

Differences between the two methods

The Simpson method places slightly greater weight on the evenness of the species, and the Shannon-Wiener method puts slightly more emphasis on the richness.

You can see this by comparing Merthyr Tydfil and North Lanarkshire - both score identically on the Shannon-Wiener index, but differently on the Simpson index. Merthyr Tydfil has a higher richness than North Lanarkshire but a lower evenness. These cancel each other out on the Shannon-Wiener index, but the weighting in favour of evenness gives a higher value for the Simpson index for North Lanarkshire.

County	Species Richness	Pielou's J	Shannon	Simpson
Merthyr Tydfil	44	0.75	2.85	0.91
North Lanarkshire	40	0.77	2.85	0.93

The reverse can be seen when comparing Devon and Merseyside - the Shannon index score indicates higher diversity in Devon than in Merseyside, whereas the Simpson index gives them both the same score. In this case Devon has a higher richness score and a lower evenness score. With the weighting in favour of richness in the Shannon index, this is sufficient to give a higher score for Devon, whilst the two cancel each other out in the Simpson index.

County	Species Richness	Pielou's J	Shannon	Simpson
Devon	50	0.76	2.98	0.93
Merseyside	41	0.77	2.87	0.93
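The relationship described earlier, that the Shannon index equals the log of the richness multiplied by the evenness, can be checked against the two tables above. This sketch assumes natural logarithms, and because the table values are rounded to two decimal places the reconstructed figures should only be expected to agree approximately.

```python
import math

# Values copied from the two tables above: (richness, evenness, Shannon index).
counties = {
    "Merthyr Tydfil": (44, 0.75, 2.85),
    "North Lanarkshire": (40, 0.77, 2.85),
    "Devon": (50, 0.76, 2.98),
    "Merseyside": (41, 0.77, 2.87),
}

# Difference between the reported Shannon index and ln(richness) * evenness.
differences = {}
for name, (richness, evenness, shannon) in counties.items():
    reconstructed = math.log(richness) * evenness
    differences[name] = shannon - reconstructed
    print(f"{name}: ln({richness}) x {evenness} = {reconstructed:.2f} (reported {shannon})")
```

The small remaining gaps are consistent with rounding in the evenness column, since an error of 0.005 in the evenness is multiplied by a log richness of nearly 4.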

Important Considerations & Limitations

Sampling Design

In this case we have used an observational citizen science dataset to try to assess diversity.

There are many potential risks involved with this approach:

Whether all participants have correctly identified the species, particularly important for the rarer species
The extent to which the same birds may have been counted on multiple occasions - which could happen within the same household; or the same bird being counted by multiple households. Given birds are not particularly prone to sticking to administrative boundaries, the same bird may even be counted across multiple counties in the survey!
The characterisation of "county" as the environment is also problematic for multiple reasons, which we will explore in the next section. But fundamentally there is also no scientific reason why an administrative unit would be a good unit of comparison, compared to an environment classified based on ecological or geographic factors. It is a convenient one though, and also likely helps with the promotion and publicity of the survey, since it is a very human level of aggregation!
The geographic spread, and number of households participating, would be reliant on self-selection. It's likely there may be some highly engaged community groups participating in the survey, but these would largely all report quite similar findings which may lead to bias in the results, compared to other communities where few people may choose to participate.

But with those challenges also come benefits which could not be achieved through any other method! The scale of the survey, covering the whole of the UK, and the regularity with which it is conducted using the same methodology every year, give an incredibly rich data source that would be impossible to achieve through other methods.

When designing your own studies for these purposes it is really important to think through design aspects for exactly how you plan to collect the data - adapting to the context of the environments you are studying, and the behaviours of the species involved. Other potential options could be:

Purely observational data, as we have seen in the bird survey
Analysing physical samples - e.g. counting insects within soil samples taken from transects; or classifying micro-organisms within water samples at different sampling locations
"Catch and release" methods for larger species, using traps, and potentially using electronic tags to then be able to monitor the patterns of the same individuals over time, and ensure there is no "double counting"
Setting "video traps" for even larger species still, where human observation may impact upon usual behaviours

All would have slightly different sets of pros and cons, and may be better adapted to certain contexts.

One aspect that was easy to define in this study was the taxon, the taxonomic group which would be considered within scope for the survey. The objective was to record all birds, which is a pretty straightforward category to define the limits of. But if your focus is more related to understanding the full diversity within environments, then you would have to think carefully about which species would be included. If we wanted to try to classify all possible macro-fauna, flora, insects, mammals and birds, along with anything else that may exist within our environment, then we would be setting ourselves a pretty tough challenge, and one where the results of any analysis would be hard to fully grasp compared to breaking down the species into more coherent groups.

Sample Size & Rarefaction

One thing we haven't considered about the strangeness of the four island counties is whether the results may be different to other counties not because of reduced diversity in the bird populations but because fewer people live there. This data comes solely from volunteer members of the population recording the birds they see in their garden and then choosing to report. We would expect fewer people to report results in counties with small populations, and the remote island counties, as well as being quite geographically distinct, are also those counties with the lowest populations.

This is particularly important in diversity analysis since observed species richness is extremely dependent on the sample size, length of observation period and area covered by the sample. Within this survey there may have been tens of thousands of households participating in Cornwall, but only tens of households participating in the Isles of Scilly. When thinking about "richness" in the data we see most of the species reported are 'rare', only reported a low number of times per county. The lack of observations of rare birds in some of the counties does not necessarily mean that they were not present at all - having more people observing, in more locations, over a longer period of time and across a wider geographic area within the county increases the potential for the rarer species to be identified.

With a small number of observations and a short observation period we are almost certain to underestimate the species richness, and thus also underestimate the "true" values of our diversity indices.

In order to be able to make direct comparisons we would need to assume that the theoretical maximum capacity for diversity is the same in each of the environments we are comparing. The Simpson and Shannon indices are not measures which lend themselves particularly well to being adjusted directly "per-capita" or "per-area". Cornwall, with an area of 3,500 sq-km, would seem to have a much higher capacity for different bird species to be observed than the Isles of Scilly, with an area of 16 sq-km.

But there are more nuanced methods to allow for improvements to the comparability between these measures through adjustments called "rarefaction". You can read more about this here:
https://www.frontiersin.org/journals/microbiology/articles/10.3389/fmicb.2019.02407/full
https://www.scielo.sa.cr/scielo.php?script=sci_arttext&pid=S0034-77442009000300001

Without the raw data at household level this is not something we can really pursue too much further with this dataset.
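Although it cannot be applied to the county summaries, the basic idea of rarefaction can be sketched on raw counts. The function below uses the classic individual-based (hypergeometric) formula for the expected number of species in a random subsample, which is one common form of rarefaction and not necessarily the approach taken in the linked papers. The counts for the two sites are hypothetical.

```python
from math import comb


def rarefied_richness(counts, sample_size):
    """Expected number of species in a random subsample of `sample_size` individuals.

    `counts` must be whole-number counts of individuals per species.
    """
    total = sum(counts)
    if not 0 < sample_size <= total:
        raise ValueError("sample_size must be between 1 and the total count")
    # For each species: 1 minus the probability that the subsample misses it entirely.
    return sum(
        1 - comb(total - n, sample_size) / comb(total, sample_size)
        for n in counts
        if n > 0
    )


# HYPOTHETICAL raw counts for two sites with very different sampling effort.
large_sample = [500, 300, 100, 50, 20, 10, 5, 5, 5, 5]  # 1000 individuals, 10 species
small_sample = [30, 10, 5, 3, 2]                        # 50 individuals, 5 species

# Compare both sites at the size of the smaller sample.
common_size = min(sum(large_sample), sum(small_sample))

print(round(rarefied_richness(large_sample, common_size), 2))
print(round(rarefied_richness(small_sample, common_size), 2))
```

Comparing at a common sample size shrinks the apparent gap between the two sites, because several of the rare species at the heavily sampled site would be expected to go unseen in a sample of only 50 individuals.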

Other ways of considering diversity measurement

These measures of "alpha" diversity also fail to capture diversity across different environments. Even where we do compare the Shannon/Simpson index across environments they only compare the internal levels of diversity - they do not account for whether the same species are being seen in different environments, whether certain species are seen only in small numbers of environments, whether certain environments have large numbers of rare species, or whether different species are dominant in different environments.

These are all concepts which link to what is known as "beta" diversity, which focuses on this diversity in the species population across environments. There are a lot of different metrics that can be used to understand this concept of "beta diversity" by comparing the similarity in the populations between sites.

There is an often cited paper that outlines 24 different options here: https://besjournals.onlinelibrary.wiley.com/doi/10.1046/j.1365-2656.2003.00710.x

But that is only the starting point - as that paper solely considers beta diversity in terms of presence/absence - there are many more than that when starting to consider the difference in abundances between sites! There are many methods for continuing further with this investigation into diversity - techniques like cluster analysis or network analysis can help to build a picture of how species may coexist with each other, and which environments have similar ecological profiles.
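To illustrate the difference between presence/absence and abundance-based comparisons, here is a minimal sketch of two widely used dissimilarity measures: Jaccard (presence/absence) and Bray-Curtis (abundance). They are chosen here as familiar examples rather than as the recommendation of the paper above, and the two sites and their values are hypothetical.

```python
def jaccard_dissimilarity(site_1, site_2):
    """Presence/absence: 1 minus shared species over all species seen at either site."""
    present_1 = {s for s, n in site_1.items() if n > 0}
    present_2 = {s for s, n in site_2.items() if n > 0}
    union = present_1 | present_2
    if not union:
        raise ValueError("at least one species must be present")
    return 1 - len(present_1 & present_2) / len(union)


def bray_curtis_dissimilarity(site_1, site_2):
    """Abundance-based: summed absolute differences over summed abundances."""
    species = set(site_1) | set(site_2)
    difference = sum(abs(site_1.get(s, 0) - site_2.get(s, 0)) for s in species)
    total = sum(site_1.get(s, 0) + site_2.get(s, 0) for s in species)
    if total == 0:
        raise ValueError("total abundance must be positive")
    return difference / total


# HYPOTHETICAL per-household averages for two sites, for illustration only.
site_x = {"house sparrow": 4.0, "starling": 3.0, "blue tit": 2.0}
site_y = {"house sparrow": 1.0, "starling": 6.0, "robin": 1.0}

print(jaccard_dissimilarity(site_x, site_y))
print(bray_curtis_dissimilarity(site_x, site_y))
```

Both measures run from 0 (identical sites) to 1 (nothing in common), but only the second responds to a change in which species is dominant when the species lists stay the same.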

There is a good overview of the limitations of "alpha" diversity measures, and some of the entry points to understanding beta diversity here: https://ecologyforacrowdedplanet.wordpress.com/2016/01/14/beta-diversity-what-is-it-good-for/

Exercises

In these exercises we are going to look at data from a study looking at the rodent populations near to Albuquerque, New Mexico.

Data is available from 1989 to 2008, with the same sampling processes and locations used each year through a "trap and release" sampling procedure. You can read more about the trial and the methods they used here:

https://biotime.st-andrews.ac.uk/selectStudy.php?study=56

(And if you would like to look at some cute pictures of rodents to distract you from statistics for a while - I highly recommend searching for pictures of some of these species based on their name!)

And you can download the raw data used here:

Download Data

The general question we want to explore is:

Has the diversity of the rodent population within the study area changed over time? And if so how?

Below you can explore the data in a few different ways to help you make this assessment, by looking at different visualisations of the data over time and different diversity statistics.

Remind yourself of what each of the diversity statistics can tell you, and what limitations they may have. See if you can link the diversity statistics to the raw data.

This dataset captured both the abundance and the biomass of the rodents, so you can choose to explore the diversity statistics and trends based on either biomass or abundance.

How does changing the choice of variable to assess diversity here change the interpretation of diversity in the population?

Which statistics are most affected by the change of variable?
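If you would like to reproduce this kind of comparison yourself from the raw data, one possible starting point is sketched below. The record layout (year, species, abundance, biomass) and all of the values are hypothetical, so the field order would need adapting to the downloaded file.

```python
import math
from collections import defaultdict

# HYPOTHETICAL records: (year, species, abundance, biomass in grams).
records = [
    (1989, "species_a", 40, 400.0),
    (1989, "species_b", 10, 900.0),
    (1990, "species_a", 25, 250.0),
    (1990, "species_b", 25, 2250.0),
]


def shannon_index(values):
    """Shannon index using natural logs; zero values are skipped."""
    total = sum(values)
    return sum(-(v / total) * math.log(v / total) for v in values if v > 0)


def yearly_shannon(rows, value_index):
    """Shannon index per year, weighting each species by the field at `value_index`."""
    by_year = defaultdict(list)
    for row in rows:
        by_year[row[0]].append(row[value_index])
    return {year: shannon_index(values) for year, values in sorted(by_year.items())}


by_abundance = yearly_shannon(records, 2)
by_biomass = yearly_shannon(records, 3)

print(by_abundance)
print(by_biomass)
```

In this made-up example the abundance-based index rises between the two years while the biomass-based index falls, which is the kind of disagreement the questions above ask you to look for.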

[Interactive panels, each with a "Statistics based on" selector: Tables of Summary Statistics; Within Year Species Counts (with a "Year to Plot" selector); Time Series of Summary Statistics (with a "Summary Variable to Plot" selector); Time Series of Each Species]

Resources

Choosing and using diversity indices: insights for ecological applications from the German Biodiversity Exploratories https://pmc.ncbi.nlm.nih.gov/articles/PMC4224527/

Direct overview of different components of diversity: https://www.flutterbys.com.au/stats/tut/tut13.2.html

Beta Diversity: https://besjournals.onlinelibrary.wiley.com/doi/10.1046/j.1365-2656.2003.00710.x