How well can we predict future changes in biodiversity using machine learning? Experiments in an eco-evolutionary testbed


Marine plankton communities are a fundamental component of the Earth system. They support food webs, drive global biogeochemical cycles, and play a central role in the regulation of Earth’s climate¹. These ecosystem functions are carried out by an extremely diverse microbial community, and predicted changes in biodiversity and biogeography are therefore likely to have important impacts within the Earth system. Species distribution models (SDMs) have proved to be a powerful method for estimating contemporary plankton biogeography and biodiversity. However, they make a number of important and somewhat unrealistic assumptions², especially in relation to marine microbial communities. Most notably, they neglect the known capacity of microbial species to evolve rapidly, and they do not account for dispersal by ocean currents. With SDMs increasingly being applied to predict the future response of marine communities to environmental change, it is important that these sources of error and uncertainty are quantified. The aim of this project is to use a simulated ‘virtual ecosystem’³ to assess how evolution, dispersal and a range of other potentially important factors may affect the predictive skill of SDMs. This will improve our ability to understand how marine microbial biodiversity and biogeography will continue to be affected by ongoing climate change.



The project will use output from a global ecosystem model³ as a virtual ecosystem in which to test the validity of different species distribution models. “Synthetic observations” will be generated by sampling the ecosystem model to mimic real-world ocean observations. These data will be used to train a range of SDMs at different levels of sophistication (including generalised linear models, generalised additive models, Gaussian process models, random forests and neural networks). The accuracy of these SDMs can be assessed by comparing their predictions to the known behaviour of the ecosystem model in the year 2100. As well as comparing the predictive skill of different SDM techniques, it will also be possible to identify different sources of error. For example, to find out how much error is attributable to SDMs not accounting for evolution, the experiments can be repeated in a virtual environment for which evolution is switched off. The main aims of the project are therefore to evaluate the predictive skill of different SDM techniques for marine ecosystems, to account for their main sources of error, and ultimately to generate new SDM techniques that improve predictions by accounting for these errors.
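
The experimental design described above can be illustrated with a deliberately simple sketch. It is not the global ecosystem model or any SDM the project will actually use. Everything in it is an assumption made for illustration: a single hypothetical species with a Gaussian thermal niche stands in for the virtual ecosystem, a quadratic fit to log-abundance stands in for a generalised linear model, uniform warming of 3 degrees stands in for the year 2100, and "evolution" is represented as a 2 degree shift in the species' thermal optimum.

```python
import numpy as np

rng = np.random.default_rng(0)

def true_abundance(temp, optimum, width=4.0):
    """Toy 'virtual ecosystem': Gaussian thermal niche (illustrative assumption)."""
    return np.exp(-0.5 * ((temp - optimum) / width) ** 2)

def design_matrix(temp):
    """Columns 1, T, T^2 for a quadratic model of log-abundance."""
    return np.column_stack([np.ones_like(temp), temp, temp ** 2])

def fit_sdm(temp, abundance):
    """Least-squares fit of log-abundance; a stand-in for a simple GLM-type SDM."""
    coef, *_ = np.linalg.lstsq(design_matrix(temp), np.log(abundance), rcond=None)
    return coef

def predict_sdm(coef, temp):
    """Predicted abundance at the given temperatures."""
    return np.exp(design_matrix(temp) @ coef)

def rmse(a, b):
    """Root-mean-square error between two abundance fields."""
    return float(np.sqrt(np.mean((a - b) ** 2)))

# Present-day virtual ocean: temperature at 500 hypothetical grid cells.
temp_now = rng.uniform(0.0, 30.0, size=500)
optimum_now = 15.0  # hypothetical thermal optimum
abund_now = true_abundance(temp_now, optimum_now)

# "Synthetic observations": a sparse, noisy sample of the virtual ocean.
idx = rng.choice(temp_now.size, size=60, replace=False)
obs_temp = temp_now[idx]
obs_abund = abund_now[idx] * np.exp(rng.normal(0.0, 0.1, size=idx.size))

# Train the SDM on present-day synthetic observations only.
coef = fit_sdm(obs_temp, obs_abund)

# Future climate: uniform 3 degree warming (assumed), then predict everywhere.
temp_future = temp_now + 3.0
pred_future = predict_sdm(coef, temp_future)

# Known future "truth" with evolution switched off (niche fixed) and
# switched on (optimum shifts by an assumed 2 degrees).
truth_no_evo = true_abundance(temp_future, optimum_now)
truth_evo = true_abundance(temp_future, optimum_now + 2.0)

err_no_evo = rmse(pred_future, truth_no_evo)
err_evo = rmse(pred_future, truth_evo)

print(f"RMSE, evolution off: {err_no_evo:.3f}")
print(f"RMSE, evolution on:  {err_evo:.3f}")
print(f"Error attributable to evolution: {err_evo - err_no_evo:.3f}")
```

Because the same trained SDM is scored against both futures, the difference between the two errors gives a rough measure of how much predictive skill is lost by ignoring evolution in this toy setting. The same pattern could in principle be repeated for other error sources, such as dispersal, by switching them on and off in the virtual ecosystem.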



The INSPIRE DTP programme provides comprehensive personal and professional development training alongside extensive opportunities for students to expand their multi-disciplinary outlook through interactions with a wide network of academic, research and industrial/policy partners. The student will be registered at the University of Southampton and hosted in the Ocean Biogeochemistry group based at the National Oceanography Centre in Southampton. Specific training will be given in the use of machine learning techniques, statistical analysis of their performance and the handling of large datasets of real-world observations and global model output. 


Eligibility & Funding Details: 

Please see for details.


Background Reading: 

1.  Falkowski, P. Ocean Science: The power of plankton. Nature 483, S17–S20 (2012).

2.  Wiens, J. A., Stralberg, D., Jongsomjit, D., Howell, C. A. & Snyder, M. A. Niches, models, and climate change: Assessing the assumptions and uncertainties. Proc. Natl. Acad. Sci. 106, 19729 (2009).

3.  Dutkiewicz, S. et al. Dimensions of Marine Phytoplankton Diversity. Biogeosciences 17, 609–634 (2020).