How good are statistical models at approximating complex fitness landscapes?
Fitness landscapes determine the course of adaptation by constraining and shaping evolutionary trajectories. Knowledge of the structure of a fitness landscape can thus predict evolutionary outcomes. Empirical fitness landscapes, however, have so far only offered limited insight into real-world quest...
Main Authors: | du Plessis, L; Leventhal, GE; Bonhoeffer, S |
---|---|
Format: | Journal article |
Language: | English |
Published: | Oxford University Press, 2016 |
_version_ | 1826259104626114560 |
---|---|
author | du Plessis, L Leventhal, GE Bonhoeffer, S |
collection | OXFORD |
description | Fitness landscapes determine the course of adaptation by constraining and shaping evolutionary trajectories. Knowledge of the structure of a fitness landscape can thus predict evolutionary outcomes. Empirical fitness landscapes, however, have so far only offered limited insight into real-world questions, as the high dimensionality of sequence spaces makes it impossible to exhaustively measure the fitness of all variants of biologically meaningful sequences. We must therefore revert to statistical descriptions of fitness landscapes that are based on a sparse sample of fitness measurements. It remains unclear, however, how much data are required for such statistical descriptions to be useful. Here, we assess the ability of regression models accounting for single and pairwise mutations to correctly approximate a complex quasi-empirical fitness landscape. We compare approximations based on various sampling regimes of an RNA landscape and find that the sampling regime strongly influences the quality of the regression. On the one hand it is generally impossible to generate sufficient samples to achieve a good approximation of the complete fitness landscape, and on the other hand systematic sampling schemes can only provide a good description of the immediate neighborhood of a sequence of interest. Nevertheless, we obtain a remarkably good and unbiased fit to the local landscape when using sequences from a population that has evolved under strong selection. Thus, current statistical methods can provide a good approximation to the landscape of naturally evolving populations. |
first_indexed | 2024-03-06T18:44:36Z |
format | Journal article |
id | oxford-uuid:0e115768-4815-456a-9654-8152b0fffc85 |
institution | University of Oxford |
language | English |
last_indexed | 2024-03-06T18:44:36Z |
publishDate | 2016 |
publisher | Oxford University Press |
record_format | dspace |
title | How good are statistical models at approximating complex fitness landscapes? |