Harnessing deep learning for population genetic inference – Nature.com

Wakeley, J. The limits of theoretical population genetics. Genetics 169, 1–7 (2005).

Lewontin, R. C. Population genetics. Annu. Rev. Genet. 1, 37–70 (1967).

Fu, Y.-X. Variances and covariances of linear summary statistics of segregating sites. Theor. Popul. Biol. 145, 95–108 (2022).

Bradburd, G. S. & Ralph, P. L. Spatial population genetics: it's about time. Annu. Rev. Ecol. Evol. Syst. 50, 427–449 (2019).

Ewens, W. J. Mathematical Population Genetics I: Theoretical Introduction 2nd edn (Springer, 2004). This classic textbook covers theoretical population genetics ranging from the diffusion theory to the coalescent theory.

Crow, J. F. & Kimura, M. An Introduction to Population Genetics Theory (Blackburn Press, 2009). This classic textbook introduces the fundamentals of theoretical population genetics.

Pool, J. E., Hellmann, I., Jensen, J. D. & Nielsen, R. Population genetic inference from genomic sequence variation. Genome Res. 20, 291–300 (2010).

Charlesworth, B. & Charlesworth, D. Population genetics from 1966 to 2016. Heredity 118, 2–9 (2017).

Johri, P. et al. Recommendations for improving statistical inference in population genomics. PLoS Biol. 20, e3001669 (2022).

The 1000 Genomes Project Consortium. A global reference for human genetic variation. Nature 526, 68–74 (2015).

Mallick, S. et al. The Allen Ancient DNA Resource (AADR): a curated compendium of ancient human genomes. Preprint at bioRxiv https://doi.org/10.1101/2023.04.06.535797 (2023).

The 1001 Genomes Consortium. 1,135 genomes reveal the global pattern of polymorphism in Arabidopsis thaliana. Cell 166, 481–491 (2016).

Sudlow, C. et al. UK Biobank: an open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLoS Med. 12, e1001779 (2015).

Walters, R. G. et al. Genotyping and population characteristics of the China Kadoorie Biobank. Cell Genom. 3, 100361 (2023).

Schrider, D. R. & Kern, A. D. Supervised machine learning for population genetics: a new paradigm. Trends Genet. 34, 301–312 (2018). This review covers the applications of supervised learning in population genetic inference.

LeCun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521, 436–444 (2015).

Gao, H. et al. The landscape of tolerated genetic variation in humans and primates. Science 380, eabn8153 (2023).

van Hilten, A. et al. GenNet framework: interpretable deep learning for predicting phenotypes from genetic data. Commun. Biol. 4, 1094 (2021).

Vaswani, A. et al. Attention is all you need. In Proc. Advances in Neural Information Processing Systems 30, NIPS 2017 (eds Guyon, I. et al.) 5999–6009 (NIPS, 2017). This study proposes the vanilla transformer architecture, which has become the basis of novel architectures that achieve state-of-the-art performance in different machine learning tasks.

Sohl-Dickstein, J., Weiss, E., Maheswaranathan, N. & Ganguli, S. Deep unsupervised learning using nonequilibrium thermodynamics. In Proc. 32nd International Conference on Machine Learning Vol. 37 (eds Bach, F. & Blei, D.) 2256–2265 (PMLR, 2015).

Nei, M. in Molecular Evolutionary Genetics 327–403 (Columbia Univ. Press, 1987).

Hamilton, M. B. in Population Genetics 53–67 (Wiley-Blackwell, 2009).

Kimura, M. Diffusion models in population genetics. J. Appl. Probab. 1, 177–232 (1964).

Kingman, J. F. C. On the genealogy of large populations. J. Appl. Probab. 19, 27–43 (1982).

Rosenberg, N. A. & Nordborg, M. Genealogical trees, coalescent theory and the analysis of genetic polymorphisms. Nat. Rev. Genet. 3, 380–390 (2002).

Fu, Y.-X. & Li, W.-H. Maximum likelihood estimation of population parameters. Genetics 134, 1261–1270 (1993).

Griffiths, R. C. & Tavaré, S. Monte Carlo inference methods in population genetics. Math. Comput. Model. 23, 141–158 (1996).

Tavaré, S., Balding, D. J., Griffiths, R. C. & Donnelly, P. Inferring coalescence times from DNA sequence data. Genetics 145, 505–518 (1997).

Marjoram, P. & Tavaré, S. Modern computational approaches for analysing molecular genetic variation data. Nat. Rev. Genet. 7, 759–770 (2006).

Williamson, S. H. et al. Simultaneous inference of selection and population growth from patterns of variation in the human genome. Proc. Natl Acad. Sci. USA 102, 7882–7887 (2005).

Wang, M. et al. Detecting recent positive selection with high accuracy and reliability by conditional coalescent tree. Mol. Biol. Evol. 31, 3068–3080 (2014).

Szpiech, Z. A. & Hernandez, R. D. selscan: an efficient multithreaded program to perform EHH-based scans for positive selection. Mol. Biol. Evol. 31, 2824–2827 (2014).

Maclean, C. A., Hong, N. P. C. & Prendergast, J. G. D. hapbin: an efficient program for performing haplotype-based scans for positive selection in large genomic datasets. Mol. Biol. Evol. 32, 3027–3029 (2015).

Huang, X., Kruisz, P. & Kuhlwilm, M. sstar: a Python package for detecting archaic introgression from population genetic data with S*. Mol. Biol. Evol. 39, msac212 (2022).

Borowiec, M. L. et al. Deep learning as a tool for ecology and evolution. Methods Ecol. Evol. 13, 1640–1660 (2022).

Korfmann, K., Gaggiotti, O. E. & Fumagalli, M. Deep learning in population genetics. Genome Biol. Evol. 15, evad008 (2023).

Alpaydin, E. in Introduction to Machine Learning 3rd edn (eds Dietterich, T. et al.) 1–20 (MIT Press, 2014).

Bengio, Y., LeCun, Y. & Hinton, G. Deep learning for AI. Commun. ACM 64, 58–65 (2021).

Sapoval, N. et al. Current progress and open challenges for applying deep learning across the biosciences. Nat. Commun. 13, 1728 (2022).

Bishop, C. M. Model-based machine learning. Philos. Trans. R. Soc. A 371, 20120222 (2013).

Lee, C., Abdool, A. & Huang, C. PCA-based population structure inference with generic clustering algorithms. BMC Bioinform. 10, S73 (2009).

Li, H. & Durbin, R. Inference of human population history from individual whole-genome sequences. Nature 475, 493–496 (2011).

Skov, L. et al. Detecting archaic introgression using an unadmixed outgroup. PLoS Genet. 14, e1007641 (2018).

Chen, H., Hey, J. & Slatkin, M. A hidden Markov model for investigating recent positive selection through haplotype structure. Theor. Popul. Biol. 99, 18–30 (2015).

Lin, K., Li, H., Schlötterer, C. & Futschik, A. Distinguishing positive selection from neutral evolution: boosting the performance of summary statistics. Genetics 187, 229–244 (2011).

Schrider, D. R., Ayroles, J., Matute, D. R. & Kern, A. D. Supervised machine learning reveals introgressed loci in the genomes of Drosophila simulans and D. sechellia. PLoS Genet. 14, e1007341 (2018).

Durvasula, A. & Sankararaman, S. A statistical model for reference-free inference of archaic local ancestry. PLoS Genet. 15, e1008175 (2019).

Goodfellow, I., Bengio, Y. & Courville, A. Deep Learning (MIT Press, 2016). This classic textbook introduces the fundamentals of deep learning.

Eraslan, G., Avsec, Z., Gagneur, J. & Theis, F. J. Deep learning: new computational modelling techniques for genomics. Nat. Rev. Genet. 20, 389–403 (2019).

Villanea, F. A. & Schraiber, J. G. Multiple episodes of interbreeding between Neanderthals and modern humans. Nat. Ecol. Evol. 3, 39–44 (2019).

Unadkat, S. B., Ciocoiu, M. M. & Medsker, L. R. in Recurrent Neural Networks: Design and Applications (eds Medsker, L. R. & Jain, L. C.) 1–12 (CRC, 1999).

Géron, A. Neural networks and deep learning (O'Reilly Media Inc., 2018).

Sheehan, S. & Song, Y. S. Deep learning for population genetic inference. PLoS Comput. Biol. 12, e1004845 (2016).

Mondal, M., Bertranpetit, J. & Lao, O. Approximate Bayesian computation with deep learning supports a third archaic introgression in Asia and Oceania. Nat. Commun. 10, 246 (2019).

Sanchez, T., Cury, J., Charpiat, G. & Jay, F. Deep learning for population size history inference: design, comparison and combination with approximate Bayesian computation. Mol. Ecol. Resour. 21, 2645–2660 (2021).

Tran, L. N., Sun, C. K., Struck, T. J., Sajan, M. & Gutenkunst, R. N. Computationally efficient demographic history inference from allele frequencies with supervised machine learning. Preprint at bioRxiv https://doi.org/10.1101/2023.05.24.542158 (2023).
