In chemistry and biology, data science approaches and machine learning (ML) methods may dramatically enhance our ability to design molecules that tune the functions of biological systems, with implications for the development of new drugs and novel processes and for our ability to model highly complex systems. The aim of this day is to give students the tools they need to make use of the predictive power and interpretability of models, to critically understand the correlations between the structure and properties of biological systems, and to apply advanced data science and machine learning methods to drug design. These aims will be achieved through presentations by high-profile invited speakers and through group work in which students discuss questions with the experts, such as: ‘What are the important factors for successfully applying ML/AI in a project?’ and ‘Can you come up with a new application of AI/ML within a drug discovery setting?’
Medicinal Chemistry – The Past, The Present and The Future. Drug design has changed significantly over the past decades, influenced by scientific discoveries and technological advancements. The advent of cool new technologies and the increased use of automation have also played a large part in this development. Computer programs with clever algorithms can help shorten the path from idea to medicine by making the process easier and faster, with the hope of better end results: effective and safe drugs. Many of today’s medicinal chemists embrace this mindset and use computers as their main tool in everyday work. Right now artificial intelligence (AI) is in vogue, and we at AstraZeneca are investigating how best to use it. In this talk, I will discuss how medicinal chemistry (chemical synthesis and molecular design) has changed over the years, and I will present a new AI-based de novo design method, showcasing real-life project examples.
Speaker:
Jonas Boström
Jonas Boström is a computational medicinal scientist who currently works as a principal scientist at AstraZeneca, Gothenburg, Sweden. He gained his M.Sc. in Chemistry at the University of Göteborg (Sweden) in 1996 and his Ph.D. in computational medicinal chemistry with Professor Tommy Liljefors at the University of Copenhagen, Denmark, in 2000. Shortly after finishing he joined the computational chemistry group at H. Lundbeck A/S (Denmark) before moving to AstraZeneca (Sweden), where he has stayed since in various roles. During his 20 years in the pharmaceutical industry he has been exposed to most aspects of pre-clinical drug discovery. This has resulted in his being named co-inventor on 14 patent specifications (10 candidate drugs, including one launched drug). His main interests include all aspects of drug discovery’s digital chemistry future. Some of this work is summarized in more than 35 peer-reviewed articles. Jonas is also an associate professor in Medicinal Chemistry at the University of Gothenburg, an adjunct professor of Information Technology at the University of Skövde, and CEO and co-founder of EduChem VR, a start-up gamifying chemistry education using virtual reality.
Affiliation: CVMD Innovative Medicines, IMED Biotech Unit, AstraZeneca, Mölndal SE-431 83, Sweden E-mail: Jonas.Bostrom@astrazeneca.com
Bayesian Inference and Gaussian Process Regression in Physical Chemistry. In my lesson I will first provide an essential introduction to the theory of Bayesian inference and Gaussian process (GP) regression, stressing the importance of controlling prior information and estimating prediction uncertainty in order to build accurate, transferable, and interpretable models in physics and chemistry. I will then show how GP regression can be (and has been) applied to learn and predict properties of atoms and molecules. To do so, one first needs to represent atomic positions as a compact vector of numbers: a descriptor. We will analyse and compare the best-known descriptors, and then study some of the problems where they have been successfully applied in conjunction with GP regression, notable examples being the prediction of atomisation energies of molecules and the interpolation of atomic force fields. During the lesson I will encourage the development of a practical understanding of the tools and concepts discussed through the use and extension of specifically prepared snippets of code.
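As a rough illustration of the kind of snippet this lesson refers to, the following is a minimal sketch of GP regression on a one-dimensional toy problem. It is not the speaker's own material. The squared-exponential kernel, the hyperparameter values, the sine-function training data, and all function names (`rbf_kernel`, `gp_predict`, and the small linear-algebra helpers) are assumptions chosen for illustration. It uses only the Python standard library so that it runs anywhere; in practice one would more likely use NumPy or a dedicated GP library. The point it tries to show is that the GP returns a predictive variance alongside the mean, and that this variance grows away from the training data.

```python
import math

def rbf_kernel(a, b, length_scale=1.0, signal_var=1.0):
    """Squared-exponential kernel: encodes the prior belief that the target is smooth."""
    return signal_var * math.exp(-0.5 * ((a - b) / length_scale) ** 2)

def cholesky(A):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L

def solve_lower(L, b):
    """Solve L y = b by forward substitution."""
    y = [0.0] * len(b)
    for i in range(len(b)):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    return y

def solve_lower_transposed(L, y):
    """Solve L^T x = y by back substitution."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def gp_predict(x_train, y_train, x_query, noise_var=1e-6, length_scale=1.0):
    """Posterior mean and variance of a zero-mean GP at a single query point."""
    n = len(x_train)
    # Kernel matrix of the training points, with noise variance on the diagonal.
    K = [[rbf_kernel(x_train[i], x_train[j], length_scale)
          + (noise_var if i == j else 0.0) for j in range(n)] for i in range(n)]
    L = cholesky(K)
    # alpha = K^{-1} y, obtained from two triangular solves.
    alpha = solve_lower_transposed(L, solve_lower(L, y_train))
    k_star = [rbf_kernel(x_query, xi, length_scale) for xi in x_train]
    mean = sum(k * a for k, a in zip(k_star, alpha))
    v = solve_lower(L, k_star)
    var = rbf_kernel(x_query, x_query, length_scale) - sum(vi * vi for vi in v)
    return mean, max(var, 0.0)  # clamp tiny negative values caused by round-off

# Toy data (an assumption for illustration): samples of sin(x).
x_train = [0.0, 1.0, 2.0, 3.0, 4.0]
y_train = [math.sin(x) for x in x_train]

for xq in (2.0, 2.5, 10.0):
    mean, var = gp_predict(x_train, y_train, xq)
    print(f"x = {xq:5.1f}  mean = {mean:+.4f}  std = {math.sqrt(var):.4f}")
```

In an atomistic setting the scalar input `x` would be replaced by a descriptor vector and the kernel by a similarity measure between descriptors, but the structure of the prediction (a kernel matrix, a linear solve, and a variance estimate) stays the same.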
References:
Rupp, International Journal of Quantum Chemistry (2015), “Machine learning for quantum mechanics in a nutshell”
Zeni et al., Advances in Physics X (2019), “On machine learning force fields for metallic nanoparticles” (Sections 2 and 3)
Himanen et al., Computer Physics Communications (2020), “DScribe: Library of descriptors for machine learning in materials science”
Speaker:
Aldo Glielmo
Aldo Glielmo is a postdoctoral researcher at the International School for Advanced Studies (SISSA). He is currently working on the development of computational methods to characterise the high-dimensional manifolds of datasets endowed with a distance metric, with applications to the analysis of materials databases as well as to the theory of deep neural networks. He obtained his Ph.D. from King’s College London, where he developed machine learning models for interatomic potentials and for many-body wave functions. He has also visited the International Centre for Theoretical Physics (ICTP) and the Alan Turing Institute, carrying out research on Bayesian model selection and spectral clustering.
Affiliation: International School for Advanced Studies (SISSA) E-mail: aglielmo@sissa.it
Exploration of Molecular Recognition Processes Using Machine Learning. Ligand- and structure-based drug design (LBDD and SBDD) tools and approaches serve as a foundation technology in drug discovery, design and optimization. In this presentation, we will discuss what drives the use of computational LBDD and SBDD tools, in particular, in an endeavor that is at its core experimental. Docking (so-called “posing”) calculations coupled with binding free energy estimates (scoring) have emerged as a key technology in SBDD. Docking and scoring methods have steadily improved over the years, but remain a challenge because of the extensive sampling that is required, the need for accurate scoring functions, and the difficulties encountered in accurately estimating entropy effects. To address these issues, free energy perturbation methods, first described in the context of SBDD in the late 1980s, have enjoyed a renaissance and will be briefly discussed. To further address these issues, we have been developing a number of novel strategies in our laboratory. In particular, we will describe the use of machine learning (ML) techniques to discriminate between native and non-native protein structures and between native and non-native poses for protein-ligand complexes. Specifically, we use a knowledge-based potential or a physics-based potential function (e.g., an AMBER force field) combined with Random Forest ML techniques to derive models that excel at these tasks. Interestingly, we find that the strength (or well depth) of the potential function interaction plays only a minor role in the capability of the derived models, and that what is important is the location of the potential minima. We will describe the results of our ML studies and discuss the future role that ML and deep learning techniques will play in addressing problems of this type.
Speaker:
Kenneth M. Merz Jr.
Kenneth M. Merz, Jr. is currently the Director of the Institute for Cyber Enabled Research (iCER) and the Joseph Zichis Chair in Chemistry at Michigan State University. Since the start of 2014, he has been the Editor-in-Chief of the Journal of Chemical Information and Modeling, which is part of the American Chemical Society suite of chemistry journals. His research interest lies in the development of theoretical and computational tools and their application to biological problems, including structure- and ligand-based drug design, mechanistic enzymology, and methodological verification and validation (i.e., error analysis). In his research he makes extensive use of quantum and molecular mechanical potential functions coupled with a variety of numerical methods, including molecular dynamics and Monte Carlo approaches. He also makes extensive use of structure-based drug design tools (e.g., docking and QSAR) along with cheminformatics and bioinformatics resources in his research program. He has also been heavily involved in software development (e.g., parallel computing and GPU programming) aimed at taking advantage of high-performance computing (HPC) resources to solve chemical and biological problems. As a result of his research efforts he has published over 300 papers and given over 300 lectures worldwide describing his research. Prior to his current position at Michigan State University he was a University of Florida Research Foundation Professor, the Edmund H. Prominski Professor of Chemistry, the Colonel Allan R. and Margaret G. Crow Term Professor, and a Member of the Quantum Theory Project, all at the University of Florida from 2005-2013. Prior to the University of Florida he was an Assistant, Associate and Professor of Chemistry at the Pennsylvania State University from 1989-2005. He has also worked in industry (1998-2001), first as the Senior Director of the Center for Informatics and Drug Discovery (CIDD) at Pharmacopeia, Inc. (now part of Ligand, Inc.) and then as the Senior Director of the ADMET Research and Development Group in the Accelrys software division of Pharmacopeia (now part of Dassault Systèmes and renamed BIOVIA). He is the founder of the software company QuantumBio, Inc., located in State College, Pennsylvania. Dr. Merz carried out postdoctoral training at The University of California, San Francisco (1987-1989, with Peter Kollman) and at Cornell University (1986-1987, with Roald Hoffmann). He received his Ph.D. in Organic Chemistry at The University of Texas at Austin in 1985 (with M. J. S. Dewar) and his B.S. from Washington College, Chestertown, Maryland in 1981. He has received a number of honors, including election as the 2013 Chair of the COMP division of the ACS, election as an ACS Fellow, the 2010 ACS Award for Computers in Chemical and Pharmaceutical Research, election as a fellow of the American Association for the Advancement of Science, and a John Simon Guggenheim Fellowship. He has held visiting professorships at Imperial College (London, England), the Institute for Research in Biomedicine (Barcelona, Spain), École Polytechnique (Paris, France), University of Florence (Florence, Italy), The University of Strasbourg (Strasbourg, France), The University of Oviedo (Oviedo, Spain) and the ETH (Zurich, Switzerland).
Affiliation: Editor-in-Chief, Journal of Chemical Information and Modeling; Joseph Zichis Chair in Chemistry; Department of Chemistry; Department of Biochemistry and Molecular Biology; Michigan State University, 578 S. Shaw Lane, East Lansing, MI 48824-1322 E-mail: merz@chemistry.msu.edu; kmerz1@gmail.com