Program

Session 1: Introduction to Supervised Learning – Ricardo Silva

• What Machine Learning Is About
• Key General Concepts of Predictive Modelling
• Kernel Methods
• Tree-based Methods and Boosting
• Deep Learning

Session 2: Introduction to Unsupervised Learning – Ricardo Silva

• Clustering
• Dimensionality Reduction
• Autoencoders and Sequence Learning

Session 3: Introduction to Python – Alberto Manfreda

• Python basics: variables, expressions, indentation and comments
• Control flow and mathematical/logical operators
• Strings and basic string formatting
• Data structures (lists, tuples, dictionaries)
• Structured programming in Python

Session 4: Python for Data Science and Machine Learning – Alberto Manfreda

• NumPy
• Pandas
• Matplotlib
• Scikit-learn – part I

Session 5: Fundamentals of Causal Inference – Ricardo Silva

• What Causal Inference Is About
• Languages for Expressing Causal Assumptions
• The Identification Problem: A Structural Causal Model Perspective
• Machine Learning for Estimating Causal Effects

Session 6: Advanced Methods in Machine Learning – Ricardo Silva

• Further Developments in Deep Learning
• Principles of Graphical Models
• Basic Methods in Bayesian Nonparametrics
• An Overview of Reinforcement Learning

Session 7: Further Python Programming Techniques and Tools for Data Science – Alberto Manfreda

• Scikit-learn – part II
• Introduction to classes and objects
• PyTorch

Session 8: Data Science for Cancer Research I – Maria De Iorio

• Introduction to cell biology and measurement technologies

Session 9: Data Science for Cancer Research II – Maria De Iorio

• Statistical Inference for Genomic Studies
• Practical Session with Applications in Cancer Genomics

Session 10: Data Science for Cancer Research III – Maria De Iorio

• Clustering and Data Integration
• Practical Session with Applications in Cancer Genomics


References:

• Sessions 1,2,5,6

a) G. James, D. Witten, T. Hastie and R. Tibshirani, “An Introduction to Statistical Learning”, https://www.statlearning.com
b) Zhang et al., “Dive into Deep Learning”, https://d2l.ai
c) J. Pearl, M. Glymour and N. Jewell, “Causal Inference in Statistics, a Primer”, http://bayes.cs.ucla.edu/PRIMER/

• Sessions 3,4,7

https://docs.python.org/3/tutorial
https://pytorch.org/tutorials/
https://jakevdp.github.io/PythonDataScienceHandbook

• Sessions 8-10

a) Statistical Genomics: Methods and Protocols. Editors: Ewy Mathé and Sean Davis.
b) Statistical Population Genomics. Editor: Julien Y. Dutheil.
c) Shizhong Xu, Principles of Statistical Genomics. Springer.


Biosketches:

Ricardo Silva
is a Professor of Statistical Machine Learning and Data Science at the Department of Statistical Science, UCL. He also holds an Adjunct Faculty position at the Gatsby Computational Neuroscience Unit, UCL, and a Faculty Fellowship at the Alan Turing Institute. Ricardo obtained a PhD in Machine Learning from Carnegie Mellon University in 2005, followed by postdoctoral positions at the Gatsby Unit and at the Statistical Laboratory, University of Cambridge. His main interests are in causal inference, graphical models, and probabilistic machine learning. His research has received funding from organisations such as EPSRC, Innovate UK, the Office of Naval Research, Winton Research and Adobe Research. Ricardo has also served on the senior program committees of several top machine learning conferences, including as a Senior Area Chair at the NeurIPS and ICML conferences and as Program Chair and Conference Chair for the Uncertainty in Artificial Intelligence conference.

Affiliation:
Department of Statistical Science and Adjunct Faculty of the Gatsby Computational Neuroscience Unit, UCL
e-mail: ricardo.silva@ucl.ac.uk

Alberto Manfreda
is a postdoctoral researcher at INFN (Istituto Nazionale di Fisica Nucleare), Pisa.
Since obtaining a PhD in Physics at the University of Pisa in 2018, he has worked in the field of cosmic-ray science and high-energy astrophysics, formerly as a member of the Fermi Large Area Telescope collaboration and currently for the NASA Imaging X-Ray Polarimetry Explorer mission.
A passionate programmer, he has been a guest lecturer on advanced Python in the Computing Methods for Experimental Physics and Data Analysis course at the University of Pisa.

Affiliation:
Istituto Nazionale di Fisica Nucleare, sez. Pisa
e-mail: alberto.manfreda@pi.infn.it

Maria De Iorio
Professor De Iorio has extensive expertise in Bayesian statistics, Bayesian nonparametrics, biostatistics and computational methods. She has a long track record in modelling complex biomedical data and analysing high-throughput data in genomics and metabolomics.

Affiliation:
Yong Loo Lin School of Medicine, National University of Singapore
Singapore Institute for Clinical Sciences (SICS), A*STAR
Department of Statistical Science, University College London
Yale-NUS College, Singapore
e-mail: mdi@nus.edu.sg