Data Analysis and Drug Design (3 ECTS)
(Course coordinators: A-C CAMPROUX & L. REGAD)
Program:
Optimize and combine different learning methods on drug design datasets. Example in the
molecule space.
• Unsupervised methods: descriptive or exploratory methods that algorithmically group objects
into classes:
Factorial methods (Principal Component Analysis) and clustering methods (hierarchical or
partitioning)
• Supervised methods: explanatory and/or predictive methods:
Cross-validation protocol, optimization criteria.
Linear and PLS models, k-NN, CART, logistic regression
Introduction to SVMs and an opening towards deep learning
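As an illustration of the cross-validation protocol listed above, here is a minimal sketch that selects the number of neighbours for a k-NN classifier by k-fold cross-validation. It is not course material: the course itself uses R, whereas this sketch is written in plain Python for self-containment. The function names (knn_predict, cv_accuracy), the two-descriptor toy dataset, and the candidate values of k are all assumptions made for the example.

```python
import random
from collections import Counter


def knn_predict(train_X, train_y, x, k):
    """Predict the majority label among the k nearest training points."""
    # Squared Euclidean distance is sufficient for ranking neighbours.
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def cv_accuracy(X, y, k, n_folds=5, seed=0):
    """Estimate k-NN accuracy by n_folds-fold cross-validation."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    # Each shuffled index lands in exactly one held-out fold.
    folds = [idx[i::n_folds] for i in range(n_folds)]
    correct = 0
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        train_X = [X[i] for i in train]
        train_y = [y[i] for i in train]
        correct += sum(
            knn_predict(train_X, train_y, X[i], k) == y[i] for i in fold
        )
    return correct / len(X)


# Hypothetical toy data: 60 "molecules" described by two made-up descriptors,
# drawn from two well-separated classes (e.g. inactive = 0, active = 1).
rng = random.Random(42)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(30)] + \
    [[rng.gauss(4, 1), rng.gauss(4, 1)] for _ in range(30)]
y = [0] * 30 + [1] * 30

# Optimization criterion (assumed here): cross-validated accuracy.
scores = {k: cv_accuracy(X, y, k) for k in (1, 3, 5, 7)}
best_k = max(scores, key=scores.get)
```

The same fold assignment is reused for every candidate k (fixed seed), so the models are compared on identical splits. With real descriptor data, the descriptors would normally be scaled before computing distances.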
Targeted skills:
To teach students to combine and optimize different unsupervised and supervised learning methods to analyze drug design data, both in the target space (structure-based applications) and in the molecule space (ligand-based applications), together with the particular problems involved: descriptor selection and selection criteria, optimization, and the comparison and robustness of models under cross-validation.
Applications and projects will be carried out using various packages of the statistical software R.
Evaluation: Comparative reports on unsupervised methods and on supervised methods, plus a presentation of articles or book chapters.