Supervised Machine Learning in R (DSC-2023-08) | 2.5 days


20 - 23 November 2023


09:00 - 13:00


Workshop for doctoral candidates and postdocs


Instructor:
Fiona Katharina Ewald

Location:
Online (Zoom)


Language: English

The workshop is already fully booked.

If you have questions about our workshops, feel free to send us an email.







OBJECTIVES

This course covers fundamental concepts and advanced techniques in machine learning. Participants will learn to train and evaluate supervised learning models, explore various supervised ML algorithms, and gain practical skills in interpreting complex machine learning algorithms. It is aimed at experienced R users.

WORKSHOP CONTENT

  • General tasks in machine learning (regression, classification, clustering, etc.)
  • Introduction to fundamental terms (loss function, risk minimization, overfitting, hyperparameters, training and test data, etc.)
  • Linear and logistic regression from an ML perspective, and the k-NN algorithm
  • Important evaluation metrics for regression and classification and their characteristics
  • Resampling methods (cross-validation, bootstrap, etc.) and their pros and cons
  • Use Case: Training a first simple model, making predictions, measuring performance
  • Use Case: Resampling and Benchmarking of ML Algorithms in R
  • Functionality of simple key machine learning algorithms:
      • Regression and classification trees
      • Random Forests
  • Hyperparameter optimization (random search and grid search)
  • Nested cross-validation for optimal model selection
  • Pitfalls and practical tips in model evaluation and selection
  • Use Case: Training and comparing decision trees and random forests
  • Use Case: Proper model selection based on nested resampling
  • Motivation for model-agnostic interpretability methods and the need for them in practical ML applications
  • Motivation and overview of statistical hypothesis testing
  • Introduction to basic terms such as local and global interpretability
  • Feature importance methods to quantify the relevance of features
  • Multiple testing: problems and solutions (e.g. Bonferroni correction)
  • Feature effect methods to visualize local and global feature effects
  • Final use case

TARGET AUDIENCE

PhD students and postdocs who are proficient in R and have solid data analysis skills in R. A general understanding of data analysis and statistics is required. The workshop is very hands-on and therefore limited to a maximum of 15 participants.

TECHNICAL REQUIREMENTS

Use a laptop/PC with reliable internet access and install the following software:



ABOUT THE TRAINER

Fiona Katharina Ewald specializes in Interpretable Machine Learning. She holds a Bachelor’s degree in Business Mathematics (B.Sc.) and a Master’s degree in Economics with a specialization in Statistics (M.Sc.), both from the University of Duisburg-Essen.




The Data Science Center is funded by the BMBF and the EU.