Data Analysis with R (DSC-2023-07) | 3 days


06.11. - 08.11.2023


9:00 AM - 4:30 PM

Workshop for PhD students and Postdocs


Speaker:
Fiona Katharina Ewald


Location:
Cartesium
Room 0.001

The workshop will be held in English.

The workshop is already fully booked.

If you have any questions regarding our workshops, please feel free to write us an e-mail.







OBJECTIVES

Participants will learn the basics of R, including a technical introduction to R syntax. The course is suitable for participants with no prior knowledge of R as well as for those who want to refresh the basics. Participants will also learn the most important concepts and terms in statistics and data analysis, and how to carry out their first exploratory and inferential statistical analyses in R.

WORKSHOP CONTENT

  • Working with R and RStudio
  • Installing and using extension packages in R
  • Introduction to help pages and tips for self-help
  • Explanation of the most important data types, operators (arithmetic and logical) and functions in R
  • Importing and exporting data
  • Working with data frames and vectors (numeric, logical, character, factors), e.g. indexing, splitting and converting variables or data sets
  • Calculating simple summary statistics in R (e.g. median, mean, quantiles, variance, etc.)
  • Definition of Data Science and other basic terms
  • Introduction to ggplot2 for data visualisation
  • Univariate descriptive statistics and data visualisation in R: frequency tables, bar charts, histograms, kernel density estimation, box plots, densities and distributions, QQ plots, etc.
  • Multivariate descriptive statistics and data visualisation in R: cross tables, scatter plots, correlation
  • Introduction to statistical inference: point estimation, interval estimation and confidence intervals
  • Motivation and overview of statistical hypothesis testing
  • Interpretation of results and explanation of terms related to hypothesis tests: significance level, p-value, test statistic, etc.
  • Tests covered: t-test and Welch test (tests for differences in means), Mann-Whitney U test (also known as the Wilcoxon rank sum test), Shapiro-Wilk test (test for normality), Kolmogorov-Smirnov test
  • Multiple testing: problems and solutions (e.g. Bonferroni correction)
  • Introduction to the linear regression model
  • Model evaluation and model diagnosis: MSE, R-squared, QQ plots and residual analysis
  • Outlook: generalised linear models with a focus on logistic regression

TARGET AUDIENCE

This course is suitable for participants with no prior knowledge of R and for those who want to refresh the basics. The workshop is very hands-on and therefore limited to a maximum of 15 participants.

TECHNICAL REQUIREMENTS

Use a laptop/PC with reliable internet access and install the following software:



ABOUT THE TRAINER

Fiona Katharina Ewald specializes in the field of Interpretable Machine Learning. She holds a Bachelor’s degree in Business Mathematics (B.Sc.) and a Master’s degree in Economics with a specialization in Statistics (M.Sc.), both of which she successfully completed at the University of Duisburg-Essen.