Data Simulation with Monte Carlo Methods

Marko Bachl

University of Hohenheim

Hello there :)

Introduction (I)

  • Hi, I’m Marko.

 

  • I am not a statistician, econometrician, psychometrician, or any kind of *ician.

  • Most importantly, I am also not a mathematician — this is why I often need simulation methods.

  • I am also not a computer scientist or trained programmer.

Introduction (II)

  • I am a social (communication) scientist with an interest in quantitative methods.

  • I use data simulation methods to teach myself and others, and in my applied and methodological research.

Introduction (III)

Random sample in R

sample(x = 30, size = 4)
[1]  4 28 24  2
  • Who are you — and why are you here?
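The `sample()` call above returns a different draw on every run. A minimal sketch of how the draw could be made reproducible, assuming an arbitrary seed (the value 2022 is a placeholder for illustration, not taken from the slides):

```r
# Fix the state of the random number generator so the draw can be repeated.
# The seed value is an arbitrary choice for illustration.
set.seed(2022)
first_draw <- sample(x = 30, size = 4)   # 4 distinct integers from 1:30

# Resetting the same seed reproduces exactly the same draw.
set.seed(2022)
second_draw <- sample(x = 30, size = 4)

identical(first_draw, second_draw)       # TRUE
```

Setting a seed at the top of a simulation script is a common convention, because it lets others rerun the script and get the same numbers.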

Agenda

  1. Introduction & overview

  2. Monte Carlo Simulation?

  3. Proof by simulation — The Central Limit Theorem (CLT)

  4. Errors and power — Torturing the t-test

  5. Misclassification and bias — Messages mismeasured

  6. Outlook: What’s next?

Overview (I)

Introduction to Monte Carlo Simulation Methods

  • How to think about simulation experiments
  • How to get started in R

How we will cover applied examples

  • Simple, readable code, mostly {tidyverse}
  • Data simulation “from scratch”
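As a rough idea of what simulating data "from scratch" with {tidyverse} code might look like, here is a minimal sketch. The variable names, sample size, and effect size are hypothetical choices for illustration and do not come from the workshop materials:

```r
library(tibble)
library(dplyr)

set.seed(1)   # arbitrary seed, for reproducibility only

n <- 100      # hypothetical total sample size

# Simulate two groups whose true means differ by 0.5 standard deviations.
# tibble() evaluates columns in order, so `y` can refer to `group`.
sim_data <- tibble(
  group = rep(c("control", "treatment"), each = n / 2),
  y     = rnorm(n, mean = if_else(group == "treatment", 0.5, 0), sd = 1)
)

# Compare the simulated group means with the true values (0 and 0.5).
group_means <- sim_data |>
  group_by(group) |>
  summarise(mean_y = mean(y), sd_y = sd(y))

group_means
```

Because the true parameters are set by hand, the estimates can be compared directly with the values that generated the data.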

Overview (II)

What we will not cover

  • Packages for simulation experiments and data simulation (easier to use, but harder to understand)
  • Advanced computing and programming topics (not that hard to understand until it's very hard to understand; a very different skill set)
  • Some resources on both at the end

Workshop concept

  • Me talking: Lecture with code illustrations

  • You talking: Questions

  • All talking: Group exercises in breakout rooms

Main workshop content

  • We will work through three examples of increasing complexity (proving the CLT, testing t-tests, and misclassification in content analysis).

  • While doing so, we will address many general issues on how to get started with Monte Carlo simulation, both conceptually and in practice with R.
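To give a first impression of the general pattern (simulate data, compute a statistic, repeat many times, summarise), here is a minimal base R sketch in the spirit of the CLT example. It is an illustration under assumed settings, not the workshop's own code: the exponential population, the number of repetitions, and the sample size are all hypothetical choices.

```r
set.seed(123)    # arbitrary seed, for reproducibility only

n_sims <- 5000   # number of simulated samples (hypothetical choice)
n_obs  <- 50     # observations per sample (hypothetical choice)

# Steps 1 and 2: draw a sample from a skewed population and compute its
# mean, repeated n_sims times. The exponential distribution with rate 1
# has mean 1 and standard deviation 1.
sample_means <- replicate(n_sims, mean(rexp(n_obs, rate = 1)))

# Step 3: summarise the simulated sampling distribution of the mean.
mean(sample_means)   # should be close to the population mean, 1
sd(sample_means)     # should be close to the theoretical standard error
1 / sqrt(n_obs)      # theoretical standard error, about 0.141

# The histogram should look roughly normal although the population is skewed.
hist(sample_means, breaks = 50)
```

Changing `n_obs` and rerunning the script is a quick way to see how the shape and spread of the sampling distribution depend on the sample size.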

Resources

Slides, scripts, and exercises:

R & packages:

R (Vers. 4.2.1), dplyr (Vers. 1.0.9), extraDistr (Vers. 1.9.1), forcats (Vers. 0.5.1), ggplot2 (Vers. 3.4.0), hrbrthemes (Vers. 0.8.6), knitr (Vers. 1.39), mgcv (Vers. 1.8.40), nlme (Vers. 3.1.157), purrr (Vers. 0.3.4), readr (Vers. 2.1.2), stringr (Vers. 1.5.0), tibble (Vers. 3.1.8), tictoc (Vers. 1.1), tidyr (Vers. 1.2.0), tidyverse (Vers. 1.3.2)
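A minimal sketch of how one might check which of the packages listed above are available locally before the workshop. The vector name `pkgs` is a hypothetical helper; newer versions than those listed will probably also work, but that is an assumption:

```r
# Packages named in the list above (R itself excluded).
pkgs <- c(
  "dplyr", "extraDistr", "forcats", "ggplot2", "hrbrthemes", "knitr",
  "mgcv", "nlme", "purrr", "readr", "stringr", "tibble", "tictoc",
  "tidyr", "tidyverse"
)

# requireNamespace() returns TRUE if a package can be loaded, FALSE otherwise.
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)

# Names of packages that still need to be installed (possibly empty).
missing_pkgs <- pkgs[!installed]
missing_pkgs

# Uncomment to install whatever is missing from CRAN:
# install.packages(missing_pkgs)
```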

Questions?