Solution Exercise 2b

Data Simulation with Monte Carlo Methods

Marko Bachl

University of Hohenheim

Group exercise 2b

Exercise 2: Welch’s t-test for count data

Part 2: Power

  • We have shown in Part 1 of the exercise that the false discovery rates of t-tests and Poisson regression are similar even if the true data generating process is drawing from a Poisson distribution.

  • In Part 2, we want to compare the statistical power of the methods across a range of plausible scenarios.

Exercise 2b: Power

  • We want to compare the statistical power of

    • Welch’s t-test and
    • Poisson regression with a dummy predictor
  • if the true data generating process is drawing from Poisson distributions with different rate parameters.

  • If necessary: Adapt the simulation function from Exercise 2a.

  • Choose a sensible set of parameter combinations for the Monte Carlo experiment.

Simulation function

# We use almost the same simulation function, 
# but we omit Student's t-test, because we already showed its
# problems with unequal group sizes and group SDs
sim_ttest_glm = function(n = 200, GR = 1, lambda1 = 1, M_diff = 0) {
  n1 = round(n / (GR + 1))
  n2 = round(n1 * GR)
  lambda2 = lambda1 + M_diff
  g1 = rpois(n = n1, lambda = lambda1)
  g2 = rpois(n = n2, lambda = lambda2)
  Welch = t.test(g1, g2)$p.value
  GLM = glm(outcome ~ group,
            data = data.frame(outcome = c(g1, g2),
                              group = c(rep("g1", n1), rep("g2", n2))),
            family = poisson)
  res = lst(Welch, GLM = coef(summary(GLM))[2, 4])
  return(res)
}
sim_ttest_glm(n = 100, M_diff = 0.3)
$Welch
[1] 0.6844947

$GLM
[1] 0.6573568

Conditions: First try

# It makes sense to vary group size ratio, lambda, and M_diff
# The positive relationship between n and power is trivial
conditions = expand_grid(GR = c(0.5, 1, 2), 
                         n = 200,
                         lambda1 = c(1, 10, 20),
                         M_diff = c(0.3, 0.5, 0.8)) %>% 
  rowid_to_column(var = "condition")
conditions
# A tibble: 27 × 5
   condition    GR     n lambda1 M_diff
       <int> <dbl> <dbl>   <dbl>  <dbl>
 1         1   0.5   200       1    0.3
 2         2   0.5   200       1    0.5
 3         3   0.5   200       1    0.8
 4         4   0.5   200      10    0.3
 5         5   0.5   200      10    0.5
 6         6   0.5   200      10    0.8
 7         7   0.5   200      20    0.3
 8         8   0.5   200      20    0.5
 9         9   0.5   200      20    0.8
10        10   1     200       1    0.3
# … with 17 more rows

Run simulation

# use smaller i for now because additional glm() needs time
set.seed(89)
i = 1000
tic()
sims = map_dfr(1:i, ~ conditions) %>%
  rowid_to_column(var = "sim") %>%
  rowwise() %>%
  mutate(p.value = list(sim_ttest_glm(n = n, M_diff = M_diff,
                                      GR = GR, lambda1 = lambda1))) %>% 
  unnest_longer(p.value, indices_to = "method")
toc()
61.052 sec elapsed

Results: Power

sims %>% 
  group_by(GR, lambda1, M_diff, method) %>% 
  summarise(P_p05 = mean(p.value < 0.05)) %>% 
  ggplot(aes(factor(GR), P_p05, color = method, group = method)) + 
  geom_point() + geom_line() +
  facet_grid(M_diff ~ lambda1, labeller = label_both) +
  labs(x = "Group size ratio", y = "Power")

Results: Power

Summary: First try

  • (In hindsight) obvious result: Power strongly decreases as the rate parameter \(\lambda\) in the control group increases.

  • Obvious, because the group standard deviations are \(\sqrt{\lambda_1}\) and \(\sqrt{\lambda_1 + \sf{M\_diff}}\).

  • t-test and Poisson regression did very similarly, as far as this can be seen against the strong differences by the rate in the control group.

  • Next, we want to vary a standardized mean difference, d.

Conditions: Second try

  • It is not straightforward to vary the effect size as defined by the mean difference divided by the pooled population standard deviation.

  • We opt for a simpler option: Standardization by the standard deviation in the control group, i.e., \(\sqrt{\lambda_1}\).

conditions = expand_grid(GR = c(0.5, 1, 2), 
                         n = 200,
                         lambda1 = c(1, 10, 20),
                         d = c(0.2, 0.4, 0.6)) %>% # standardized mean difference.
  mutate(M_diff = d * sqrt(lambda1)) %>% # absolute mean difference
  rowid_to_column(var = "condition")
conditions
# A tibble: 27 × 6
   condition    GR     n lambda1     d M_diff
       <int> <dbl> <dbl>   <dbl> <dbl>  <dbl>
 1         1   0.5   200       1   0.2  0.2  
 2         2   0.5   200       1   0.4  0.4  
 3         3   0.5   200       1   0.6  0.6  
 4         4   0.5   200      10   0.2  0.632
 5         5   0.5   200      10   0.4  1.26 
 6         6   0.5   200      10   0.6  1.90 
 7         7   0.5   200      20   0.2  0.894
 8         8   0.5   200      20   0.4  1.79 
 9         9   0.5   200      20   0.6  2.68 
10        10   1     200       1   0.2  0.2  
# … with 17 more rows

Run simulation

# use smaller i for now because additional glm() needs time
set.seed(525)
i = 1000
tic()
sims = map_dfr(1:i, ~ conditions) %>%
  rowid_to_column(var = "sim") %>%
  rowwise() %>%
  mutate(p.value = list(sim_ttest_glm(n = n, M_diff = M_diff,
                                      GR = GR, lambda1 = lambda1))) %>% 
  unnest_longer(p.value, indices_to = "method")
toc()
63.788 sec elapsed

Results: Power

sims %>% 
  group_by(GR, lambda1, d, method) %>% 
  summarise(P_p05 = mean(p.value < 0.05)) %>% 
  ggplot(aes(factor(GR), P_p05, color = method, group = method)) + 
  geom_point() + geom_line() +
  facet_grid(d ~ lambda1, labeller = label_both, scales = "free_y") +
  labs(x = "Group size ratio", y = "Power")

Results: Power

Summary: Second try

  • Welch’s t-test and Poisson regression perform very similarly in terms of statistical power.

  • If the only goal is to compare some group means, the simpler t-test (or a linear model with heteroskedasticity-consistent standard errors) might be good enough.

Questions?