Supplementary Materials for “Vaccine-related Discussions in Online Communities for Parents”
2020-02-25
Section 1 Introduction
This document provides supplementary materials: documentation of the data collection and model selection, and an overview of all topics. We cannot provide full replication data because of copyright concerns and the privacy of the online communities' users. To enable computational reproducibility, we provide the document-term matrix and some metadata for the posts (platform, publication date), which were used to estimate the structural topic models. Requests from researchers for access to more detailed data will be given full consideration.
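As a rough illustration of how the provided document-term matrix and post metadata could be used to re-estimate a structural topic model, consider the minimal sketch below. It is not the authors' script: the helper name `fit_stm_sketch`, the column names `platform` and `date`, the number of topics `k`, and the prevalence formula are all assumptions made for illustration, and the document-term matrix is assumed to be a quanteda `dfm`.

```r
# Hypothetical helper (not from the original scripts): fits a structural
# topic model from a quanteda dfm and a data frame of post metadata.
# Assumptions: `meta` has one row per document and contains the columns
# `platform` and `date`; `k` and the prevalence formula are placeholders.
fit_stm_sketch <- function(dtm, meta, k = 10, seed = 1) {
  stopifnot(
    requireNamespace("quanteda", quietly = TRUE),
    requireNamespace("stm", quietly = TRUE),
    all(c("platform", "date") %in% names(meta)),
    nrow(meta) == quanteda::ndoc(dtm)
  )

  # Represent the publication date as a number of days so that it can
  # enter the prevalence formula as a numeric covariate.
  meta$date_num <- as.numeric(as.Date(meta$date))

  # Convert the dfm to the list format expected by stm
  # (elements: documents, vocab, meta).
  stm_input <- quanteda::convert(dtm, to = "stm", docvars = meta)

  stm::stm(
    documents  = stm_input$documents,
    vocab      = stm_input$vocab,
    K          = k,
    prevalence = ~ platform + date_num,  # assumed specification
    data       = stm_input$meta,
    init.type  = "Spectral",
    seed       = seed,
    verbose    = FALSE
  )
}
```

A call would look like `fit_stm_sketch(dtm, meta, k = 20)`, where `dtm` and `meta` stand for the objects loaded from the replication files; the actual object names, covariate specification, and number of topics should be taken from the R scripts in the repository.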
1.1 R packages
R 3.6.0 (R Core Team 2019) was used for all computer-based steps of the research process. The data collection was carried out with the package rvest 0.3.4 (Wickham 2019). Several packages from the tidyverse 1.2.1 (Wickham 2017) were used for data manipulation and plotting. The text corpus was prepared with quanteda 1.5.0 (Benoit et al. 2019). The topic models were fitted with stm 1.3.3 (Roberts et al. 2018) and inspected with tidytext 0.2.1 (Robinson and Silge 2019) and stminsights 0.3.0 (Schwemmer 2018). furrr 0.1.0 (Vaughan and Dancho 2018) was used for parallel computing. The documentation was created with bookdown 0.16 (Xie 2019).
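Because results can depend on package versions, it may help to compare a local installation with the versions listed above before re-running any scripts. The following sketch uses base R only; the helper name `compare_versions` is hypothetical, and the version numbers are copied from this section.

```r
# Package versions as reported in this section.
documented <- c(
  rvest = "0.3.4", tidyverse = "1.2.1", quanteda = "1.5.0",
  stm = "1.3.3", tidytext = "0.2.1", stminsights = "0.3.0",
  furrr = "0.1.0", bookdown = "0.16"
)

# Hypothetical helper: returns one row per package with the documented
# version, the locally installed version (NA if the package is not
# installed), and whether the two match exactly.
compare_versions <- function(documented) {
  installed <- vapply(names(documented), function(pkg) {
    tryCatch(
      as.character(utils::packageVersion(pkg)),
      error = function(e) NA_character_
    )
  }, character(1))

  data.frame(
    package    = names(documented),
    documented = unname(documented),
    installed  = unname(installed),
    matches    = unname(!is.na(installed) & installed == documented),
    stringsAsFactors = FALSE
  )
}

print(compare_versions(documented))

# The R version itself can be checked in the same spirit.
r_matches <- getRversion() == "3.6.0"
```

Rows with `matches == FALSE` indicate packages that are missing or installed in a different version; whether such differences affect the results is not something this check can determine.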
1.2 OSF repository
The source code to build this documentation, together with the R scripts and the replication data described in the introduction (document-term matrix and post metadata), is available at the Open Science Framework: https://osf.io/twx38/. A preprint of the article is available at https://doi.org/10.31219/osf.io/ad9h7.
References
Benoit, Kenneth, Kohei Watanabe, Haiyan Wang, Paul Nulty, Adam Obeng, Stefan Müller, Akitaka Matsuo, et al. 2019. Quanteda: Quantitative Analysis of Textual Data. https://CRAN.R-project.org/package=quanteda.
R Core Team. 2019. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/.
Roberts, Margaret, Brandon Stewart, Dustin Tingley, and Kenneth Benoit. 2018. Stm: Estimation of the Structural Topic Model. https://CRAN.R-project.org/package=stm.
Robinson, David, and Julia Silge. 2019. Tidytext: Text Mining Using ’Dplyr’, ’Ggplot2’, and Other Tidy Tools. https://CRAN.R-project.org/package=tidytext.
Schwemmer, Carsten. 2018. Stminsights: A ’Shiny’ Application for Inspecting Structural Topic Models. https://CRAN.R-project.org/package=stminsights.
Vaughan, Davis, and Matt Dancho. 2018. Furrr: Apply Mapping Functions in Parallel Using Futures. https://CRAN.R-project.org/package=furrr.
Wickham, Hadley. 2017. Tidyverse: Easily Install and Load the ’Tidyverse’. https://CRAN.R-project.org/package=tidyverse.
Wickham, Hadley. 2019. Rvest: Easily Harvest (Scrape) Web Pages. https://CRAN.R-project.org/package=rvest.
Xie, Yihui. 2019. Bookdown: Authoring Books and Technical Documents with R Markdown. https://CRAN.R-project.org/package=bookdown.
University of Hohenheim, marko.bachl@uni-hohenheim.de
HMTM Hanover, elena.link@ijk.hmtm-hannover.de