Efficiently Exploring Multilevel Data with Recursive Partitioning

Martin, Daniel

Author

Martin, Daniel, Psychology - Graduate School of Arts and Sciences, University of Virginia

Advisors

Von Oertzen, Timo, Department of Psychology, University of Virginia

Abstract

An increasing number of datasets with many participants, many variables, or both are found in education and other areas that commonly have complex, multilevel data structures. Once initial confirmatory hypotheses are exhausted, it can be difficult to determine how best to explore these datasets to discover hidden relationships that could help inform future research. Typically, exploratory data analysis in these areas is performed with the very same statistical tools as confirmatory data analysis, leading to potential methodological issues such as increased rates of false positive findings. In this dissertation, I argue that a data mining framework known as recursive partitioning offers a more efficient means of performing exploratory data analysis and of identifying variables that may have been overlooked in the confirmatory data analysis phase. By adopting such a non-parametric approach, researchers can easily identify the extent to which all variables are related to an outcome, rather than relying on null hypothesis significance tests as a strict dichotomization of whether a given variable is "important" or "unimportant." This dissertation evaluates the feasibility of using these methods in the multilevel contexts commonly found in social science research, using both Monte Carlo simulations and three applied datasets. Based on these results, a set of best practices was constructed and disseminated via a small workshop given to applied researchers. Feedback from these researchers helped lead to a publicly available tutorial and R package to assist others interested in adding this technique to their own statistical toolbox.

Degree

PHD (Doctor of Philosophy)

Keywords

random forests; data mining; exploratory data analysis

Rights

Issued Date

2015-06-12

Persistent Link

https://doi.org/10.18130/V3PS0F

Suggested Citation

Martin, Daniel. Efficiently Exploring Multilevel Data with Recursive Partitioning. University of Virginia, Psychology - Graduate School of Arts and Sciences, PHD (Doctor of Philosophy), 2015-06-12, https://doi.org/10.18130/V3PS0F.

Files

Martin_Daniel_2015.pdf
