Methodology

Simon Hoellerbauer. “A Mixture Model Approach to Assessing Measurement Error in Surveys Using Reinterviews.”* Journal of Survey Statistics and Methodology. 2023.

Researchers are often unsure about the quality of the data collected by third-party actors, such as survey firms. They are reliant on survey firms to provide them with estimates of data quality and to identify observations that are problematic, potentially because they have been falsified or poorly collected. This may be because of the inability to measure data quality effectively at scale and the difficulty with communicating which observations may be the source of measurement error. To address these issues, I propose the QualMix model, a mixture modeling approach to deriving estimates of survey data quality in situations in which two sets of responses exist for all or certain subsets of respondents. I apply this model to the context of survey reinterviews, a common form of data quality assessment used to detect falsification and data collection problems during enumeration. Through simulation based on real-world data, I demonstrate that the model successfully identifies incorrect observations and recovers latent enumerator and survey data quality. I further demonstrate the model's utility by applying it to reinterview data from a large survey fielded in Malawi, using it to identify significant variation in data quality across observations generated by different enumerators.
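
As a rough illustration of the mixture-model idea described above (this is not the QualMix model itself, whose specification is in the paper), the sketch below fits a two-component binomial mixture by EM to simulated reinterview data. Everything in it is an assumption made for illustration: the function name `fit_match_mixture`, the starting values, the simulated match rates, and the simplification that each interview is summarized by how many reinterview answers match the original answers.

```python
import random


def fit_match_mixture(matches, n_items, n_iter=200):
    """EM for a two-component binomial mixture over per-interview match counts.

    matches[i] is the number of items (out of n_items) on which the
    reinterview answer equals the original answer for interview i.
    """
    eps = 1e-6  # keeps probabilities away from 0 and 1
    # Hypothetical starting values for a "reliable" and an "unreliable" component.
    pi_good, p_good, p_bad = 0.5, 0.8, 0.4
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability that each interview is reliable.
        # The binomial coefficient cancels in the ratio, so it is omitted.
        resp = []
        for m in matches:
            like_good = pi_good * p_good**m * (1 - p_good) ** (n_items - m)
            like_bad = (1 - pi_good) * p_bad**m * (1 - p_bad) ** (n_items - m)
            resp.append(like_good / (like_good + like_bad))
        # M-step: update the mixing weight and the two match probabilities.
        total_good = max(sum(resp), eps)
        total_bad = max(len(matches) - sum(resp), eps)
        pi_good = min(max(total_good / len(matches), eps), 1 - eps)
        p_good = sum(r * m for r, m in zip(resp, matches)) / (n_items * total_good)
        p_bad = sum((1 - r) * m for r, m in zip(resp, matches)) / (n_items * total_bad)
        p_good = min(max(p_good, eps), 1 - eps)
        p_bad = min(max(p_bad, eps), 1 - eps)
    return pi_good, p_good, p_bad, resp


# Simulated data (assumed values): 80% of interviews match on about 90% of
# items, the remaining 20% match on only about 50% of items.
random.seed(0)
n_items = 20
truth = [random.random() < 0.8 for _ in range(300)]
matches = [
    sum(random.random() < (0.9 if good else 0.5) for _ in range(n_items))
    for good in truth
]

pi_good, p_good, p_bad, resp = fit_match_mixture(matches, n_items)
# Flag interviews that are more likely unreliable than reliable.
flagged = [i for i, r in enumerate(resp) if r < 0.5]
print(f"share reliable: {pi_good:.2f}, match rates: {p_good:.2f} vs {p_bad:.2f}")
print(f"flagged {len(flagged)} of {len(matches)} interviews")
```

Averaging the posterior probabilities in `resp` within each enumerator would give a crude enumerator-level quality score, which is loosely the kind of quantity the abstract refers to as latent enumerator data quality.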

Link to Paper at JSSAM | Draft Used For Publication | Replication Archive

Brandon de la Cuesta, Jim Qian, and Simon Hoellerbauer. “Enumerator Effects in Experimental Research: Causes and Consequences for Inference in Survey and Lab-in-the-Field Experiments.”*

This project investigates an often-ignored source of variation in survey and lab-in-the-field experiments: enumerators. We explore how ignoring enumerators or simply using enumerator fixed effects can lead to misleading results and affect inference. We theorize enumerators as carrying out different "versions" of experimental treatments. We develop a Bayesian model for taking enumerator effects into account when analyzing experimental and survey data.

Simon Hoellerbauer. “Modeling Civic Engagement with Organizations: A Novel Application of Conjoint Survey Experiments.”* Presented at 37th Annual Meeting of the Society for Political Methodology (Virtual), July 2020.

Often, we would like to place individuals in a latent space relative to other individuals or other fixed points within that space - politicians, tax plans, organizations, governments, etc. We then want to know how the distance between an individual and such a fixed point affects that individual's attitudes and behaviors. Often, the location of these points in latent space is influenced by a constellation of attributes - policy positions for politicians, tax rates and coverages for tax plans, structure, membership, and goals for organizations, performance in different categories for governments. This is really a **two-part process**, where closeness, decided by how individuals view an entity's traits, is the mechanism. I propose a methodological approach to studying this process: a two-part statistical model, where the first part is an IRT model and the second is limited only by the nature of the secondary outcome variable. This project involves a novel conjoint study with substantive implications for individual engagement with civil society organizations. Analysis is done in the Bayesian framework, with both parts of the model estimated together. I situate this new approach in the analysis of conjoint survey experiments, but it could be adapted to a diverse array of experimental approaches.

Working Paper | PolMeth 2020 Poster | PolMeth 2020 Slides

Simon Hoellerbauer. “How Many Is Too Many? Outcome Questions In Conjoint Survey Experiments.” Presented at 39th Annual Meeting of the Society for Political Methodology, July 2022.

More and more political scientists are using conjoint survey experiments in their work. As a result, political methodologists have begun to examine the best ways to employ them. A considerable amount of research has looked at how the number of conjoint tasks (profile-pairs seen) and the number of attributes impact data quality. However, few studies, if any, have looked at the relationship between the number of outcome questions associated with each conjoint task and data quality. While some practitioners urge caution with the number of outcome questions after each profile-pair to limit the cognitive load on the respondent, others suggest that the marginal cost of gathering additional outcomes once the task has been explained is very low. In this study, I use an experiment to test how the number of outcome questions and its interaction with the type of question - forced choice versus rating - impact data quality and consistency.

Poster

Simon Hoellerbauer and Isabel Laterzo-Tingley. “Post Hoc Synthetic Purposive Sampling for Post Hoc External Validity Assessment.” Presented at 41st Annual Meeting of the Society for Political Methodology, July 2024.

Egami and Lee (2024) propose synthetic purposive sampling (SPS) to help researchers pick sites for multi-site studies that maximize the ability to generalize to a larger population of sites. With a proactive focus, they understandably advocate that researchers use SPS during the design stage. They demonstrate how feasibility and practicality constraints can be included in SPS to reflect real-world conditions, which increases the applicability of the method. At the same time, many multi-site studies that did not use SPS for site selection and instead relied on purposive or convenience samples have already been completed. In addition, in some cases, researchers face strict constraints. For example, because researchers may only have connections in certain countries, they can only ever choose those countries. Under such conditions, how can we get a sense of the external validity of the chosen sites? In either case, we may want to evaluate the kinds of populations to which a sample could generalize and to compare the study sample to an ideal sample. In this paper, we propose and evaluate the utility of *post hoc* SPS: performing SPS on the *unconstrained* target population of all fitting valid sites after a multi-site study has been completed.

Poster

Simon Hoellerbauer. “Democracy vs Dictatorship or Something More?: Using Unsupervised Learning to Cluster Regimes.”

Political scientists categorize regimes because they believe there are important descriptive and causal differences between them; otherwise, there would be no point to the exercise. There are, however, numerous (at times conflicting, at times overlapping) categorizations of regimes used in the political science literature. Often we rely on subjective coding that is also very time-consuming. It is not clear which categorizations are more "important," in the sense that they reflect intrinsically different regime types; researchers can make subjective decisions about which "aspect" of a regime may be more important, although this may not be reflected in execution. In this project, I use the wealth of indicators available in the V-Dem project to cluster regimes via K-Means clustering and Gaussian mixture models. I find disagreement between existing categorical measures and the clusters returned by both algorithms.
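
To make the comparison between clusters and an existing categorical measure concrete, here is a minimal sketch on toy data. It makes no claim about the actual analysis: the two indicators, the three simulated groups, and the dichotomous "democracy"/"autocracy" coding are all invented for illustration (they are not V-Dem data), and only K-Means is shown, using a plain-Python Lloyd's algorithm with a farthest-point initialization chosen here for determinism.

```python
import random
from collections import Counter


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, n_iter=50):
    """Lloyd's algorithm with farthest-point initialization; returns labels."""
    # Initialization (an illustrative choice): start from the first point, then
    # repeatedly add the point farthest from all centers chosen so far.
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest center.
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster empties out
                centers[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels


# Toy data (hypothetical): three latent groups measured on two indicators.
rng = random.Random(1)
group_means = [(0.2, 0.2), (0.5, 0.5), (0.8, 0.8)]
points, groups = [], []
for g, (mx, my) in enumerate(group_means):
    for _ in range(30):
        points.append((rng.gauss(mx, 0.03), rng.gauss(my, 0.03)))
        groups.append(g)

# Hypothetical existing measure: a dichotomous coding that thresholds the
# first indicator, which splits the middle group in two.
coding = ["democracy" if p[0] >= 0.5 else "autocracy" for p in points]

clusters = kmeans(points, k=3)
# Cross-tabulate cluster membership against the existing coding.
crosstab = Counter(zip(clusters, coding))
for (cluster, label), count in sorted(crosstab.items()):
    print(f"cluster {cluster} / {label}: {count}")
```

In this toy setup, disagreement shows up as a cluster whose members are split across both categories of the existing coding. With real indicators one would also want to standardize the inputs and compare several values of `k`.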

Poster

Simon Hoellerbauer. 2018. Reconceptualizing Civil Society and its Strength. Master’s Thesis, UNC-Chapel Hill.

What is Civil Society? Can we assess how strong it is? Using the problems present conceptualizations of civil society entail as a point of departure, this work develops a definition that strips civil society of its normative assumptions and functional form and fits better with the reality we observe. Civil society can be thought of as a space between the state, the market, and the family that can be divided into different sectors based on the goals of the civil organizations that inhabit it. The strength of each sector can be assessed by gauging how cohesive civil society organizations within that sector are, how embedded they are in the social fabric of society, and how developed their bureaucratic capital is. This work then sketches out how this approach can be used to analyze civil society in the United States and Armenia. In sum, it presents the basis for a new research agenda that aims to investigate the relationship between civil society and democracy.

Thesis

*Part of dissertation