Sarah Quesen

Overview

Sarah Quesen is an expert in statistics and psychometrics with a keen interest in emerging technologies. As Director of Assessment Research and Innovation (ARI), she leverages her understanding of assessment systems to lead rigorous, transformative research and provide evidence-based technical assistance to states, districts, and commercial organizations.

Much of Quesen’s research focuses on gathering and evaluating validity evidence for proposed test score interpretations and uses. She has extensive experience in evaluating AI scoring from a psychometric perspective, ensuring comparability between machine and human scores across student populations. Her current work includes proposing a framework for the evaluation of automated scoring, evaluating assessment content from GPT models, and developing a novel method using machine learning to examine measurement invariance.

Quesen has over 25 years of experience as an educator and continues to serve on the teaching faculty in the Department of Statistics at the University of Pittsburgh. Prior to joining WestEd, she was a senior research scientist at Pearson, where she served as lead psychometrician on complex, large-scale assessment contracts. Her extensive experience in both academic and operational settings drives her commitment to disrupting outdated assessment practices.

Education

PhD in research methodology, University of Pittsburgh
MPH in public health, West Virginia University School of Medicine
BS in culture and communication, New York University

Select Publications

White, L., Nesbitt, J., Roeters-Solano, H., Quesen, S., Lottridge, S., & Lochbaum, K. (in press). Culturally responsive and sustaining approaches to scoring. In C. Evans & C. Taylor (Eds.), Culturally responsive assessment in classrooms and large-scale contexts: Theory, research, and practice. NCME.

Quesen, S., Armstrong, P., & Timberlake, A. (2024, June 24–26). Leveraging generative AI in developing assessments that reflect all learners in your state [Paper presentation]. National Conference on Student Assessment, Seattle, WA, United States.

Quesen, S., & Lane, S. (2019). Differential item functioning for accommodated students with disabilities: Effect of differences in proficiency distributions. Applied Measurement in Education, 32(4), 337–349.

Steedle, J., Quesen, S., & Boyd, A. (2017). Longitudinal study of external validity of the PARCC performance levels: Phase I report. Pearson.