New Directions: Bridging Natural Language Processing (NLP) and Survey Research at SurvAI-Day

The SurvAI Workshop is hosted by the Social Data Science Center (SoDa) and the Artificial Intelligence Interdisciplinary Institute (AIM) at the University of Maryland with support from the American Association for Public Opinion Research (AAPOR), the Washington-Baltimore Chapter of AAPOR (DC-AAPOR), and the American Statistical Association (ASA).

October 7, 2024
Workshop: 8:00 am – 5:00 pm

Samuel Riggs IV Alumni Center, Orem Alumni Hall

7801 Alumni Drive, University of Maryland, College Park, Maryland 20742

The workshop aims to strengthen the emerging connection between NLP and survey researchers. Survey researchers and data collectors are increasingly recognizing the need to integrate Large Language Models (LLMs) into their workflows. At the same time, NLP researchers see the importance of grounding their models with better data for accurate public representation as technology is becoming increasingly human-facing. This event seeks to foster joint work between the survey research and NLP communities, identify gaps and needs, and initiate a closer relationship between these interconnected fields.

Keynote Speakers

Barbara Plank

Professor and Co-director of the Center for Information and Language Processing at LMU Munich
Professor, IT University of Copenhagen
Vice-president elect, Association for Computational Linguistics (ACL)

Her research lab (Munich AI and NLP lab, MaiNLP) focuses on robust machine learning for Natural Language Processing with an emphasis on human-facing and data-centric approaches. Her research has been funded by distinguished grants, including an Amazon Research Award (2018), the Danish Research Council (Sapere Aude Research Leader Grant, 2020-2024), and the European Research Council (ERC Consolidator Grant, 2022-2027). Barbara is a Scholar of ELLIS (the European Laboratory for Learning and Intelligent Systems) and regularly serves on international committees. Currently, she is Vice President Elect of the Association for Computational Linguistics (ACL).

Frauke Kreuter

Co-Director, Social Data Science Center (SoDa)
Professor, Joint Program in Survey Methodology, University of Maryland
Chair of Statistics and Data Science in Social Sciences and the Humanities
Ludwig-Maximilians-University of Munich
President, American Association for Public Opinion Research (AAPOR)

She is an elected fellow of the American Statistical Association and the 2020 recipient of the Warren Mitofsky Innovators Award of the American Association for Public Opinion Research. In addition to her academic work, Dr. Kreuter is the Founder of the International Program for Survey and Data Science, developed in response to the increasing demand from researchers and practitioners for the appropriate methods and right tools to face a changing data environment; Co-Founder of the Coleridge Initiative (coleridgeinitiative.org), whose goal is to accelerate data-driven research and policy around human beings and their interactions for program management, policy development, and scholarly purposes by enabling efficient, effective, and secure access to sensitive data about society and the economy; and Co-Founder of the German-language podcast Dig Deep.

Panelists / Moderators / Short Course Instructors

Please be sure to check back as the list will be updated as confirmations come in.

Trent Buskirk

Professor, School of Data Science
Old Dominion University

Trent D. Buskirk, Ph.D., has recently joined the new School of Data Science at Old Dominion University as one of several founding faculty members. Prior to this appointment, Trent was the Novak Family Distinguished Professor of Data Science and outgoing Chair of the Applied Statistics and Operations Research Department at Bowling Green State University. Dr. Buskirk is a Fellow of the American Statistical Association. His research interests include big data quality; recruitment methods through social media; the use of big data and machine learning methods for health, social, and survey science design and analysis; mobile and smartphone survey designs; methods for calibrating and weighting non-probability samples; fairness in AI models; and interpretable ML methods. When Trent is not geeking out over data science, big data, or survey methodology, you can find him playing a competitive game of Pickleball!

Stephanie Eckman

Stephanie Eckman has a PhD in Methodology and Statistics from the University of Maryland. She has collected survey data around the world for government, nonprofits, and industry. Her current research interest is in applying the lessons from surveys to collect more accurate and efficient training data for AI/ML models. Her work on this topic has been published at EMNLP and ICML.

Anna-Carolina Haensch

Assistant Research Professor, Joint Program in Survey Methodology, University of Maryland
Senior Researcher, Institute for Statistics, Ludwig-Maximilians-University of Munich

Anna-Carolina (“Caro”) Haensch is an assistant research professor at the Joint Program in Survey Methodology at the University of Maryland and a Senior Researcher at the Institute for Statistics at LMU Munich, Germany. She is interested in Synthetic Data, Multiple Imputation, Statistics and Data Science training and enjoys teaching quantitative courses.

Tobias Holtdirk

Doctoral Researcher
Leibniz Institute for the Social Sciences – Cologne

Tobias Holtdirk is a doctoral researcher in the Computational Social Science department at GESIS – Leibniz Institute for the Social Sciences in Cologne. Prior to joining GESIS, he studied at RWTH Aachen University, where he graduated with a Bachelor’s degree in Computer Science and a Master’s degree in Computational Social Systems. His current research projects focus on applying Large Language Models (LLMs) to political research, including fine-tuning LLMs to predict voting behaviour based on survey data and conducting an LLM-based analysis of German parliamentary debates.

David Jurgens

Associate Professor in the School of Information
Associate Professor in the Department of Computer Science and Engineering
University of Michigan

He obtained his PhD in Computer Science from the University of California, Los Angeles. His research centers on language technologies for social understanding and on behavioral analysis through language.

His work has been recognized by the Cozzarelli Prize from the National Academy of Sciences, the Cialdini Prize from the Society for Personality and Social Psychology, multiple best paper awards and nominations (e.g., ACL, ICWSM), and an NSF CAREER award.

Claire Kelley

Senior Data Scientist, Co-Director for Data Science
Child Trends

Claire Kelley is a senior data scientist and co-director for data science at Child Trends. Her work focuses on the applications of data science and AI techniques to social science problems, particularly those in the domains of education and health. Her current work includes the use of AI to develop customized interactive tools, experiments with natural language processing approaches to qualitative data analysis, and research on how AI impacts educators and students in K-12 classrooms.

Sarah Kelley

Co-Director for Data Science
Child Trends

Sarah Kelley is co-director of data science at Child Trends. Sarah is a full-stack data scientist with primary research interests in generative AI, natural language processing, and precision social science. At Child Trends she conducts research focused on the wellbeing of children and youth through the lens of AI and data science techniques. She is currently working on a wide range of projects, including developing interactive AI-powered explanations of risk scores in the juvenile justice system, conducting AI-supported meta-analyses, and experimenting with AI for text summarization at scale.

Max Melchior Lang

Ph.D. Student
University of Oxford

Max Lang is a PhD student in Population Health at the Big Data Institute, University of Oxford. His research focuses on modeling diseases in multimorbid settings and informing interventions in low-income countries. He also develops conversational AI for research and business applications, rethinking qualitative data collection through chat interfaces and interviewing agents that can conduct human-like interviews.

Joshua Y. Lerner

Research Methodologist
NORC at the University of Chicago

Josh Lerner is a Research Methodologist at NORC at the University of Chicago, where he works on problems related to the intersection of AI, NLP, causal inference, and political/social science research. He is currently working on exploring the ways AI and NLP tools can be used to improve aspects of the research pipeline and how this intersects with both survey research and quantitative program evaluation.

Bolei Ma

Ph.D. Student
Ludwig-Maximilians-University of Munich

Bolei Ma is a PhD student at the Social Data Science and AI Lab at LMU Munich. He holds an M.A. in Linguistics and an M.Sc. in Computational Linguistics. His research interests focus on human-centric AI and the social implications of NLP technologies.

Philipp Mondorf

Ph.D. Student
Ludwig-Maximilians-University of Munich

Philipp Mondorf is a PhD student at the Munich AI and NLP lab, focusing on reasoning with humans and large language models. His research lies at the intersection of artificial intelligence and cognitive science, with an emphasis on applying insights from human reasoning to enhance the development and understanding of LLM-based reasoning systems.

Vinodkumar Prabhakaran

Staff Research Scientist
Responsible AI and Human Centered Technologies
Google

Dr. Vinodkumar Prabhakaran is a staff research scientist in Google’s Responsible AI and Human Centered Technologies organization, and co-leads the interdisciplinary Technology, AI, Society and Culture (TASC) team. Before Google, he was a postdoc at Stanford University, and obtained his PhD from Columbia University. His prior research focused on building scalable ways of using language technologies to identify and address large-scale societal issues such as racial disparities in policing, workplace incivility, and online abuse. He has published over 50 articles in top-tier venues such as PNAS, ACL, TACL, NAACL, EMNLP, and FAccT.

Stanley Presser

Distinguished University Professor
Department of Sociology
Joint Program in Survey Methodology
University of Maryland

Stanley Presser is interested in the interface between social psychology and survey measurement. His research focuses on questionnaire design and testing, measurement error, survey nonresponse, and ethical issues stemming from the use of human subjects. His books include Questions and Answers in Attitude Surveys (with Howard Schuman), Survey Questions (with Jean Converse), and Methods for Testing and Evaluating Survey Questionnaires (chief editor). In addition to teaching in the Sociology Department, he teaches in the Joint Program in Survey Methodology, which he founded in 1992 with colleagues at the University of Michigan and Westat, Inc. He has served as editor of Public Opinion Quarterly, was president of the American Association for Public Opinion Research, and is an elected fellow of the American Statistical Association and of the American Association for the Advancement of Science. Presser was director of the Maryland Survey Research Center from 1989 to 2000.

Philip Resnik

MPower Professor, Institute for Advanced Computer Studies and Department of Linguistics
University of Maryland

Philip Resnik is a Professor at the University of Maryland in the Department of Linguistics and the Institute for Advanced Computer Studies. He earned his bachelor’s in Computer Science at Harvard and his PhD in Computer and Information Science at the University of Pennsylvania, and does research in computational linguistics. Prior to joining UMD, he was an associate scientist at BBN, a graduate summer intern at IBM T.J. Watson Research Center (subsequently awarded an IBM Graduate Fellowship) while at UPenn, and a research scientist at Sun Microsystems Laboratories. In 2020 he was designated a Fellow of the Association for Computational Linguistics.

David Rothschild

Economist
Microsoft Research

David Rothschild is an economist at Microsoft Research. He has a Ph.D. in applied economics from the Wharton School of Business at the University of Pennsylvania. He has written extensively in both the academic and popular press. His work pushes the boundaries on varying data and methods: polling, prediction markets, social media and online data, and large behavioral and administrative data. His work focuses on solving practical and interesting questions, including mapping and updating public opinion, the market for news, the effect of advertising, finance, and an economist’s take on public policy.

Ramya Korlakai Vinayak

Assistant Professor, Department of ECE
Affiliated Faculty, Department of Computer Science and Department of Statistics
University of Wisconsin – Madison

Ramya Korlakai Vinayak is an assistant professor in the Dept. of ECE and affiliated faculty in the Dept. of Computer Science and the Dept. of Statistics at UW-Madison. Her research interests span the areas of machine learning, statistical inference, and crowdsourcing, with a focus on preference learning and alignment under heterogeneity, reliable and efficient dataset creation, and human-in-the-loop systems. Her work aims to address theoretical and practical challenges that arise when learning from heterogeneous societal data. Prior to joining UW-Madison, Ramya was a postdoctoral researcher in the Paul G. Allen School of Computer Science and Engineering at the University of Washington. She received her Ph.D. in Electrical Engineering from Caltech. She obtained her Master’s from Caltech and her Bachelor’s from IIT Madras. She is a recipient of the Schlumberger Foundation Faculty for the Future fellowship from 2013-15, and was an invited participant at the Rising Stars in EECS workshop in 2019. She received an NSF CAREER Award in 2023.

Tyler Waite

Advisory Data Scientist
AI, Automation & Data Platform
IBM

Tyler Waite is an Advisory Data Scientist with the AI, Automation & Data Platform team at IBM. Tyler has a B.S. in Cognitive Psychology from the University of Utah and an M.A. in Human Factors from the University of Illinois, and studied Information Science and UX Research (ABD) at Indiana University. Tyler’s main focus at IBM is exploring new ways to use Python for survey data analysis and creating dynamic interactive dashboards of the findings using IBM’s Cloud Pak for Data dashboarding tool.

Xinpeng Wang

Ph.D. Student
Ludwig-Maximilians-University of Munich

Xinpeng Wang is a PhD student at the Munich AI and NLP (MaiNLP) lab, Ludwig-Maximilians-University of Munich. He obtained his M.Sc. in Robotics, Cognition, Intelligence at the Technical University of Munich. His research focuses on Human-Centric NLP, Alignment, and LLM Evaluation.

Yongwei Yang

Researcher
Google

Yongwei Yang is a researcher at Google. His recent work and research involve user experience with Gemini, public opinions about AI, research methods and processes involving AI, and sentiment and behavioral signals for business insights. He also conducts foundational methodological research on surveys, psychological measurement, and behavioral signals. He holds a Ph.D. in Quantitative and Psychometric Methods from the University of Nebraska-Lincoln.