April 25 – SoDa Seed Grant Series: How Can Large Language Models Help Us Identify and Use Constructs that We Can Trust?

Date: Thursday, April 25, 2024
Time: 12:00pm – 1:00pm
Location: Zoom

Abstract:

The idea of a construct is central in the psychological and social sciences: constructs are abstract categories like empathy, misinformation, or benefits of social interaction that are operationalized in order to make them measurable. Social scientists spend an enormous amount of time and care in thinking about constructs; they are the vocabulary over which theories are defined. This stands in striking contrast to a great deal of current computational research in AI and NLP, where theory tends to be secondary and most work utilizes directly observable behaviors, simplistic operational definitions, or intuitive but poorly-defined proxy relationships. The gap between social science needs and computational practice is deeply problematic: computer scientists often blast forward with highly scalable methods without establishing careful connections to real-world problems and research questions, while social scientists often expend effort on traditional manual analysis out of a distrust for automated solutions or, conversely, they use the output of computational systems uncritically as if what they produce is known to be correct.

This research project is bridging that gap by advancing a recent AI approach called few-shot learning, bringing together cutting-edge machine learning with human expertise in order to identify and work with constructs in a way that is both efficient and trustworthy. Paired with that methodological goal is the use of these methods to support substantive research in computational political science, focused on the measurement of politicized topics, and in mental health, focused on measurement of constructs related to suicidal risk.

SoDa Seed Grant Award Recipient: Philip Resnik, Ph.D., MPower Professor, Institute for Advanced Computer Studies and Department of Linguistics, University of Maryland

Philip Resnik is a Professor at the University of Maryland in the Department of Linguistics and the Institute for Advanced Computer Studies. He earned his bachelor’s in Computer Science at Harvard and his PhD in Computer and Information Science at the University of Pennsylvania, and does research in computational linguistics. Prior to joining UMD, he was an associate scientist at BBN, a graduate summer intern at IBM T.J. Watson Research Center (subsequently awarded an IBM Graduate Fellowship) while at UPenn, and a research scientist at Sun Microsystems Laboratories. In 2020 he was designated a Fellow of the Association for Computational Linguistics. Philip’s most recent research has focused on two areas. One is the computational cognitive neuroscience of language, where he has been using computational modeling in connection with brain imaging to look at the role of context and predictive processing during online language comprehension. The other is computational social science, with an emphasis on connecting the signal available in people’s language use with underlying mental state. This work has applications in computational political science, particularly in connection with ideology and framing, and in mental health, focusing on the ways that linguistic behavior may help to identify and monitor depression, schizophrenia, and suicidality. Philip is a scientific advisor for two non-profits: the Coleridge Initiative (which supports governmental organizations in using data effectively for public decision-making) and NORC at the University of Chicago (a non-partisan, independent social research organization). In entrepreneurial life he was a technical co-founder of CodeRyte (NLP for electronic health records, acquired by 3M in 2012), and is an advisor to FiscalNote (machine learning and analytics for government relations, went public in 2022) and Trustible (a startup supporting compliance with AI regulations).

Alexander Hoyle, PhD Candidate, Computational Linguistics and Information Processing Lab, University of Maryland

Alexander Hoyle is a fifth-year PhD student working at the University of Maryland’s Computational Linguistics and Information Processing lab, advised by Philip Resnik. Hoyle’s research is oriented around the development and evaluation of methods for computational social science. Previous internships were held in the FATE group at Microsoft Research and AllenNLP at AI2. Hoyle received a master’s in computational statistics and machine learning at University College London. Prior to graduate school, Hoyle was a Research Analyst at The Brattle Group in Cambridge, Massachusetts, and helped conduct research on New York City public housing for the U.S. Department of Justice; these efforts eventually led to a $2.2 billion settlement to improve conditions.

Industry Expert Commentator: Andrew Stavisky, Ph.D., Assistant Director, U.S. Government Accountability Office (GAO)

Andrew Stavisky, PhD, is currently an Assistant Director in the Applied Research and Methods group at the US Government Accountability Office (GAO), the non-partisan, independent investigative arm of Congress, where he advises on policy research and leads GAO’s qualitative research practice. His policy areas focus on national security, cybersecurity, telecommunications, and emerging technology. He is one of the agency’s leaders on AI policy. Andrew collaborates with GAO’s innovation lab on developing use cases for its internal Generative AI and Topic Modeling platforms. He has co-authored dozens of emerging technology and AI-related Congressional Reports and Technology Assessments. Additionally, he has moderated joint GAO/National Academy of Science expert policy panels (2–3-day sessions of 15–20+ experts) on emerging technology topics such as quantum computing, fusion energy, the use of AI in drug development, the use of AI in enhancing climate disaster modeling, and most recently the human and environmental impacts of Generative AI. Andrew writes and speaks about the nexus of AI, policy, and qualitative research and the importance of humans within the AI process. He was a 2023 Wilson Center Congressional AI Fellow. Andrew was formerly the Founder and CEO of Thematically, a text analytics company that developed an AI-driven human-in-the-loop SaaS platform (pre-ChatGPT) that quickly and efficiently turns unstructured text into meaningful thematic categories. Thematically was acquired by SoloSegment in February 2020.

SoDa Seed Grants: The projects under this initiative may address any societal challenge that affects a large number of people, including but not limited to health, public safety, justice, race, gender, education, employment, transit, and political representation. The goal of these seed grants is to encourage faculty to develop collaborative projects that stimulate the advancement of new ideas that can build the university’s expertise toward a national reputation in the broad area of social data science. The projects blend the development or use of innovative data science methods or new measurements, the advancement of scholarship within or across disciplines, and progress in addressing a societal challenge.