University of Maryland

Social Data Science and Collective Resilience: Opportunities and Challenges

Social data science can both strengthen and damage the fabric of civil society. Predictive models allow us to respond as situations develop and either manage or magnify them. Social data enables quick responses to unrest and conflict, active engagement with developing narratives, and identification of individuals and […]

The 2021 Joint Program in Survey Methodology Distinguished Lecture

Friday, April 30, 2021, 1:00pm – 3:00pm EST. Data Defect Correlation: A Unified Quality Metric for Probabilistic and Non-probabilistic Samples. Xiao-Li Meng, Ph.D., Whipple V. N. Jones Professor of Statistics and Founding Editor-in-Chief of Harvard Data Science Review. Xiao-Li Meng, the Whipple V. N. Jones Professor of Statistics, and the Founding Editor-in-Chief of Harvard Data Science Review, is well known […]


SoDa Research Roundtables

Monday, March 22, 2021, 1:00pm EST. Talk 1: Offering Participant Control and Tailored Introductions in Smartphone-based Passive Data Collections to Avoid Refusals. Presented by doctoral student Alex Brown Breslin, JPSM. Talk 2: Trends in Trust: Measuring Americans’ Expectations for Trustworthy Research Use of Social Media Data. Presented by Sarah Gilbert, Postdoc, iSchool. Upcoming Roundtables: April 26, 1 pm, Youngho Kim, […]

ASA DataFest 2021

Event Date and Time: Friday, April 9, 2021 – 12:00 am to Sunday, April 11, 2021 – 11:59 pm. Location: Washington DC. About DataFest: The 2021 DC DataFest is a 48-hour competition where teams of undergraduate students from the DC metro area compete virtually to analyze rich, complex data sets from a major organization. Students are […]

How Do We Know That Our Statistical Methods Should Work? Benchmarks, Plasmodes, and Statistical Mediation Analysis

MONDAY SYMPOSIUM IN MEASUREMENT AND STATISTICS (MSMS). UNIVERSITY OF MARYLAND together with OHIO STATE UNIVERSITY, UNIVERSITY OF NORTH CAROLINA AT CHAPEL HILL*, UNIVERSITY OF NOTRE DAME (* Organizer of this talk). This presentation describes a benchmark method to validate statistical methods from the analysis of data on a known or established empirical effect. There are aspects to benchmark validation […]

Who Should Stop Unethical A.I.?

SoDa Associate and UMD Faculty Katie Shilton quoted in the New Yorker. In computer science, the main outlets for peer-reviewed research are not journals but conferences, where accepted papers are presented in the form of talks or posters. In June, 2019, at a large artificial-intelligence conference in Long Beach, California, called Computer Vision and Pattern […]

Testing Remote Recruitment in a Television Measurement Panel (Feb 10, 2021)

Mixed-mode contact and recruitment approaches have become increasingly critical for reaching people to participate in surveys and research panels, improving their likelihood of response and, often, the representation of the sample by bringing in groups that are traditionally underrepresented. Nielsen has designed a phased, sequential, multi-mode methodology to recruit homes through a combination of web, phone, and in-person/proximity methods for its core TV panels, which have relied exclusively on in-person recruitment. This work has been accelerated by the COVID-19 pandemic. Nielsen will present the results of a “push to web” recruitment test conducted in five markets from October 2020 to January 2021.

The Nature and Impact of Hidden Data Errors on Information Risk and Data Science

Information Risk is an important field that encompasses multiple existing disciplines and overcomes the boundaries among them that have impeded knowledge sharing and effective management of integrated business projects. These projects often span multiple organizational groups and use different structured frameworks for information, data, computing, and security management. The practical realities intrinsic to performing this work lead to gaps in complying with regulations, ensuring secure operations, satisfying auditors, and even meeting program objectives for data analysis and business uses. This talk describes how deeply embedded data disparities that remain hidden to typical data methods lead to high error rates in project results. Lessons learned from assessing and correcting these situations are presented, with examples of the problems and methods to detect and fix them.

Distance in Spatial Analysis: Challenges Related to Spatial Data Aggregation, Scale, and Computation

Distance is one of the most critical concepts in geography and has been widely used to quantify spatial separation between geographical entities. While measuring the distance between two points is straightforward, assessing the spatial separation between non-point objects can be challenging. This study investigates distance measurement between a location (point) and an area (polygon).
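
To make the point-to-polygon problem concrete, here is a minimal sketch in plain Python. The abstract does not say which distance measures the study compares, so the two shown here (distance to the nearest point of the polygon, and distance to the polygon's centroid) are assumptions chosen for illustration, as are all function names. Coordinates are treated as planar, and the polygon is assumed to be simple and non-degenerate.

```python
import math

def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the line segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:  # degenerate segment: a and b coincide
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp the projection to its endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def point_in_polygon(p, polygon):
    """Ray-casting test: True if p lies inside the polygon."""
    px, py = p
    inside = False
    n = len(polygon)
    for i in range(n):
        (x1, y1), (x2, y2) = polygon[i], polygon[(i + 1) % n]
        if (y1 > py) != (y2 > py):  # edge straddles the horizontal ray
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if px < x_cross:
                inside = not inside
    return inside

def distance_to_polygon(p, polygon):
    """Distance to the nearest point of the polygon (0 if p is inside)."""
    if point_in_polygon(p, polygon):
        return 0.0
    n = len(polygon)
    return min(
        point_segment_distance(p, polygon[i], polygon[(i + 1) % n])
        for i in range(n)
    )

def polygon_centroid(polygon):
    """Area-weighted centroid via the shoelace formula."""
    area2 = cx = cy = 0.0
    n = len(polygon)
    for i in range(n):
        (x1, y1), (x2, y2) = polygon[i], polygon[(i + 1) % n]
        cross = x1 * y2 - x2 * y1
        area2 += cross
        cx += (x1 + x2) * cross
        cy += (y1 + y2) * cross
    return cx / (3.0 * area2), cy / (3.0 * area2)

def distance_to_centroid(p, polygon):
    """Distance from p to the polygon's centroid."""
    cx, cy = polygon_centroid(polygon)
    return math.hypot(p[0] - cx, p[1] - cy)

# Hypothetical example: a 4 x 2 rectangle and a point to its east.
rect = [(0.0, 0.0), (4.0, 0.0), (4.0, 2.0), (0.0, 2.0)]
point = (6.0, 1.0)
nearest = distance_to_polygon(point, rect)
to_centroid = distance_to_centroid(point, rect)
```

For the same point and area, the two measures give different answers, and the gap grows with the size of the polygon, which is one way aggregation and scale enter the problem.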

Bias Propensity to Inform Responsive and Adaptive Survey Design

Responsive and adaptive survey designs can be used to reduce the risk of nonresponse bias through data collection. In responsive design, different protocols that appeal to prior nonrespondents can be introduced in phases. In adaptive survey design, particular nonrespondents can be targeted in these subsequent phases, based on predefined criteria. In the case of nonresponse bias, the criteria can be propensity models. Key, however, is the specification of these models.
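
As a rough illustration of targeting nonrespondents by a predefined criterion, the sketch below selects cases for a follow-up phase from propensity scores. Everything in it is hypothetical: the `Case` fields, the scores, the threshold, and the rule of targeting cases below the threshold are assumptions, and the abstract does not describe how the talk's propensity models are specified or applied.

```python
from dataclasses import dataclass

@dataclass
class Case:
    case_id: str
    responded: bool
    propensity: float  # hypothetical model-based estimate in [0, 1]

def select_for_followup(cases, threshold):
    """Return IDs of nonrespondents whose propensity is below the threshold.

    The direction of the rule (targeting low-propensity cases) is an
    assumption made for this sketch, not a rule taken from the talk.
    """
    return [
        c.case_id
        for c in cases
        if not c.responded and c.propensity < threshold
    ]

# Hypothetical phase-1 outcomes and propensity estimates.
phase_one = [
    Case("A", responded=True, propensity=0.80),
    Case("B", responded=False, propensity=0.15),
    Case("C", responded=False, propensity=0.55),
    Case("D", responded=False, propensity=0.25),
]
targets = select_for_followup(phase_one, threshold=0.30)
```

The selection step itself is simple; as the abstract stresses, the result depends on how the underlying propensity model is specified.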