By Dan Meisler
The power of big data in areas like business analytics and advertising is well established and burgeoning, but a growing movement within data science is exploring applications that benefit society as a whole — not just a single company or client.
In recent years, new data competitions, fellowships and academic programs have sprung up with the intent to pursue Data Science for Social Good (DSSG).
More than 100 researchers at the University of Michigan, in fields ranging from public health and biostatistics to social work and education, gathered this month to share their work in this area at the second annual Data for Public Good Symposium.
“We tend to think of volunteering as something we do for a day, but this is a way someone can do something philanthropic for an extended period of time while using their expertise,” said Steve Salerno, U-M doctoral student in biostatistics and co-president of the student organization Statistics in the Community (STATCOM).
The research featured at the February 19 symposium is a necessary counterbalance to the power big data can provide those with the resources to exploit it, said H.V. Jagadish, the Bernard A. Galler Collegiate Professor of Electrical Engineering and Computer Science and newly appointed director of the Michigan Institute for Data Science (MIDAS).
“A major concern in a data-driven world is information asymmetry,” Jagadish said. “The underprivileged individual has very little information as compared to the people, companies and agencies they interact with on a regular basis. The value of information grows more than linearly with size, so it behooves us as data scientists to help ordinary citizens and the less fortunate among us. I am proud of the work our researchers are doing in this regard.”
Below are a few examples of the projects presented at the symposium, which was sponsored by STATCOM, MIDAS, the Center for Education Design, Evaluation, and Research, and the Community Technical Assistance Collaborative.
Partners for Preschool
Using data from Boston Public Schools (BPS), researchers from the U-M School of Education, Harvard Graduate School of Education and the nonprofit MDRC assessed whether parents’ at-home math or language learning activities with their preschool children translate to educational gains during the first year of preschool.
The sample included more than 300 children who participated in a free preschool program, with parents completing a survey on literacy, language and math practices. Trained research staff also assessed the children’s language and math skills.
Researchers then used multilevel models to measure the relationship between at-home instruction and children’s language and math gains in preschool, controlling for key variables like pre-test scores. The study also assessed whether parents were more likely to engage in learning activities that support unconstrained skills, such as vocabulary and problem solving, or constrained skills like letter knowledge and counting, and which type of activity more strongly predicts children’s learning gains.
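As a rough illustration of what such a multilevel model can look like, one common two-level specification is sketched below. The nesting of children within classrooms, the single home-activity predictor and every symbol are assumptions made for illustration; the study's actual specification is not given here.

```latex
\begin{aligned}
y_{ij} &= \beta_0 + \beta_1\,\mathrm{HomeActivity}_{ij} + \beta_2\,\mathrm{PreTest}_{ij} + u_j + \varepsilon_{ij},\\
u_j &\sim \mathcal{N}(0, \sigma_u^2), \qquad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma_\varepsilon^2)
\end{aligned}
```

In this sketch, y_ij is the end-of-year score of child i in classroom j, beta_1 captures the association between at-home activities and gains once the pre-test score is held constant, and u_j absorbs differences shared by children in the same classroom.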
The analysis found that unconstrained math activities were practiced by parents least often, while unconstrained language activities were practiced by parents most often. The study also found that unconstrained language and math activities were both predictive of educational gains in the first year of preschool, while constrained activities were not.
- Project Title: Partners for Preschool: The Added Value of Learning Activities at Home During the Preschool Year
- Research Team: Meghan McCormick, MDRC; Amanda Ketner, School of Education; Christina Weiland, School of Education; JoAnn Hsueh, MDRC; Catherine Snow, Harvard Graduate School of Education; Jason Sachs, Boston Public Schools
Networks of Influence
Using theories and tools from graph theory, a research team from the School of Information and Ross School of Business evaluated the relationships between U.S. members of Congress and the people they retweeted on Twitter.
Using data from 2017, the group analyzed retweets by Congress members, building in measures that detect Twitter accounts’ influence on each other’s relationships and behaviors.
The study discovered two non-overlapping communities of Congress members retweeting each other that map almost exactly onto partisan divisions in Congress (i.e., one community consists principally of Democrats and the other of Republicans). With a contemporary dataset, this work tests and validates previous studies that found similar partisan communities with political blogosphere data.
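To give a sense of how communities can be found in a retweet network, here is a minimal sketch of greedy modularity merging, one standard graph-theory approach. It is not necessarily the method the research team used. The account names, the toy retweet list and the function name `merge_communities` are hypothetical, and retweets are treated as undirected, unweighted links for simplicity.

```python
from itertools import combinations

def merge_communities(retweets):
    """Greedy modularity merging on an undirected, unweighted graph.

    Starts with every account in its own community and repeatedly merges
    the pair of communities whose merge raises modularity the most,
    stopping when no merge gives a positive gain.
    """
    edges = {frozenset(pair) for pair in retweets if pair[0] != pair[1]}
    m = len(edges)
    nodes = sorted({n for e in edges for n in e})
    degree = {n: 0 for n in nodes}
    for e in edges:
        for n in e:
            degree[n] += 1

    communities = [{n} for n in nodes]
    while len(communities) > 1:
        best_gain, best_pair = 0.0, None
        for i, j in combinations(range(len(communities)), 2):
            ci, cj = communities[i], communities[j]
            # Edges with one endpoint in each community.
            between = sum(1 for e in edges if len(e & ci) == 1 and len(e & cj) == 1)
            d_i = sum(degree[n] for n in ci)
            d_j = sum(degree[n] for n in cj)
            # Change in modularity if ci and cj were merged.
            gain = between / m - d_i * d_j / (2 * m * m)
            if gain > best_gain:
                best_gain, best_pair = gain, (i, j)
        if best_pair is None:
            break
        i, j = best_pair
        communities[i] |= communities[j]
        del communities[j]
    return communities

# Hypothetical toy data: two tightly connected groups joined by one retweet.
DEMS = ["dem_a", "dem_b", "dem_c", "dem_d"]
REPS = ["rep_a", "rep_b", "rep_c", "rep_d"]
EXAMPLE_RETWEETS = (
    list(combinations(DEMS, 2))
    + list(combinations(REPS, 2))
    + [("dem_d", "rep_a")]
)

if __name__ == "__main__":
    for community in merge_communities(EXAMPLE_RETWEETS):
        print(sorted(community))
```

On this toy input the single cross-group retweet is not enough to justify merging the two groups, which is the same pattern the study reports at the scale of Congress. The pairwise search is slow on large graphs, so a real analysis would rely on an optimized graph library.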
Using natural language processing techniques, the research group also found the non-Congressional accounts that members of Congress most often retweeted fall principally into the categories of local news organizations, local public service organizations and law enforcement. Each of these groups, researchers said, may tell us something about narrative influence on members of Congress’ public personae, or how members use Twitter to craft aspects of their public personae.
Such findings are important to consider, researchers said, in today’s political climate, in which questions about the way our democracy functions and who has power to shape political discourse are vital.
- Project Title: Understanding Networks of Influence on U.S. Congressional Members’ Public Personae on Twitter
- Research Team: Angela Schöpke, School of Information; Chris Bredernitz, School of Information; Caroline Hodge, Ross School of Business and School of Information
Reaching the Food-Insecure
Food For Thought, a nonprofit in Toledo that operates several mobile food pantries, recently partnered with U-M researchers from Statistics in the Community to optimize the location of its facilities throughout the city.
With the goal of finding the most efficient placement of food pantries, U-M researchers analyzed the locations of pantries to identify ways to serve those most in need (as determined by poverty rate, public benefit eligibility rate and incidence of disease related to food scarcity). Increasing the number of families served and decreasing travel times to and from the mobile pantries were also key objectives.
Researchers built an optimization model from datasets that included household locations, pantry locations and pantry visits per day.
Their findings led to recommendations for better food pantry locations across the city, as well as a 10-day schedule for the placements.
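The team's model is not described in detail here, but the general idea of location optimization can be sketched with a simple greedy heuristic: pick sites one at a time, each time choosing the candidate that most reduces the need-weighted distance from households to their nearest pantry. Everything in this sketch is an assumption for illustration, including the coordinates, the need weights, the straight-line distance and the function names.

```python
from math import hypot

def weighted_cost(households, sites):
    """Sum of need weight times distance to the nearest chosen site."""
    return sum(
        need * min(hypot(x - sx, y - sy) for sx, sy in sites)
        for (x, y, need) in households
    )

def greedy_sites(households, candidates, k):
    """Choose up to k candidate sites, one at a time, by lowest resulting cost.

    This is a heuristic: it is fast and simple but is not guaranteed to
    find the best possible set of sites.
    """
    chosen = []
    remaining = list(candidates)
    for _ in range(min(k, len(remaining))):
        best = min(remaining, key=lambda s: weighted_cost(households, chosen + [s]))
        chosen.append(best)
        remaining.remove(best)
    return chosen

# Hypothetical data: (x, y, need weight) for each household cluster.
EXAMPLE_HOUSEHOLDS = [(0, 0, 1), (0, 1, 1), (10, 0, 3), (10, 1, 1)]
EXAMPLE_CANDIDATES = [(0, 0), (5, 0), (10, 0)]

if __name__ == "__main__":
    print(greedy_sites(EXAMPLE_HOUSEHOLDS, EXAMPLE_CANDIDATES, k=2))
```

In this toy case the highest-need cluster draws the first site, and the second site goes to the households left farthest away. A production model would more likely use travel times, pantry capacity and an exact solver, and would add the scheduling dimension reflected in the 10-day plan.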
- Project Title: Optimization of Food Pantry Locations to Address Food Scarcity in Toledo
- Research Team: Sharanya Chandran, Biostatistics; Jie Ma, Biostatistics; Emily Morris, Biostatistics; Jeremy Pasteris, Biostatistics; Ece Sanci, Industrial and Operations Engineering; Faculty Adviser: Veronica Berrocal, Associate Professor, Biostatistics