Mining Social Media to Analyze Change
From COVID-19 to Black Lives Matter, McCourt researcher Lisa Singh discovers substance in bits of information.
Being on social media can feel like falling down a rabbit hole into a complex and confusing world. Lisa Singh, research professor in McCourt’s Massive Data Institute (MDI) and professor of computer science at Georgetown, thrives on making sense of that world by finding meaning in its millions of messages.
Singh, who specializes in what she calls “data-centric computing,” gathers and analyzes information posted online to help find solutions for public policy concerns. Her research work includes building methodologies and algorithms to mine data from the barrage of messages, photos, videos and links on Twitter, while also preserving privacy. She applies computer science methods to identify powerful signals from data that can be analyzed to understand major events, social movements and activism.
Singh’s recent research explores insights from social media about the Black Lives Matter and MeToo movements and the COVID-19 global pandemic. She also has studied forced migration in Iraq. She began teaching at Georgetown in 2002, joined MDI in 2018 and is a co-founder of GU Women Coders and a member of the GU Taskforce for Gender Equity.
Much of her ongoing work draws on data gleaned from Twitter and analyzed through machine-learning computer techniques. It looks at tweets containing hashtags such as #BlackLivesMatter or #COVID19, as well as themes, frequently mentioned words, myths and the quality of linked domains. Such content offers a glimpse into what people are thinking and talking about — and how online interactions and information-sharing may change over time.
Singh readily points out that social media is not perfect for viewing the development of movements and impact of events. “It’s a biased lens in many ways and very noisy in all ways,” she says. “However, it allows us to understand attitudes, beliefs, behaviors and opinions in a way that we’ve not been able to in the past. That’s why it’s valuable, with the caveat that we know it’s not a representative sample.”
Evaluating COVID-19 Conversations
As the coronavirus pandemic grew in early 2020, Singh began studying how social media might reflect where the virus was appearing and the quality of information about it. She and a team of co-researchers from Georgetown, University of Michigan, University of Minnesota and Pennsylvania State University wanted to find out if online conversations and activity in geotagged locations could be used as a proxy for cases in those areas.
From mid-January to mid-April 2020, the project looked at COVID-19 messages on Twitter: more than 11 million tweets, 55 million retweets and 1.5 million quote tweets. It analyzed the volume and themes of the discussions, as well as the myths and misinformation circulating in them.
Findings from such research, says Singh, could help epidemiologists track the virus early in a pandemic, before more reliable data are available. It also might assist public-health officials in understanding disease spread when exact case numbers are not known. Preliminary findings from the research showed that, in some countries, the volume of Twitter conversations about COVID-19 led new cases by two to five days.
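To make the idea of conversation volume "leading" case counts concrete, here is a minimal sketch of one common way to estimate such a lead: shift one daily series against the other and see which shift gives the strongest correlation. This is an illustration only, not the research team's method. The function names (`pearson`, `best_lead`), the `max_lag` parameter and the numbers are all hypothetical, and the data are synthetic, built so that cases trail tweets by exactly three days.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

def best_lead(tweets, cases, max_lag=7):
    """Correlate tweets on day t with cases on day t + lag for each lag.

    Returns the lag (in days) with the highest correlation, plus all scores.
    """
    scores = {}
    for lag in range(max_lag + 1):
        x = tweets[:len(tweets) - lag]  # tweet counts that have a matching later case count
        y = cases[lag:]                 # case counts `lag` days later
        scores[lag] = pearson(x, y)
    return max(scores, key=scores.get), scores

# Synthetic daily counts (hypothetical, for illustration only).
tweets = [5, 9, 4, 12, 20, 15, 30, 22, 41, 35, 50, 38, 60, 72]
# Cases are constructed to follow tweet volume three days later.
cases = [0, 0, 0] + [2 * t for t in tweets[:-3]]

best_lag, scores = best_lead(tweets, cases)
print(best_lag)
```

Because the synthetic case series is just the tweet series shifted by three days, the sketch recovers a three-day lead. Real data are far noisier, and a high lagged correlation alone would not show that tweets predict cases.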
By July 2020, the expanding project encompassed 100 million tweets and retweets, according to Singh. Now under study: more data variables and indirect indicators of virus movement.
Singh wants to broaden the information used from English-language tweets to those in native languages of other regions. “There are always parts of the world that don’t get the attention they need,” she says. “Developing these methodologies, even if they’re not applied extensively with coronavirus, might be our first-available avenue to information with the next virus.”
To explore COVID-19 misinformation, the researchers are looking at shared URL links to see if poor quality or fake information is creating a network of false ideas. “You hope that type of coordination isn’t emerging,” says Singh, “but with social media and an open internet, it makes it more plausible that it’s going to be there.”
Early findings from the research showed that while Twitter users share links to high-quality information sources slightly less than misinformation and fake news sources, a well-connected network of low-quality COVID-19-related information is developing on the web, and both high-quality health and news sources are connecting to this community.
“There’s very little public policy around information quality issues related to social media,” Singh says. “The research can help understand how to identify biases and facts.”
Impact of #BlackLivesMatter
In May 2020, when George Floyd was killed by a Minneapolis police officer, online posts carrying #BlackLivesMatter or #BLM flooded Twitter with calls for racial justice and police accountability. Tweets quickly shared the video and news reports of Floyd’s death and the protests that followed in the U.S. and around the world.
At the same time, due to the COVID-19 pandemic, many people were at home and engaging more on social media. Discussions about racism and collective actions, such as participating in protests and fundraising for support groups, multiplied rapidly. Both social media and in-person social action became important for amplifying the Black Lives Matter movement.
“Historic change requires a really big push,” says Singh. “The online conversation isn’t happening in isolation. Things on the ground push the online movement and the online movement pushes things on the ground. That symbiosis makes the difference.”
Singh and her research team had been measuring Twitter activity about Black Lives Matter since early 2019 for a possible study comparing it to the MeToo movement. That gave them data to follow the skyrocketing volume of tweets after the video of Floyd’s death first appeared. Singh says social-media engagement about Black Lives Matter, racism and racial inequities remained higher than it had been before.
“It was obvious this was a moment where an online movement was helping to potentially propel change,” she says. “This is a moment we can’t let pass by.” Powered by its online and on-the-ground strengths, Singh notes, the Black Lives Matter movement could improve racial equity by influencing public policies and laws. That’s similar, she says, to how the MeToo movement led to more women running for political office in 2018 and winning.
Singh’s research continues on social media and Black Lives Matter, COVID-19, MeToo and more, including a gun-violence project.
“The common thread is how do we develop interpretable, reliable algorithms to gain insight into public opinion and human behavior,” she says. “That’s the computer science of what we do.”
This post originally appeared in Policy Perspectives, the annual alumni magazine from the McCourt School of Public Policy.