#METOO Movement Twitter Data Mined by Computer Science Professor

September 10, 2018
Prof. Singh

Lisa Singh, a computer science professor and research professor in the McCourt School of Public Policy’s Massive Data Institute, works on numerous aspects of data-driven computer science. One of her most recent projects is mining Twitter data around the #MeToo hashtag in collaboration with six other Georgetown professors. This research is supported by the Massive Data Institute and the Gender Justice Initiative. The #MeToo movement against sexual harassment and assault, particularly in the workplace, went viral in October 2017.

First thoughts

“I have a really strong interest in understanding the strengths and limitations of using social media data within traditional computer and social science research. I think it can be used to impact public policy and societal scale issues if we can harness it in a smart way while respecting ethical and privacy considerations.”

Why study #MeToo?

“There’s a legitimate interest in seeing how we can rectify the inequality that exists in the workplace and prevent abuse from happening in the first place. I think this is the time to engage in that conversation with academics across different fields.”

Other faculty collaborators

Linguistics professor Deborah Tannen
Anthropology professor and chair Denise Brennan
Law professors Naomi Mezey, Jamillah Williams, Nan Hunter and Deborah Epstein

How do you mine data?

“Mining data means that you are looking for patterns in the data to better characterize or describe a behavior or condition. For our data, we develop algorithms or use algorithms to determine the main topic of the text, the tone of text, the stance or position people have in the text (are they for or against an issue), etc. We also use a mix of data mining and machine learning to learn how to infer race, occupation, gender or other demographics of individuals who are tweeting about the movement.”
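To make the idea of scoring the "tone of text" concrete, here is a minimal sketch of one simple approach: counting words from small positive and negative word lists. This is not the research team's method, which the article does not describe in detail. The word lists, the function names `tokenize` and `tone`, and the sample tweets are all invented for illustration.

```python
import re

# Hypothetical, tiny word lists for illustration only. Real tone analysis
# would rely on a validated lexicon or a trained classifier.
POSITIVE = {"brave", "support", "proud", "thank", "hope"}
NEGATIVE = {"abuse", "assault", "harassment", "angry", "afraid"}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def tone(text):
    """Label text by comparing counts of positive and negative words."""
    words = tokenize(text)
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Invented sample tweets, not drawn from the study's data.
sample_tweets = [
    "So proud of everyone sharing their story. Thank you. #MeToo",
    "Harassment at work left me angry and afraid. #MeToo",
    "Reading the news today. #MeToo",
]
labels = [tone(t) for t in sample_tweets]
```

A word-count approach like this is easy to inspect but misses negation, sarcasm and context, which is one reason researchers often turn to machine learning models for tone and stance.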

Early findings

Number of #MeToo Tweets to date: More than 8.1 million
The most prevalent topics of conversation have been a) the movement/activism, b) sexual abuse and assault, c) harassment and d) politics.
Some top associated hashtags: #TimesUp, #WithYou, #Resist and other political hashtags
Over 100 occupations are mentioned at least 100 times. The professions relate to all walks of life, including professors and students.
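
As a rough illustration of how "top associated hashtags" could be tallied, the sketch below counts hashtags that appear in the same tweet as #MeToo. It is an assumption about one plausible approach, not the team's actual pipeline. The function names and sample tweets are hypothetical.

```python
import re
from collections import Counter


def hashtags(text):
    """Return the set of lowercased hashtags found in the text."""
    return {tag.lower() for tag in re.findall(r"#\w+", text)}


def co_occurring_hashtags(tweets, anchor="#metoo"):
    """Count hashtags that appear in the same tweet as the anchor hashtag."""
    counts = Counter()
    for text in tweets:
        tags = hashtags(text)
        if anchor in tags:
            # Count each co-occurring tag once per tweet, excluding the anchor.
            counts.update(tags - {anchor})
    return counts


# Invented sample tweets, not drawn from the study's data.
sample_tweets = [
    "#MeToo and #TimesUp",
    "Standing with survivors #MeToo #WithYou #TimesUp",
    "Unrelated post #Resist",
]
counts = co_occurring_hashtags(sample_tweets)
top_tag = counts.most_common(1)
```

The same counting pattern could be applied to occupation words instead of hashtags, keeping only those mentioned at least some threshold number of times.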

General tone of conversations

More negative than positive.

Moment garnering most positive social activity

Dec. 6, 2017, when Time magazine named “The Silence Breakers,” or those who speak out against sexual assault, as its “Person of the Year.”

What else does the team want to know?

The research team is developing algorithms to more accurately determine some demographics of individuals using the hashtag. This will allow the team to better understand this population and what different segments of it are conversing about. The team will then begin to investigate more complex questions such as:

Whether there are particularly vocal or silent communities in the discussions and why
How the #MeToo movement may be utilized differently by women of color
What the level of discussion is around work-related posts versus domestic/home-related posts for different demographic groups
Whether this movement is a leading indicator of societal change, such as reporting claims and changes in gender roles

“We will also combine our research findings with data from other sources, including the Equal Employment Opportunity Commission (EEOC), Pew Research surveys, and different NGOs to improve our understanding of how the landscape varies across race, gender and occupation.”

Policy applications

Singh and her collaborators think it’s likely that many people who talk about harassment and bullying online do not always seek help or report the situation to the EEOC.

“What we want to do is to understand whether or not what we see on social media gives agencies and others looking at these types of issues more insight and information than the data they collect. We may find that there is a vulnerable population or industry that one of these agencies does not have enough insight into, but that the agencies need to consider when developing policy.”

Future of movement

“The conversation is heavily influenced by daily and weekly events. This is one of the reasons I believe the movement continues to be large: events related to it are always happening, and the public, including well-known individuals, continues to comment on them.”

Singh’s other activities

Co-author of the upcoming book, Words that Matter: How the News and Social Media Shaped the 2016 Presidential Campaign (Brookings Institution Press, 2019)
Studying privacy on the web
Using big data to understand forced migration
Analyzing animal social structures
Faculty advisor, Georgetown Women Coders