Category: Discovery & Impact

Title: Data Essentials: Data Analysis in Python

Python is a high-level and dynamic programming language that allows researchers to process large-scale data more efficiently than traditional statistical packages and provides libraries for accessing less structured data from the web.

The first workshop, held on October 17, covered how to use data manipulation libraries to load, store, filter, and clean data and to compute basic statistics. The second workshop, held on October 24, built on the first and taught attendees how to use APIs to collect data from the web and then use those data in both a clustering analysis and a regression analysis.

Laila Wahedi, MDI’s Postdoctoral Fellow, led these two workshops. Dr. Wahedi applies data analysis to her research on networks of militant organizations and the impact of alignment patterns among these groups. She received her Ph.D. in Political Science from Georgetown in 2017.

The workshops were well received by our attendees. One student, Caitlin Karniski, a Ph.D. candidate in the Department of Biology, came to our workshops to expand her experience in Python. She commented that “the workshops were incredibly useful, not only in clearly demonstrating the syntax, but also the power of Python and what it can do.” Karniski added, “It should also be noted that the majority of data workshops I have attended in the past have had so much time taken up by logistical snags – installing and loading software/libraries, synching code, etc. – but the workflow of the workshop (working simultaneously off of lecture slides and annotatable code notebooks) was clean and efficient!”

Another attendee, Matt Pearson, the Data Analytics Manager at the Graduate Career Center in the School of Foreign Service, said that “McCourt’s MDI workshops present a unique opportunity to learn how to carry out the core building blocks of data science research and python-based data processing. I found it very valuable to walk through the process of exploring and cleaning a data set and producing an actionable visualization, as well as dip my feet in web scraping and API use. I feel much more confident in exploring these tools myself and developing my own python and data science abilities.”

One of the goals behind MDI’s data workshop series this semester is to offer students, staff, and faculty at Georgetown University the opportunity to expand their skills in data research and data analysis and to apply essential big data tools to their own research. The workshops are all free and are led by the Institute’s experienced practitioners and collaborators.

Online versions of these two workshops on Data Analysis in Python will be available in the spring.