February 5th, 2019
To highlight the people developing impactful projects on the PeakMetrics platform, we bring you the PeakMetrics Developer blog series.
Santoshi G, a graduate student at Columbia University studying Quantitative Methods in the Social Sciences, is a data enthusiast from Singapore. She is currently completing her MA after receiving a BA in Philosophy, Politics, and Economics from Oxford.
As part of her coursework, Santoshi built an R SDK for PeakMetrics’s News API. You can download it here: https://github.com/gsantoshi/civicfeedR. We spoke to Santoshi about her work, her coursework, and her plans to work for the Singaporean government.
Nick: Tell us, what is Quantitative Methods in the Social Sciences?
Santoshi: It is a statistics and data science course with a focus on applications in social science research. Last semester, I took a class on accessing and wrangling data in R. For the final project of the class, we had to create and publish an R package.
I wanted to create an API package because I thought that would be a meaningful way to apply many of the skills I’d learnt in the class and to interact with an API I found interesting.
I did my undergrad in philosophy, politics and economics, so current affairs and the news have always been an important part of my life. I was searching for news APIs when I found PeakMetrics’s API on ProgrammableWeb. I noticed that there wasn’t an R package for it, and started working on one, specifically for the news endpoint.
I started the project in mid-December and submitted it at the end of the semester, but continued working on it over vacation just to polish it up.
Nick: How did you decide how to structure the package and which endpoints to expose?
Santoshi: I noticed that you had a legislator API, which I also found really cool, but since I was primarily interested in APIs that provide news in an understandable format, I decided to focus on the News API first. I thought your API documentation made it pretty clear how the package should be structured.
I also integrated R-specific functionality into my package. Besides basic functions that simplify the process of sending API requests, I built a number of functions that let you extract information and further structure the retrieved data.
For someone who wants to get information from the API but who isn’t as familiar with handling APIs as they are with using R packages, this package insulates them from having to know what’s going on behind the scenes. It’s also designed to be self-contained.
Nick: Why R?
Santoshi: R was the first programming language I learned, as it’s very popular in social science and statistics circles. More recently, I’ve also picked up Python, which is the language of choice for many data scientists, but in terms of packages, R has a good ecosystem for visualization and data wrangling.
Nick: Can you tell us more about your coursework?
Santoshi: We have a number of core courses focused on statistics and social science research. These courses train us to formulate good social science research questions and back them up with a rigorous methodology.
For my electives last semester, I took classes in natural language processing and data wrangling. This semester, I’ll be taking classes in Bayesian statistics, machine learning, and data visualization.
In essence, the QMSS (Quantitative Methods in the Social Sciences at Columbia) program allows students to focus on data science as it relates to social science. Now that social media exists, it seems only appropriate that social scientists learn to harness the data it produces instead of relying on old techniques.
Nick: What’s next for you?
Santoshi: I’m looking to go back to Singapore to work in the government. There’s a rising trend around the world toward data-driven policymaking. In Singapore, there’s a government agency called GovTech, which works on providing efficient digital services that change how the government has traditionally interacted with the public. There are also numerous efforts to make the city smarter. All of this generates data, which is an invaluable resource for policymakers.
As for my shorter-term plans, I’m in my last semester and plan to graduate in May. I’m currently looking for data-related internships in the US, particularly in the media and social good space, to pursue after graduation.
To learn more about Santoshi, visit her website here. Thank you, Santoshi, for taking the time to chat with us and for growing our developer community!