An integrated approach to training data scientists – Stanford Report – Stanford University News

Every day, data scientists are analyzing vast amounts of information about the world, using computational methods to find new ways to understand a problem or phenomenon, and deciding what to do about it.

But it's not enough to use data on its own; it must be understood within its social and political context as well, according to Stanford political scientist Jeremy Weinstein. This year, Weinstein, along with Stanford statisticians Guenther Walther and Chiara Sabatti, has launched two new degrees: a Bachelor of Science in Data Science and a Bachelor of Arts in Data Science & Social Systems.

Jeremy Weinstein is the faculty director of the BA in Data Science and Social Systems and a professor of political science. Mallory Nobles, right, is the program's associate director. (Image credit: Andrew Brodhead)

"There's basically no new technological frontier that doesn't depend on or engage in some important way with human behavior or a political or social institution," explained Weinstein, a professor of political science in the School of Humanities and Sciences who serves as faculty director of the BA program in Data Science & Social Systems. "For example, when staffing the tech industry of the future, you want people who can effortlessly move between the technical team, the policy team, and the trust and safety team. The Data Science and Social Systems program is designed to prepare professionals who can work at those intersections."

This past spring, over 90 students took the new gateway course for the major, DATASCI 154: Solving Social Problems with Data. Throughout the course, which Weinstein co-taught with Mallory Nobles, the program's associate director, students developed skills in quantitative analysis, modeling, and coding, but also honed their ability to frame problems, choose appropriate designs, and interpret data as it relates to its social and political environment.

The course brought two mindsets together: a social science approach, rooted in an understanding of causal inference, and an engineering approach, based in learning algorithmic design and optimization.

As Weinstein and Nobles emphasized to their students, these perspectives are interconnected.

"When you ask and answer causal questions about a social problem, you're deepening your understanding of the underlying causes, which can give you clues about how you might go about solving it, and when you design an algorithmic solution, you then want to understand its effect when deployed in the world, which brings you back to causal inference," said Weinstein, who is also the faculty director of Stanford Impact Labs.

Students explored the value of these different approaches through modules designed with scholars from different fields at Stanford.

For example, Jennifer Pan in the Communication Department introduced students to the role of data science and causal inference techniques in studying the impact of social media on polarization and the spread of disinformation. Marshall Burke from the Department of Earth System Science engaged students in thinking through how machine learning approaches can help measure a changing climate, while social scientific methods are critical for understanding the impact of mitigation and adaptation policies. Ramesh Johari from the Department of Management Science and Engineering, along with David Scheinker from the School of Medicine, exposed students to the challenge of delivering equitable access to healthcare and how algorithmic approaches can improve delivery of patient care through the lens of their work on diabetes at Stanford Medicine.

Students learned how they too can be at ease shifting their perspective between engineering and social sciences. Class assignments emphasized statistics, computer science, and math in tandem with topics in the social and behavioral sciences, like psychology, sociology, economics, and political science. Their final project was to write a research proposal to tackle a social problem of their choosing.

As Josh Orszag learned, getting the data is the easy part. Data can't get you very far unless you have a meaningful research question.

Josh Orszag is a Data Science and Social Systems major who took Solving Social Problems with Data this past spring. (Image credit: Andrew Brodhead)

"If you don't have the right research question, you're not going to get anywhere," said Orszag, a Data Science and Social Systems major interested in issues related to democracy and governance. "The challenge is figuring out what problem or predicament you want your data to answer."

Orszag teamed up with Ava Kerkorian, a prospective Data Science and Social Systems major, to think about how to build trust in the election process.

Throughout their research design process, Kerkorian said she and Orszag went back and forth with each other as they figured out how such a complex issue could be tackled in a way that was specific, scalable, and also actionable.

"So many times during this project, we had to take a pause and ask ourselves, how do we measure trust? What would success look like? What is confidence? Are we even sure this is something we want?" Kerkorian said.

What they ended up proposing was a study of whether a nudge (a concept from behavioral economics that sways behavior through small suggestions or positive reinforcement) explaining how redistricting works from an Independent Redistricting Commission could influence people's attitudes about the fairness of an election.

The course made Serena Lee, also a Data Science and Social Systems major, think critically about what it means to be a responsible data scientist.

Serena Lee, who took Solving Social Problems with Data, presented her final project at a poster presentation at the end of the quarter.

"This class taught me that the work starts with how to collect data because that has a lot of value-laden decisions, from whom to involve in the dataset to what questions to ask, what wording to use, and how far in the past to look at the data," Lee said. For their final project, Lee and Annie Zhu wanted to explore the influence of video-based misinformation in comparison to text-based misinformation. Specifically, they proposed studying different ways platforms could flag potentially harmful posts.

Eva Gorenburg, who also took the class this quarter, said learning the ins and outs of research design has changed how she now sees data.

"I think it's really easy to take numbers as objective fact, but what we learned is even in studies that seem super quantitative and objective, there are tons of choices in the study design that impact results," Gorenburg said.

Students also learned that what they choose to measure and not measure and how they use their data all have social consequences.

"If you just rely on observational study, observation or opinion, there're so many essential experiences that you're leaving out," said Emily Winn, an environmental systems engineering major. "Solving social problems with data allows us to see things on a much broader scale than previously before."

Winn and Gorenburg worked together on their final project, a proposal to study the social impacts of arsenic poisoning on women in rural Bangladesh, where little data on its effects exists. Specifically, they wanted to know whether arsenicosis would lower a woman's likelihood of marriage, which is essential for the economic and social security of women living in the region.

Understanding social problems is not the same as solving them.

"Social problems exist for complex reasons," said Weinstein. "Solving problems involves significant stakeholder consultation and understanding what the pathway is from a new insight or a new tool to actual change in the world."

Esha Thapa was one of over 90 students who took Solving Social Problems with Data this past spring. (Image credit: Andrew Brodhead)

For Esha Thapa, a Data Science and Social Systems major, the class marks the beginning of an interesting journey examining these dynamics in greater depth.

"It's definitely not a process that ends with the quarter ending," said Thapa. "It's something that we need to take with us for the rest of our careers and this is a great gateway course in that respect."

Following Solving Social Problems with Data, students in the major will continue to take a range of core classes in data science, ethics, and social sciences. In their senior year, students will take a capstone practicum where they will apply computational and statistical methods to address a social issue in a real-life setting.

Data Science majors can pursue one of two tracks: a Bachelor of Science, overseen by Walther and Sabatti, or a Bachelor of Arts, which Weinstein and Nobles direct.
