I am a Ph.D. student in Computer Science at MIT CSAIL, advised by Prof. David Karger and a member of the Haystack Group and UID Group at MIT.

I work on designing and building systems to improve discourse, collaboration, and understanding on the web, with applications to social media, news, political discourse, education, and civic engagement. I also conduct research on computational analysis of large-scale social data to understand topics in the social sciences. My general interests are social computing, HCI, and computational social science.

Prior to MIT, I worked as a software engineer, completed a Master's in CS as a Gates Scholar at Cambridge, and earned a Bachelor's in CS from Rutgers.

My research is supported by a Google Fellowship and an NSF Graduate Research Fellowship.



Squadbox: A Tool to Combat Email Harassment Using Friendsourced Moderation

Communication platforms have struggled to provide effective tools for people facing harassment online. We conducted interviews with 18 recipients of online harassment to understand their strategies for coping, finding that they often resorted to asking friends for help. Inspired by these findings, we explore the feasibility of friendsourced moderation as a technique for combating online harassment. We present Squadbox, a tool to help recipients of email harassment coordinate a "squad" of friend moderators to shield and support them during attacks. Friend moderators intercept email from strangers and can reject, organize, and redirect emails, as well as collaborate on filters. Squadbox is designed to let its users implement highly customized workflows, as we found in interviews that harassment and preferences for mitigating it vary widely.

An Open Tool to Fight Harassment - Squadbox | An Open Project Spotlight by Mozilla Open Leaders

Squadbox: A Tool to Combat Email Harassment Using Friendsourced Moderation. Kaitlin Mahar, Amy X. Zhang, David Karger. CHI '18.
[pdf] [website] [github] [blog]


A Structured Response to Misinformation: Defining and Annotating Credibility Indicators in News Articles

The proliferation of misinformation in online news and its amplification by platforms are a growing concern, leading to numerous efforts to improve the detection of and response to misinformation. Given the variety of approaches, collective agreement on the indicators that signify credible content could allow for greater collaboration and data-sharing across initiatives. In this paper, we present an initial set of indicators for article credibility defined by a diverse coalition of experts. These indicators originate both from within an article’s text and from external sources or article metadata. As a proof-of-concept, we present a dataset of 40 articles of varying credibility annotated with our indicators by 6 trained annotators using specialized platforms. We discuss future steps including expanding annotation, broadening the set of indicators, and considering their use by platforms and the public, towards the development of interoperable standards for content credibility.

The Credibility Coalition is working to establish the common elements of trustworthy articles by journalism.co.uk

A Structured Response to Misinformation: Defining and Annotating Credibility Indicators in News Articles. Amy X. Zhang, Aditya Ranganathan, Sarah Emlen Metz, Scott Appling, Connie Moon Sehat, Norman Gilmore, Nick B. Adams, Emmanuel Vincent, Jennifer 8. Lee, Martin Robbins, Ed Bice, Sandro Hawke, David Karger, and An Xiao Mina. WWW '18 Companion.
[pdf] [website]


Evaluation and Refinement of Clustered Search Results with the Crowd

When searching on the web or in an app, results are often returned as lists of hundreds to thousands of items, making it difficult for users to understand or navigate the space of results. Research has demonstrated that using clustering to partition search results into coherent, topical clusters can aid in both exploration and discovery. In this work, we investigate using crowd-based human evaluation to inspect, evaluate, and improve clusters to create high-quality clustered search results at scale. We introduce a workflow that begins by using a collection of well-known clustering algorithms to produce a set of clustered search results for a given query. Then, we use crowd workers to holistically assess the quality of each clustered search result in order to find the best one. Finally, the workflow has the crowd spot and fix problems in the best result in order to produce a final output. We evaluate this workflow on 120 top search queries from the Google Play Store, some of which have clustered search results produced through evaluation and refinement by experts.

Evaluation and Refinement of Clustered Search Results with the Crowd.
Amy X. Zhang, Wei Chai, Jinjun Xu, Lichan Hong, Ed Chi. ACM Transactions on Interactive Intelligent Systems: Special Issue on Human-Centered Machine Learning.
[to appear]



Characterizing Online Discussion Using Coarse Discourse Sequences

We present a novel method for classifying comments in online discussions into a set of coarse discourse acts towards the goal of better understanding discussions at scale. We collect and release a corpus of over 9,000 threads comprising over 100,000 comments manually annotated via paid crowdsourcing with discourse acts and randomly sampled from the site Reddit. Using our corpus, we demonstrate how the analysis of discourse acts can characterize different types of discussions, including discourse sequences such as Q&A pairs and chains of disagreement, as well as different communities. Finally, we conduct experiments to predict discourse acts using our corpus, finding that structured prediction models such as conditional random fields can achieve an F1 score of 75%.

Coarse Discourse: A Dataset for Understanding Online Discussions by Google Research Blog

Characterizing Online Discussion Using Coarse Discourse Sequences. Amy X. Zhang, Bryan Culbertson, Praveen Paritosh. ICWSM '17.
[pdf] [slides and talk] [bibtex] [github]


Using Student Annotated Hashtags and Emojis to Collect Nuanced Affective States

Determining affective states such as confusion from students' participation in online discussion forums can be useful for instructors of a large classroom. In this work, we harness affordances prevalent in social media to allow students to self-annotate their discussion posts with a set of hashtags and emojis, a process that is fast and cheap. From a dataset of over 25,000 discussion posts from two courses containing self-annotated posts by students, we demonstrate how we can identify linguistic differences between posts expressing confusion versus curiosity, achieving 83% accuracy at distinguishing between the two affective states.

Using Student Annotated Hashtags and Emojis to Collect Nuanced Affective States. Amy X. Zhang, Michele Igo, Marc Facciotti, David Karger. Learning@Scale '17. Poster paper.
[pdf] [bibtex]


Wikum: Bridging Discussion Forums and Wikis using Recursive Summarization

Large-scale discussions between many participants abound on the internet today, on topics ranging from political arguments to group coordination. But as these discussions grow to tens of thousands of posts, they become ever more difficult for a reader to digest. In this article, we describe a workflow called recursive summarization, implemented in our Wikum prototype, that enables a large population of readers or editors to work in small doses to refine out the main points of the discussion. More than just a single summary, our workflow produces a summary tree that enables a reader to explore distinct subtopics at multiple levels of detail based on their interests.
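As a rough illustration of the summary-tree idea, the sketch below shows how summaries of small groups of comments can themselves be summarized, and how a reader could then expand the tree to a chosen level of detail. It is not Wikum's implementation: the `Node` class, the `summarize` and `read` helpers, and the sample comments are all hypothetical, and the summary text is assumed to be written by human editors.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """A comment, or a summary that stands in for the nodes beneath it."""
    text: str
    children: List["Node"] = field(default_factory=list)
    is_summary: bool = False


def summarize(nodes: List[Node], summary_text: str) -> Node:
    """Place a small group of nodes under one editor-written summary node."""
    return Node(text=summary_text, children=list(nodes), is_summary=True)


def read(node: Node, depth: int) -> List[str]:
    """Return the text shown when the reader expands `depth` levels below `node`."""
    if depth == 0 or not node.children:
        return [node.text]
    lines: List[str] = []
    for child in node.children:
        lines.extend(read(child, depth - 1))
    return lines


# Hypothetical discussion: four comments, summarized in two small steps,
# then summarized again at the top (the "recursive" part).
c1 = Node("I think the policy helps renters.")
c2 = Node("Agreed, rents fell in my city.")
c3 = Node("It discourages new construction.")
c4 = Node("Builders left our town after it passed.")
pro = summarize([c1, c2], "Supporters cite lower rents.")
con = summarize([c3, c4], "Critics cite reduced construction.")
root = summarize([pro, con], "Debate over the policy's effect on housing.")

for depth in range(3):
    print(depth, read(root, depth))
```

Each call to `summarize` covers only a handful of nodes, which mirrors the "small doses" of work described above, while `read` lets a reader stop at the top-level summary or drill down into a subtopic.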

Cutting down the clutter in online conversations by MIT News
Presentation at Wikimedia Research Showcase [YouTube video]

Wikum: Bridging Discussion Forums and Wikis using Recursive Summarization. Amy X. Zhang, Lea Verou, David Karger. CSCW '17.
[pdf] [slides and talk] [bibtex] [website]



Mavo: Creating Interactive Data-Driven Web Applications by Authoring HTML

Many people can author static web pages with HTML and CSS but find it hard or impossible to program persistent, interactive web applications. We show that for a broad class of CRUD (Create, Read, Update, Delete) applications, this gap can be bridged. Mavo extends the declarative syntax of HTML to describe Web applications that manage, store and transform data. Using Mavo, authors with basic HTML knowledge define complex data schemas implicitly as they design their HTML layout. They need only add a few attributes and expressions to their HTML elements to transform their static design into a persistent, data-driven web application whose data can be edited by direct manipulation of the content in the browser.

Introducing Mavo: Create Web Apps Entirely By Writing HTML! by Lea Verou - Smashing Magazine

Mavo: Creating Interactive Data-Driven Web Applications by Authoring HTML. Lea Verou, Amy X. Zhang, David Karger. UIST '16.
[pdf] [website]


Exploring Social Browsing

While the web contains many social websites, people are generally left in the dark about the activities of other people traversing the web as a whole. We explore the potential benefits and privacy considerations around generating a real-time, publicly accessible stream of web activity where users can publish chosen parts of their web browsing data. We also develop a new social media system for collecting, sharing, and visualizing aspects of one's browsing history.

Browsing in public by MIT News
MIT's Eyebrowse To Rank and Review Internet Sites, While Retaining Privacy by Slashdot
MIT proposes 'Eyebrowse' scheme to rank and review the entire internet by The Stack
Eyebrowse project lets users make web browsing history public and the accompanying radio segment by CBC News
Eyebrowse Aims to Socialize Your Web Surfing Experience by Inverse
MIT's Eyebrowse lets users make their browsing history public by ComputerWorld
System lets web users share aspects of their browsing history with friends, researchers by Phys.org

Opportunities and Challenges Around a Tool for Social and Public Web Activity Tracking. Amy X. Zhang, Joshua Blum, David Karger. CSCW '16.
[pdf] [slides and talk] [website] [github]

Eyebrowse: Selective and Public Web Activity Sharing. Amy X. Zhang, Joshua Blum, David Karger. Demo Paper. CSCW '16.
[demo pdf]

Reimagining Web Activity Tracking for Social Applications. Amy X. Zhang, Joshua Blum, David Karger. Workshop Paper. Everyday Surveillance Workshop @ CHI '16.
[workshop pdf] [Workshop website]


Gender and Ideology in the Spread of Anti-Abortion Policy

In the past few years, an unprecedented wave of anti-abortion policies was introduced and enacted in state governments in the U.S., affecting millions of constituents. We study this rapid spread of policy change as a function of the underlying ideology of constituents. We examine over 200,000 public messages posted on Twitter surrounding abortion in the year 2013, a year that saw 82 new anti-abortion policies enacted. From these posts, we characterize people's expressions of opinion on abortion and show how these expressions align with policy change on these issues.

Gender and Ideology in the Spread of Anti-Abortion Policy. Amy X. Zhang, Scott Counts. CHI '16.
[pdf] [slides and talk] [bibtex]


Conference Recommendation and Meetups

Confer is a tool for conference schedule organization and session/paper recommendation. Using its interface and data, we are exploring ways to facilitate meetings, particularly between new and established members of research communities. We piloted a meetup session at CSCW '15 called "Confer Coffee", in which we formed groups of people who liked similar papers on Confer and had them gather in person at the conference. We are interested in piloting more sessions at future conferences that use Confer.

Confer: A Conference Recommendation and Meetup Tool. Amy X. Zhang, David Karger, Anant Bhardwaj. Demo Paper. CSCW '16.
[demo pdf] [website] [github]



Reimagining the Mailing List

Mailing lists have existed since the early days of email and are still widely used today, even as more sophisticated online forums and social media websites proliferate. We explore why members prefer mailing lists to other group communication tools. But we also identify several tensions around mailing list usage that appear to contribute to dissatisfaction with them.

One way to reduce email stress: Re-invent the mailing list by MIT News
Hacker News

Mailing Lists: Why Are They Still Here, What's Wrong With Them, and How Can We Fix Them? Amy X. Zhang, Mark Ackerman, David Karger. CHI '15.
[pdf] [bibtex] [slides and talk] [video] [website] [github]


Modeling Ideology and Predicting Policy with Social Media

We study the many voices discussing an issue within a constituency and how they reflect ideology and may signal the outcome of important policy decisions. Focusing on the issue of same-sex marriage legalization, we examine almost 2 million public Twitter posts related to same-sex marriage in the U.S. states over the course of 4 years starting from 2011.

Modeling Ideology and Predicting Policy Change with Social Media: Case of Same-Sex Marriage. Amy X. Zhang, Scott Counts. CHI '15.
Best of CHI Honorable Mention.
[pdf] [bibtex] [slides and talk] [video]


Visualizing Text Corpora to Compare Media Frames

We develop a visualization technique and visual analytic system that enables the study of media frames across text corpora. In particular our system allows scholars or other analysts to compare media frames in a visualization called the Compare Cloud, which explicitly maps word prevalence and context information between two corpora.

Compare Clouds: Visualizing Text Corpora to Compare Media Frames. Nick Diakopoulos, Dag Elgesem, Andrew Salway, Amy X. Zhang, Knut Hofland. TextVis Workshop @ IUI '15.
[pdf] [bibtex]


Spreadsheet-backed Web Apps

We present a system for creating basic web applications using spreadsheets in place of a server and using HTML to describe the client UI. Authors connect the two by placing spreadsheet references inside HTML attributes. Data computation is provided by spreadsheet formulas. The result is a reactive read-write-compute web page without a single line of JavaScript code.

Cloudstitch: Beautiful Apps without the Programming Hassle by Rough Draft Ventures
Now a startup called Cloudstitch at http://www.cloudstitch.com

Spreadsheet Driven Web Applications. Edward Benson, Amy X. Zhang, David Karger. UIST '14.
[pdf] [bibtex] [video]

Kinetic Scrolling on Mobile and Tablets

To support navigation of long documents on touchscreen devices, we introduce content-aware kinetic scrolling, a novel scrolling technique that dynamically applies pseudo-haptic feedback in the form of friction around points of high interest within the page. This allows users to quickly find interesting content while exploring without further cluttering the limited visual space.
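To make the friction idea concrete, here is a minimal simulation sketch, not the technique as implemented in the paper. It assumes a one-dimensional page, a flick that gives the view an initial velocity, and a constant extra friction term applied whenever the view is within a fixed radius of a point of interest. The function name, parameters, and constants are all hypothetical.

```python
from typing import List, Sequence


def scroll_positions(
    start: float,
    velocity: float,
    interest_points: Sequence[float],
    base_friction: float = 0.05,   # assumed per-step velocity loss everywhere
    extra_friction: float = 0.30,  # assumed additional loss near interesting content
    radius: float = 50.0,          # assumed size of the "sticky" zone, in pixels
    min_speed: float = 0.5,        # scrolling stops below this speed
) -> List[float]:
    """Simulate a flick and return the scroll position after every step."""
    pos = start
    positions = [pos]
    while abs(velocity) > min_speed:
        pos += velocity
        # Slow down faster while the view is close to any point of interest.
        near = any(abs(pos - p) <= radius for p in interest_points)
        friction = base_friction + (extra_friction if near else 0.0)
        velocity *= 1.0 - friction
        positions.append(pos)
    return positions


# The same flick, with and without a point of interest at 300 px.
plain = scroll_positions(start=0.0, velocity=40.0, interest_points=[])
sticky = scroll_positions(start=0.0, velocity=40.0, interest_points=[300.0])
print("stops at", round(plain[-1]), "without interest points")
print("stops at", round(sticky[-1]), "with an interest point at 300")
```

In this toy model, the flick that would otherwise carry the view far down the page comes to rest inside the zone around the point of interest, which is the effect the added friction is meant to produce without adding any visual markers.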

Content-Aware Kinetic Scrolling for Supporting Web Page Navigation. Juho Kim, Amy X. Zhang, Jihee Kim, Robert Miller, Krzysztof Gajos. UIST '14.
[pdf] [bibtex] [video]

Moral Framing in Climate Change Blog Discourse

In this work we develop a novel operationalization of moral evaluation frames and study their use within a corpus of blogs discussing climate change. We develop a text visualization tool called Lingoscope that allows the user to observe and filter the contextual terms that convey moral framing across large volumes of text, as well as to drill down to specific examples.

Identifying and Analyzing Moral Evaluation Frames in Climate Change Blog Discourse. Nick Diakopoulos, Amy X. Zhang, Dag Elgesem, Andrew Salway. ICWSM '14.
[pdf] [bibtex] [poster]

Controversy and Sentiment in Online News

In this work, we take a data-driven approach to understand how controversy interplays with emotional expression and biased language in the news. We begin by introducing a new dataset of controversial and noncontroversial terms collected using crowdsourcing. Then, focusing on 15 major U.S. news outlets, we compare millions of articles discussing controversial and non-controversial issues over a span of 7 months.

Why Media Bias Has Nowhere to Run and Hide from Data Science by CrowdFlower

Controversy and Sentiment in Online News. Yelena Mejova, Amy X. Zhang, Nicholas Diakopoulos, Carlos Castillo. Computation + Journalism '14.
[pdf] [bibtex]



Hoodsquare: Defining Neighborhoods

Information garnered from activity on location-based social networks can be harnessed to characterize urban spaces and organize them into neighborhoods. We adopt a data-driven approach to the identification and modeling of urban neighborhoods using location-based social networks.

Good Neighbours by Gates Cambridge

Hoodsquare: Modeling and Recommending Neighborhoods in Location-based Social Networks. Amy X. Zhang, Anastasios Noulas, Salvatore Scellato, and Cecilia Mascolo. SocialCom '13.
[pdf] [bibtex] [slides and talk] [website]


Visual Analytics of Media Frames

There is interest in trying to identify new frames around issues, and to compare how types of frames vary across different news outlets, or over time. We consider these analytic needs in the context of two use-cases relating to news producers and news consumers, and describe the initial design of a visual analytics tool, the LingoScope, in terms of how it supports these use-cases.

Visual Analytics of Media Frames in Online News and Blogs. Nick Diakopoulos, Amy X. Zhang, and Andrew Salway. TextVis Workshop @ InfoVis '13.
[pdf] [bibtex]



Diurnal Urban Routines on Twitter

We study and characterize diurnal patterns in social media data for different urban areas, with the goal of providing context and framing for reasoning about such patterns at different scales. Using one of the largest datasets to date of Twitter content associated with different locations, we examine within-day variability and across-day variability of diurnal keyword patterns for different locations.

On the Study of Diurnal Urban Routines on Twitter. Mor Naaman, Amy X. Zhang, Samuel Brody, and Gilad Lotan. ICWSM '12.
[pdf] [bibtex]

Contact Info

Cambridge, MA 02139

Email: axz@mit.edu
Twitter: @amyxzh
Github: amyxzhang
Eyebrowse: amyxzhang

Latest News

May 2017: Summer internship at Microsoft Research with Justin Cranshaw and Andres Monroy Hernandez.

May 2016: Summer internship at Google Research with Praveen Paritosh.

Mar 2016: Recipient of a Google Research PhD Fellowship!

May 2015: Summer internship at Google Research with Jilin Chen and Ed Chi.

Mar 2015: Received CHI Best Paper Honorable Mention Award for paper on Twitter and policy change.

Jun 2014: Starting internship at MSR Redmond with Scott Counts.

Apr 2014: Recipient of the NSF Graduate Research Fellowship.

Feb 2014: Officially started PhD program at MIT!

Upcoming Travel

April 2018: Perugia, Italy, International Journalism Festival
April 2018: Montreal, Canada, CHI

Latest Blog Posts

Considering End Users in the Design of News Credibility Annotations - June 12, 2017
Year in Reviews - Dec 30, 2016
Mailing Lists: Why Are They Still Here, What's Wrong With Them, and How Can We Fix Them? - May 11, 2015
Thoughts and Reflections from a 1st Time Reviewer - March 19, 2014
What is a Neighborhood? - March 11, 2012


This site built by © Amy X. Zhang, github code here. Last updated May 2016.