I am a Ph.D. student in Computer Science at MIT CSAIL. I am a member of the Haystack Group at MIT and a 2018-19 Fellow at the Berkman Klein Center at Harvard.

I work on designing and building systems to improve discourse, collaboration, and understanding on the web, with applications to social media, news, political discourse, education, and civic engagement. I also conduct research on computational analysis of large-scale social data to understand topics in the social sciences. My general interests are social computing, CSCW, HCI, and computational social science.

Prior to MIT, I worked as a software engineer, completed a Master's in CS as a Gates Scholar at Cambridge, and earned a Bachelor's in CS from Rutgers.

My research is supported by a Google Fellowship and an NSF Graduate Research Fellowship.



Making Sense of Group Chat through Collaborative Tagging and Summarization

While group chat is becoming increasingly popular for team collaboration, these systems generate long streams of unstructured back-and-forth discussion that are difficult to comprehend. In this work, we investigate ways to enrich the representation of chat conversations, using techniques such as tagging and summarization, to enable users to better make sense of chat. Through needfinding interviews with 15 active group chat users, who were shown mock-up alternative chat designs, we found the importance of structured representations, including signals such as discourse acts. We then developed Tilda, a prototype system that enables people to collaboratively enrich their chat conversation while conversing. From lab evaluations, we examined the ease of marking up chat using Tilda as well as the effectiveness of Tilda-enabled summaries for getting an overview. From a field deployment, we found that teams actively engaged with Tilda both for marking up their chat as well as catching up on chat.

Making Sense of Group Chat through Collaborative Tagging and Summarization
Amy X. Zhang, Justin Cranshaw. CSCW '18.
Best Paper Award.


Deliberation and Resolution on Wikipedia: A Case Study of Requests for Comments

Resolving disputes in a timely manner is crucial for any online production group. We present an analysis of Requests for Comments (RfCs), one of the main vehicles on Wikipedia for formally resolving a policy or content dispute. We collected an exhaustive dataset of 7,316 RfCs on English Wikipedia over the course of 7 years and conducted a qualitative and quantitative analysis into what issues affect the RfC process. Our analysis was informed by 10 interviews with frequent RfC closers. We found that a major issue affecting the RfC process is the prevalence of RfCs that could have benefited from formal closure but that linger indefinitely without one, with factors including participants' interest and expertise impacting the likelihood of resolution. From these findings, we developed a model that predicts whether an RfC will go stale with 75.3% accuracy, a level that is approached as early as one week after dispute initiation.

Presentation at Wikimedia Research Showcase [Youtube video]

Deliberation and Resolution on Wikipedia: A Case Study of Requests for Comments
Jane Im, Amy X. Zhang, Christopher J. Schilling, David Karger. CSCW '18.


Squadbox: A Tool to Combat Email Harassment Using Friendsourced Moderation

Communication platforms have struggled to provide effective tools for people facing harassment online. We conducted interviews with 18 recipients of online harassment to understand their strategies for coping, finding that they often resorted to asking friends for help. Inspired by these findings, we explore the feasibility of friendsourced moderation as a technique for combating online harassment. We present Squadbox, a tool to help recipients of email harassment coordinate a "squad" of friend moderators to shield and support them during attacks. Friend moderators intercept email from strangers and can reject, organize, and redirect emails, as well as collaborate on filters. Squadbox is designed to let its users implement highly customized workflows, as we found in interviews that harassment and preferences for mitigating it vary widely.

New software by MIT, dubbed 'Squadbox,' hopes to combat cyberbullying by ABC News
MIT researchers created a new tool that lets your 'squad' combat online harassment by Business Insider
A New Tool For Fighting Online Abuse Lets You Recruit Friends For Help by Refinery29
MIT has a new tool to combat online harassment: your friends by The Verge
MIT Tool Lets Your Friends Help You Fight Email Harassment by PCMag
How you and your friends can fight back against online trolls by New Scientist
This MIT Tool Enlists Your Squad To Stop Toxic Internet Harassers by Fast Co.Design
With Squadbox, friends moderate harassing messages in your email by Engadget
Recruit Your Friends to Stop Online Harassment by LifeHacker
MIT creates tool to help curb cyber bullying by Channel 7 News Boston
MIT researchers have developed a tool to fight cyberbullying by The Daily Dot
MIT researchers aim to tackle cyberbullying with Squadbox - this is how it works by Qrius
Can 'friendsourcing' save us from online harassment? by The National Student
An Open Tool to Fight Harassment - Squadbox | An Open Project Spotlight by Mozilla Open Leaders

Squadbox: A Tool to Combat Email Harassment Using Friendsourced Moderation. Kaitlin Mahar, Amy X. Zhang, David Karger. CHI '18.
[pdf] [bibtex] [website] [github] [blog] [slides]

Squadbox: A Tool to Combat Online Harassment Using Friendsourced Moderation. Kaitlin Mahar, Amy X. Zhang, David Karger.
Demo Paper CHI '18.
[demo pdf]


A Structured Response to Misinformation: Defining and Annotating Credibility Indicators in News Articles

The proliferation of misinformation in online news and its amplification by platforms are a growing concern, leading to numerous efforts to improve the detection of and response to misinformation. Given the variety of approaches, collective agreement on the indicators that signify credible content could allow for greater collaboration and data-sharing across initiatives. In this paper, we present an initial set of indicators for article credibility defined by a diverse coalition of experts. These indicators originate from both within an article's text as well as from external sources or article metadata. As a proof-of-concept, we present a dataset of 40 articles of varying credibility annotated with our indicators by 6 trained annotators using specialized platforms. We discuss future steps including expanding annotation, broadening the set of indicators, and considering their use by platforms and the public, towards the development of interoperable standards for content credibility.

Creating a Circle of Trust by Gates Cambridge
These academics are on the frontlines of fake news research by Poynter
Elevating quality journalism on the open web by Google News Initiative
How The Credibility Coalition Determines Trust Indicators by news:rewired
The Credibility Coalition is working to establish the common elements of trustworthy articles by journalism.co.uk

A Structured Response to Misinformation: Defining and Annotating Credibility Indicators in News Articles. Amy X. Zhang, Aditya Ranganathan, Sarah Emlen Metz, Scott Appling, Connie Moon Sehat, Norman Gilmore, Nick B. Adams, Emmanuel Vincent, Jennifer 8. Lee, Martin Robbins, Ed Bice, Sandro Hawke, David Karger, and An Xiao Mina. WWW '18 Companion.
[pdf] [bibtex] [website]


Evaluation and Refinement of Clustered Search Results with the Crowd

When searching on the web or in an app, results are often returned as lists of hundreds to thousands of items, making it difficult for users to understand or navigate the space of results. Research has demonstrated that using clustering to partition search results into coherent, topical clusters can aid in both exploration and discovery. In this work, we investigate using crowd-based human evaluation to inspect, evaluate, and improve clusters to create high-quality clustered search results at scale. We introduce a workflow that begins by using a collection of well-known clustering algorithms to produce a set of clustered search results for a given query. Then, we use crowd workers to holistically assess the quality of each clustered search result in order to find the best one. Finally, the workflow has the crowd spot and fix problems in the best result in order to produce a final output. We evaluate this workflow on 120 top search queries from the Google Play Store, some of which have clustered search results as a result of evaluations and refinements by experts.

Evaluation and Refinement of Clustered Search Results with the Crowd.
Amy X. Zhang, Wei Chai, Jinjun Xu, Lichan Hong, Ed Chi. ACM Transactions on Interactive Intelligent Systems: Special Issue on Human-Centered Machine Learning. 8, 2, Article 14 (June 2018), 28 pages.
[pdf] [bibtex]



Characterizing Online Discussion Using Coarse Discourse Sequences

We present a novel method for classifying comments in online discussions into a set of coarse discourse acts towards the goal of better understanding discussions at scale. We collect and release a corpus of over 9,000 threads comprising over 100,000 comments manually annotated via paid crowdsourcing with discourse acts and randomly sampled from the site Reddit. Using our corpus, we demonstrate how the analysis of discourse acts can characterize different types of discussions, including discourse sequences such as Q&A pairs and chains of disagreement, as well as different communities. Finally, we conduct experiments to predict discourse acts using our corpus, finding that structured prediction models such as conditional random fields can achieve an F1 score of 75%.

Coarse Discourse: A Dataset for Understanding Online Discussions by Google Research Blog

Characterizing Online Discussion Using Coarse Discourse Sequences. Amy X. Zhang, Bryan Culbertson, Praveen Paritosh. ICWSM '17.
[pdf] [slides and talk] [bibtex] [github]


Using Student Annotated Hashtags and Emojis to Collect Nuanced Affective States

Determining affective states such as confusion from students' participation in online discussion forums can be useful for instructors of a large classroom. In this work, we harness affordances prevalent in social media to allow students to self-annotate their discussion posts with a set of hashtags and emojis, a process that is fast and cheap. From a dataset of over 25,000 discussion posts from two courses containing self-annotated posts by students, we demonstrate how we can identify linguistic differences between posts expressing confusion versus curiosity, achieving 83% accuracy at distinguishing between the two affective states.

Using Student Annotated Hashtags and Emojis to Collect Nuanced Affective States. Amy X. Zhang, Michele Igo, Marc Facciotti, David Karger. Learning@Scale '17. Poster paper.
[pdf] [bibtex]


Wikum: Bridging Discussion Forums and Wikis using Recursive Summarization

Large-scale discussions between many participants abound on the internet today, on topics ranging from political arguments to group coordination. But as these discussions grow to tens of thousands of posts, they become ever more difficult for a reader to digest. In this article, we describe a workflow called recursive summarization, implemented in our Wikum prototype, that enables a large population of readers or editors to work in small doses to refine out the main points of the discussion. More than just a single summary, our workflow produces a summary tree that enables a reader to explore distinct subtopics at multiple levels of detail based on their interests.

Cutting down the clutter in online conversations by MIT News
Presentation at Wikimedia Research Showcase [Youtube video]

Wikum: Bridging Discussion Forums and Wikis using Recursive Summarization. Amy X. Zhang, Lea Verou, David Karger. CSCW '17.
[pdf] [slides and talk] [bibtex] [website]



Mavo: Creating Interactive Data-Driven Web Applications by Authoring HTML

Many people can author static web pages with HTML and CSS but find it hard or impossible to program persistent, interactive web applications. We show that for a broad class of CRUD (Create, Read, Update, Delete) applications, this gap can be bridged. Mavo extends the declarative syntax of HTML to describe Web applications that manage, store and transform data. Using Mavo, authors with basic HTML knowledge define complex data schemas implicitly as they design their HTML layout. They need only add a few attributes and expressions to their HTML elements to transform their static design into a persistent, data-driven web application whose data can be edited by direct manipulation of the content in the browser.

Introducing Mavo: Create Web Apps Entirely By Writing HTML! by Lea Verou - Smashing Magazine

Mavo: Creating Interactive Data-Driven Web Applications by Authoring HTML. Lea Verou, Amy X. Zhang, David Karger. UIST '16.
[pdf] [website]


Exploring Social Browsing

While the web contains many social websites, people are generally left in the dark about the activities of other people traversing the web as a whole. We explore the potential benefits and privacy considerations around generating a real-time, publicly accessible stream of web activity where users can publish chosen parts of their web browsing data. We also develop a new social media system for collecting, sharing, and visualizing aspects of one's browsing history.

Browsing in public by MIT News
MIT's Eyebrowse To Rank and Review Internet Sites, While Retaining Privacy by Slashdot
MIT proposes 'Eyebrowse' scheme to rank and review the entire internet by The Stack
Eyebrowse project lets users make web browsing history public and the accompanying radio segment by CBC News
Eyebrowse Aims to Socialize Your Web Surfing Experience by Inverse
MIT's Eyebrowse lets users make their browsing history public by ComputerWorld
System lets web users share aspects of their browsing history with friends, researchers by Phys.org

Opportunities and Challenges Around a Tool for Social and Public Web Activity Tracking. Amy X. Zhang, Joshua Blum, David Karger. CSCW '16.
[pdf] [slides and talk] [website] [github]

Eyebrowse: Selective and Public Web Activity Sharing. Amy X. Zhang, Joshua Blum, David Karger. Demo Paper. CSCW '16.
[demo pdf]

Reimagining Web Activity Tracking for Social Applications. Amy X. Zhang, Joshua Blum, David Karger. Workshop Paper. Everyday Surveillance Workshop @ CHI '16.
[workshop pdf] [Workshop website]


Gender and Ideology in the Spread of Anti-Abortion Policy

In the past few years, an unprecedented wave of anti-abortion policies was introduced and enacted in state governments in the U.S., affecting millions of constituents. We study this rapid spread of policy change as a function of the underlying ideology of constituents. We examine over 200,000 public messages posted on Twitter surrounding abortion in the year 2013, a year that saw 82 new anti-abortion policies enacted. From these posts, we characterize people's expressions of opinion on abortion and show how these expressions align with policy change on these issues.

Gender and Ideology in the Spread of Anti-Abortion Policy. Amy X. Zhang, Scott Counts. CHI '16.
[pdf] [slides and talk] [bibtex]


Conference Recommendation and Meetups

Confer is a tool for conference schedule organization and session/paper recommendation. Using its interface and data, we are exploring ways to facilitate meetings, particularly between new and established members of research communities. We piloted a meetup session at CSCW '15 called "Confer Coffee", which grouped people who liked similar papers on Confer and had them gather in person at the conference. We are interested in piloting more sessions at future conferences that use Confer.

Confer: A Conference Recommendation and Meetup Tool. Amy X. Zhang, David Karger, Anant Bhardwaj. Demo Paper. CSCW '16.
[demo pdf] [website] [github]



Reimagining the Mailing List

Mailing lists have existed since the early days of email and are still widely used today, even as more sophisticated online forums and social media websites proliferate. We explore why members prefer mailing lists to other group communication tools. But we also identify several tensions around mailing list usage that appear to contribute to dissatisfaction with them.

One way to reduce email stress: Re-invent the mailing list by MIT News
Hacker News

Mailing Lists: Why Are They Still Here, What's Wrong With Them, and How Can We Fix Them? Amy X. Zhang, Mark Ackerman, David Karger. CHI '15.
[pdf] [bibtex] [slides and talk] [video] [website] [github]


Modeling Ideology and Predicting Policy with Social Media

We study the many voices discussing an issue within a constituency and how they reflect ideology and may signal the outcome of important policy decisions. Focusing on the issue of same-sex marriage legalization, we examine almost 2 million public Twitter posts related to same-sex marriage in the U.S. states over the course of 4 years starting from 2011.

Modeling Ideology and Predicting Policy Change with Social Media: Case of Same-Sex Marriage. Amy X. Zhang, Scott Counts. CHI '15.
Best of CHI Honorable Mention.
[pdf] [bibtex] [slides and talk] [video]


Visualizing Text Corpora to Compare Media Frames

We develop a visualization technique and visual analytic system that enables the study of media frames across text corpora. In particular our system allows scholars or other analysts to compare media frames in a visualization called the Compare Cloud, which explicitly maps word prevalence and context information between two corpora.

Compare Clouds: Visualizing Text Corpora to Compare Media Frames. Nick Diakopoulos, Dag Elgesem, Andrew Salway, Amy X. Zhang, Knut Hofland. TextVis Workshop @ IUI '15.
[pdf] [bibtex]


Spreadsheet-backed Web Apps

We present a system for creating basic web applications using spreadsheets in place of a server and using HTML to describe the client UI. Authors connect the two by placing spreadsheet references inside HTML attributes. Data computation is provided by spreadsheet formulas. The result is a reactive read-write-compute web page without a single line of JavaScript code.

Cloudstitch: Beautiful Apps without the Programming Hassle by Rough Draft Ventures
Now a startup called Cloudstitch at http://www.cloudstitch.com

Spreadsheet Driven Web Applications. Edward Benson, Amy X. Zhang, David Karger. UIST '14.
[pdf] [bibtex] [video]

Kinetic Scrolling on Mobile and Tablets

To support navigation of long documents on touchscreen devices, we introduce content-aware kinetic scrolling, a novel scrolling technique that dynamically applies pseudo-haptic feedback in the form of friction around points of high interest within the page. This allows users to quickly find interesting content while exploring without further cluttering the limited visual space.

Content-Aware Kinetic Scrolling for Supporting Web Page Navigation. Juho Kim, Amy X. Zhang, Jihee Kim, Robert Miller, Krzysztof Gajos. UIST '14.
[pdf] [bibtex] [video]

Moral Framing in Climate Change Blog Discourse

In this work we develop a novel operationalization of moral evaluation frames and study their use within a corpus of blogs discussing climate change. We develop a text visualization tool called Lingoscope that allows the user to observe and filter the contextual terms that convey moral framing across large volumes of text, as well as to drill down to specific examples.

Identifying and Analyzing Moral Evaluation Frames in Climate Change Blog Discourse. Nick Diakopoulos, Amy X. Zhang, Dag Elgesem, Andrew Salway. ICWSM '14.
[pdf] [bibtex] [poster]

Controversy and Sentiment in Online News

In this work, we take a data-driven approach to understand how controversy interplays with emotional expression and biased language in the news. We begin by introducing a new dataset of controversial and noncontroversial terms collected using crowdsourcing. Then, focusing on 15 major U.S. news outlets, we compare millions of articles discussing controversial and non-controversial issues over a span of 7 months.

Why Media Bias Has Nowhere to Run and Hide from Data Science by CrowdFlower

Controversy and Sentiment in Online News. Yelena Mejova, Amy X. Zhang, Nicholas Diakopoulos, Carlos Castillo. Computation + Journalism '14.
[pdf] [bibtex]



Hoodsquare: Defining Neighborhoods

Information garnered from activity on location-based social networks can be harnessed to characterize urban spaces and organize them into neighborhoods. We adopt a data-driven approach to the identification and modeling of urban neighborhoods using location-based social networks.

Good Neighbours by Gates Cambridge

Hoodsquare: Modeling and Recommending Neighborhoods in Location-based Social Networks. Amy X. Zhang, Anastasios Noulas, Salvatore Scellato, and Cecilia Mascolo. SocialCom '13.
[pdf] [bibtex] [slides and talk] [website]


Visual Analytics of Media Frames

There is interest in trying to identify new frames around issues, and to compare how types of frames vary across different news outlets, or over time. We consider these analytic needs in the context of two use-cases relating to news producers and news consumers, and describe the initial design of a visual analytics tool, the LingoScope, in terms of how it supports these use-cases.

Visual Analytics of Media Frames in Online News and Blogs. Nick Diakopoulos, Amy X. Zhang, and Andrew Salway. TextVis Workshop @ InfoVis '13.
[pdf] [bibtex]



Diurnal Urban Routines on Twitter

We study and characterize diurnal patterns in social media data for different urban areas, with the goal of providing context and framing for reasoning about such patterns at different scales. Using one of the largest datasets to date of Twitter content associated with different locations, we examine within-day variability and across-day variability of diurnal keyword patterns for different locations.

On the Study of Diurnal Urban Routines on Twitter. Mor Naaman, Amy X. Zhang, Samuel Brody, and Gilad Lotan. ICWSM '12.
[pdf] [bibtex]

Contact Info

Cambridge, MA 02139

Email: axz@mit.edu
Twitter: @amyxzh
Github: amyxzhang
Eyebrowse: amyxzhang

Latest News

Oct 2018: Received CSCW Best Paper Award for paper on making sense of group chat.

May 2017: Summer internship at Microsoft Research with Justin Cranshaw and Andres Monroy Hernandez.

May 2016: Summer internship at Google Research with Praveen Paritosh.

Mar 2016: Recipient of a Google Research PhD Fellowship!

May 2015: Summer internship at Google Research with Jilin Chen and Ed Chi.

Mar 2015: Received CHI Best Paper Honorable Mention Award for paper on Twitter and policy change.

Jun 2014: Starting internship at MSR Redmond with Scott Counts.

Apr 2014: Recipient of the NSF Graduate Research Fellowship.

Feb 2014: Officially started PhD program at MIT!

Upcoming Travel

Oct 2018: Pittsburgh - CMU
Oct 2018: MIT - EECS Rising Stars Symposium
Nov 2018: NYC - CSCW

Latest Blog Posts

4 Things We Learned from Talking to People who Face Harassment: Research behind Squadbox - April 17, 2018
Considering End Users in the Design of News Credibility Annotations - June 12, 2017
Year in Reviews - Dec 30, 2016
Mailing Lists: Why Are They Still Here, What's Wrong With Them, and How Can We Fix Them? - May 11, 2015
Thoughts and Reflections from a 1st Time Reviewer - March 19, 2014
What is a Neighborhood? - March 11, 2012


This site built by © Amy X. Zhang, github code here. Last updated May 2016.