Thesis Defense: Language Technologies for Understanding Law, Politics, and Public Policy

December 15, 2015

This was the announcement for my thesis defense:

Language Technologies for Understanding Law, Politics, and Public Policy

Seminar Series: 2015 Thesis Defense

Speaker: William Li

Speaker Affiliation: MIT CSAIL

Host: Andrew Lo

Date: Tuesday, December 15, 2015

Time: 1:00 PM to 2:30 PM

Location: 32-G882

This thesis focuses on machine learning techniques to uncover patterns and insights from large, text-based government datasets. First, we present an authorship attribution model for unsigned U.S. Supreme Court opinions, offering insights about the authorship of important cases and the dynamics of Supreme Court decision-making. Second, we apply software engineering metrics to analyze the complexity of the United States Code, revealing the structure and evolution of the U.S. Code over the past century. Third, we trace policy trajectories of bills in Congress, making it possible to visualize the contents of four key bills during the Financial Crisis. Finally, this thesis presents a novel model, Probabilistic Text Reuse (PTR), for finding repeated passages of text. Text reuse occurs in legal and political documents because documents present similar ideas, because different versions of documents are often quite similar, or because legitimate reasons for copying text exist. We illustrate the utility of PTR by capturing the structure of a large collection of public comments on the FCC's proposed regulations on net neutrality.

Tags: thesis defense, graduate school
