In our quest to invent and improve automation, computer scientists must not relinquish humanity's ability to program computers. Our reaction to a complex world should be to create transparency.
As a PhD student in Antonio Torralba's lab, I develop techniques that increase interpretability of deep neural networks. I have found evidence that the representations learned by deep networks can have a simple underlying structure. The focus of my research at MIT is therefore to find ways to enhance and exploit this structure to make deep networks more transparent.
I also work on the wider human-computer interaction problem of programmability. I develop novice programming interfaces, lessons, standards, and libraries that make systems more programmable.
Network Dissection is a technique for quantifying and automatically estimating the human interpretability (and interpretation) of units within any deep neural network for vision. Building upon a surprising 2014 finding by Bolei Zhou, network dissection defines a dictionary of 1197 human-labeled visual concepts, each represented as a segmentation problem, and then estimates interpretability by evaluating each hidden convolutional unit as a solution to those problems. I have used network dissection to reveal that representation space is not isotropic: learned representations have an unusually high agreement with human-labeled concepts that vanishes under a change in basis. We will be giving an oral presentation about the technique and the insights it provides at CVPR 2017. D Bau, B Zhou, A Khosla, A Oliva, and A Torralba. Network Dissection: Quantifying the Interpretability of Deep Visual Representations. In Computer Vision and Pattern Recognition 2017.
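The idea of scoring a unit as a solution to a segmentation problem can be illustrated with a minimal sketch. This is not the published implementation: it assumes the unit's activation maps have already been upsampled to the resolution of the concept masks, that a unit is binarized by thresholding at a high activation quantile, and that agreement is measured by intersection over union (IoU). The function names, the `quantile` parameter, and the toy data are hypothetical.

```python
import numpy as np

def unit_concept_iou(activations, concept_masks, quantile=0.995):
    """Score one unit against one concept (illustrative sketch).

    activations:   float array (N, H, W), one activation map per image,
                   assumed already resized to the mask resolution.
    concept_masks: bool array (N, H, W), human-labeled segmentation.
    quantile:      assumed cutoff used to binarize the unit.
    """
    # One threshold per unit, computed over the whole dataset.
    threshold = np.quantile(activations, quantile)
    unit_masks = activations > threshold
    intersection = np.logical_and(unit_masks, concept_masks).sum()
    union = np.logical_or(unit_masks, concept_masks).sum()
    return float(intersection) / float(union) if union else 0.0

def best_concept(activations, concept_dict, quantile=0.995):
    """Return the (name, IoU) of the concept that best matches the unit."""
    scores = {
        name: unit_concept_iou(activations, masks, quantile)
        for name, masks in concept_dict.items()
    }
    name = max(scores, key=scores.get)
    return name, scores[name]

# Toy data: one 10x10 "image" whose unit fires on the top half.
acts = np.zeros((1, 10, 10))
acts[0, :5, :] = 1.0
top = np.zeros((1, 10, 10), dtype=bool)
top[0, :5, :] = True
left = np.zeros((1, 10, 10), dtype=bool)
left[0, :, :5] = True
label, score = best_concept(acts, {"top": top, "left": left}, quantile=0.5)
```

In this toy case the unit matches the hypothetical "top" concept exactly and overlaps "left" only partially. Repeating the same scoring after rotating the representation into a different basis is one way to probe the claim that the agreement depends on the basis.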
Blocks and Beyond is a workshop I helped organize to bring together researchers who are investigating block-based interfaces to simplify programming for novices and casual programmers. The workshop was oversubscribed, and the presented work was interesting for both its breadth and depth. Afterwards, we wrote a review paper to survey the history, foundations, and state of the art in the field. The review appears in the June 2017 Communications of the ACM; also see the video overview. D Bau, J Gray, C Kelleher, J Sheldon, F Turbak. Learnable Programming: Blocks and Beyond. Communications of the ACM, pp. 72-80. June 2017.
Pencil Code is an open-source project that bridges the gap between novice programming idioms and professional languages. Developed together with my son and with the generous support of Google, this system provides a blocks-based editing environment with turtle graphics on a canvas that smoothly transitions to text-based editing of web applications using jQuery. Two thousand students use the system each day. A study of middle-school students using the environment suggests that the block-and-text transitions are an aid to learning. D Bau, D A Bau, M Dawson, C S Pickens. Pencil code: block code for a text world. In Proceedings of the 14th International Conference on Interaction Design and Children, pp. 445-448. ACM, 2015.
Google Image Search is the world's largest searchable index of images. I contributed several improvements to this product, including improved ranking for recent images, a clustered browsing interface for discovering images using related searches, a rollout of new serving infrastructure to support a long-scrolling result page serving one thousand image results at a time, and improvements in the understanding of person entities on the web. M Zhao, J Yagnik, H Adam, D Bau. Large scale learning and recognition of faces in web videos. In Automatic Face & Gesture Recognition, 2008. FG'08. 8th IEEE International Conference on (pp. 1-7). IEEE, September 2008.
Google Talk is a web-based chat solution that was built into Gmail. I led the team that created Google Talk in an (ultimately unsuccessful) attempt to establish a universal, federated, open realtime communication ecosystem for the internet. Our messaging platform provided full-scale support for XMPP and Jingle, which are open standards for federating realtime chat and voice that are analogous to the open-for-all SMTP system for email. When these open protocols came under asymmetric attack by Microsoft (they provided only one-way compatibility), Google relented and reverted to a closed network. To this day, open realtime communication remains an unfulfilled dream for the internet. D Bau. Google Gets to Talking. Google Official Blog, August 2005.
Apache XML Beans is an open-source implementation of the XML Schema specification as a compiler from schema types to Java classes. While no longer widely used, my team's implementation of this standard is still a good example of an important approach that continues to be a key technique for the creation of understandably complex systems: the prioritization of faithful and transparent data representations over simplified but opaque functional encapsulations. D Bau. The Design of XML Beans, davidbau.com, a dabbler's weblog, November 2003.
Microsoft Internet Explorer 4 was the first AJAX web browser. As part of the Trident team led by Adam Bosworth, I helped create the first fully mutable HTML DOM by defining its asynchronous loading model. My contribution was to implement an incremental HTML parser and a fast multithreaded preloader for linked resources that was able to coexist with single-threaded scripts that could change the document. The design of the system resolved tensions between performance, flexibility, and programmability, and contributed to the strength of the modern web platform.
Numerical Linear Algebra is a graduate textbook that I wrote with my advisor Nick Trefethen while at Cornell. The book began as a detailed set of notes that I took while attending Nick's course. The writing is intended to capture the spirit of his teaching: succinct and insightful. The hope is to reveal the elegance of this family of fundamental algorithms and dispel the myth that finite-precision arithmetic means imprecise thinking. L N Trefethen, D Bau. Numerical linear algebra. Vol. 50. SIAM, 1997.