Carlos Pacheco

Computer Science and Artificial Intelligence Laboratory, MIT

Welcome! I graduated in 2009 with a Ph.D. in Computer Science. My research focus was directed random testing.

I am currently a software engineer at Google.

Before joining Google, I spent a year as a strategy consultant at the Boston Consulting Group.

Check out Randoop, an open-source test generation tool for Java and .NET that has revealed errors in widely-used libraries from Sun, Microsoft, and others.


Directed Random Testing
Ph.D. dissertation, 2009
Directed random testing is a new approach to test generation that leverages the strengths of random testing (speed, scalability, effectiveness in revealing errors) while attacking its limitations (a tendency to generate many illegal and redundant tests). It does this by combining a bottom-up generation of tests with runtime guidance. Applied to 14 widely-used libraries, directed random testing quickly reveals many serious, previously unknown errors in the libraries. And compared with other test generation tools (model checking, symbolic execution, and traditional random testing), it reveals more errors and achieves higher code coverage. In an industrial case study, a test team at Microsoft using the technique discovered in fifteen hours of human effort as many errors as they typically discover in a person-year of effort using other testing methods.

Eclat: Automatic generation and classification of test inputs
Masters thesis, 2005
I describe automatic mechanism for selecting tests that are likely to expose errors---tests whose run-time behavior is maximally different from succeeding runs. The thesis also gives techniques for test input generation and for converting a test input into a test case.

Reasoning about TLA Actions
Undergraduate thesis, 2001
I used the ACL2 theorem prover to verify invariants of a distributed algorithm specified in TLA (Temporal Logic of Actions). The thesis describes the translation of TLA specifications into a finite set theory framework in ACL2, as well as the proof of two invariant properties of Disk Synod.

Refereed Publications

Automatically patching errors in deployed software
by Jeff H. Perkins, Sunghun Kim, Sam Larsen, Saman Amarasinghe, Jonathan Bachrach, Michael Carbin, Carlos Pacheco, Frank Sherwood, Stelios Sidiroglou, Greg Sullivan, Weng-Fai Wong, Yoav Zibin, Michael D. Ernst, and Martin Rinard.
In Proceedings of the 21st ACM Symposium on Operating Systems Principles, 2009.
We present ClearView, a system for automatically patching errors in deployed software. ClearView works on stripped Windows x86 binaries without any need for source code, debugging information, or other external information, and without human intervention.

Finding Errors in .NET with Feedback-directed Random Testing
by Carlos Pacheco, Shuvendu K. Lahiri, and Thomas Ball.
in ISSTA 2008: International Symposium on Software Testing and Analysis, 2008.
This paper presents a case study in which a team of test engineers at Microsoft applied a feedback-directed random testing tool to a critical component of the .NET architecture. Feedback-directed random testing tool found errors in the component that eluded previous testing, and did so two orders of magnitude faster than a typical test engineer (including time spent inspecting the results of the tool).

The Daikon system for dynamic detection of likely invariants
by Michael D. Ernst, Jeff H. Perkins, Philip J. Guo, Stephen McCamant, Carlos Pacheco, Matthew S. Tschantz, and Chen Xiao.
in Science of Computer Programming, 2007.
This overview paper discusses the Daikon tool, including its features, applications, architecture, and development process. It also explains Daikon's output and compares it to related tools.

Feedback-directed random test generation
by Carlos Pacheco, Shuvendu K. Lahiri, Michael D. Ernst, and Thomas Ball.
in ICSE '07: Proceedings of the 29th International Conference on Software Engineering, 2007.
This paper presents a technique that improves random test generation by incorporating feedback from executing previously-generated inputs. The technique found serious, previously-unknown errors in widely-deployed applications.

Randoop: Feedback-directed Random Testing for Java
by Carlos Pacheco and with Michael D. Ernst.
in OOPSLA 2007 Companion, Montreal, Canada, 2007.
Randoop implements feedback-directed random test generation. This paper describes the Java version of Randoop.

Finding the needles in the haystack: Generating legal test inputs for object-oriented programs
by Shay Artzi, Michael D. Ernst, Adam Kieun, Carlos Pacheco, and Jeff H. Perkins.
in 1st Workshop on Model-Based Testing and Object-Oriented Systems (M-TOOS), 2006.
An automatically inferred model of legal method call sequences can bias test generation toward legal method sequences, increasing coverage and creating data structures beyond the ability of undirected random generation.

An empirical comparison of automated generation and classification techniques for object-oriented unit testing
by Marcelo d'Amorim, Carlos Pacheco, Darko Marinov, Tao Xie, and Michael D. Ernst.
in ASE 2006: Proceedings of the 21st Annual International Conference on Automated Software Engineering, 2006.
This paper experimentally evaluates four test generation strategies: the four combinations of model-based vs. exception-based, and symbolic vs. random. The model-based symbolic combination is new.

Eclat: Automatic generation and classification of test inputs
by Carlos Pacheco and Michael D. Ernst.
in ECOOP 2005 --- Object-Oriented Programming, 19th European Conference, 2005.
This paper presents an automatic mechanism for selecting tests that are likely to expose errors---tests whose run-time behavior is maximally different from succeeding runs. The paper also gives techniques for test input generation and for converting a test input into a test case.


Automated Testing of Distributed Systems
by Nathan Boy, Jared Casper, Carlos Pacheco, and Amy Williams.
Final project report for MIT 6.824: Distributed Computer Systems, 2004.
We present a technique to test servers that interact with clients using the Sun RPC protocol. The technique requires the user to provide two things: a list of RPC calls for the server being tested, and a set of invariants that are required to hold over the RPC communications trace between a set of clients and the server. The technique works by generating random sequences of RPC calls and checking that the invariants hold over the traces. If an invariant is violated, the violating sequence of RPC calls is reported to the user. We report the results of our testing a block server and a lock server.

Verifying TLA+ Invariants with ACL2
by Carlos Pacheco.
Research report for the Educational Advancement Foundation, 2001.
We describe the use of the ACL2 theorem prover to model and verify properties of TLA+ specifications. We have written a translator whose input is a TLA+ specification along with conjectures and structured proofs of properties of the specification. The translator's output is an ACL2 model of the specification, and a list of ACL2 conjectures corresponding to those sections of the proof outlines flagged for mechanical verification. We have used our tools to translate the Disk Synod algorithm, and to verify two invariants of the algorithm.

(Copyright notice: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder.)

Part of this webpage was created with bibtex2web.