We have found experimental evidence that reducing the time between the introduction of an error and its discovery by a developer can reduce overall development time. The evidence was collected through high-resolution background monitoring of developer behavior and analyzed using a model that infers the developer's beliefs and intent from their recorded actions. We then use this model to drive simulations of developer behavior and productivity in response to different environments, to analyze the impact of changing the frequency of testing and of prioritizing tests within a suite, and to show that continuous testing promises even greater improvements.
The test suite in use by a developer may contain a test that covers a large fraction of the program code or that takes a long time to run. A continuous testing tool may find it difficult to provide early feedback when such a test regresses, that is, when a change to the program under test causes the test to stop passing and begin failing. To improve this situation, we propose test factoring. Given a large test, test factoring produces one or more smaller tests. Each of these smaller tests is unlikely to fail unless the large test fails, and is likely to regress when the large test regresses due to a particular kind of program change.
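As an illustration only, the relationship between a large test and a factored test can be sketched in Python. The two-stage program (`slow_parse`, `summarize`) and the record-and-replay strategy in `factor_test` are hypothetical choices made for this sketch; the text above does not specify how factored tests are produced. The "particular kind of program change" is assumed here to be an edit to `summarize`.

```python
# Hypothetical two-stage program: an expensive stage feeding a cheap one.
def slow_parse(text):
    """Stand-in for an expensive component (imagine it takes seconds)."""
    return [int(token) for token in text.split()]


def summarize(numbers):
    """The component the developer is assumed to be editing."""
    return {"count": len(numbers), "total": sum(numbers)}


def large_test():
    """Original large test: exercises both stages end to end."""
    return summarize(slow_parse("1 2 3 4")) == {"count": 4, "total": 10}


def factor_test(text):
    """Build a smaller test from one passing run of the large scenario.

    Assumption of this sketch: the values crossing the stage boundary
    are recorded once and replayed, so the smaller test never calls
    slow_parse again.
    """
    recorded_input = slow_parse(text)             # captured once
    recorded_output = summarize(recorded_input)   # known-good result

    def factored_test(summarize_under_test=summarize):
        # Replays the recorded input against the (possibly edited) stage.
        return summarize_under_test(recorded_input) == recorded_output

    return factored_test


small_test = factor_test("1 2 3 4")


def broken_summarize(numbers):
    """Simulated regression: the total is no longer computed."""
    return {"count": len(numbers), "total": 0}
```

In this sketch `small_test()` passes whenever `large_test()` does, and `small_test(broken_summarize)` fails without re-running the expensive stage, which is the early feedback the paragraph describes. A change to `slow_parse` would go unnoticed by `small_test`, which is why each factored test is tied to a particular kind of program change.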