Monday, January 16, 2017

ICSE 2017 Most Influential Paper, and supporting your research tools

My ICSE 2007 paper that describes the Randoop test generation tool, Feedback-directed random test generation (coauthored with Carlos Pacheco, Shuvendu K. Lahiri, and Thomas Ball), has won the ICSE 2017 Most Influential Paper award.  This is a test-of-time award given for "the paper from the ICSE meeting 10 years prior that is judged to have had the most influence on the theory or practice of software engineering during the 10 years since its original publication."

Randoop generates tests for programs written in object-oriented languages; currently, Java and .NET versions exist. Automated test generation is a practically important research topic: in 2002, NIST reported, "the national [US] annual costs of an inadequate infrastructure for software testing is estimated to range from $22.2 to $59.5 billion."

Prior to this research, a typical test generation technique would generate many tests and then try to determine which ones were of value. For example, an error-revealing test is one that makes legal calls that yield incorrect results.  Without a formal specification, it is difficult to know whether a given call is legal and whether its outcome is desired.  Furthermore, multiple generated tests might fail for the same reason.

The technique of feedback-directed test generation generates one test at a time, executes the test, and classifies it as (probably) a normal execution, a failure, or an illegal input.  Based on this information, it biases the subsequent generation process to extend good tests and avoid bad tests.
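
To make the loop concrete, here is a minimal, hypothetical sketch of feedback-directed generation in Java.  It is not Randoop's actual code: the subject class (java.util.ArrayList), the tiny set of candidate calls, and the classification rules are simplifications of my own, chosen only to show how the outcome of each executed test feeds back into what is generated next.

    import java.util.ArrayList;
    import java.util.List;
    import java.util.Random;
    import java.util.function.Consumer;

    // Hypothetical sketch of feedback-directed test generation; not Randoop's actual code.
    // A "test" is a sequence of calls on a java.util.ArrayList<Integer>.  Each new test
    // extends a previously generated, well-behaved test by one randomly chosen call, is
    // executed immediately, and is classified as normal, illegal-input, or failing.
    public class FeedbackDirectedSketch {

        // One candidate operation on the object under test.
        interface Op { String name(); void apply(List<Integer> subject); }

        static Op op(String name, Consumer<List<Integer>> body) {
            return new Op() {
                public String name() { return name; }
                public void apply(List<Integer> subject) { body.accept(subject); }
            };
        }

        static final Op[] OPS = {
            op("add(7)",    s -> s.add(7)),
            op("clear()",   s -> s.clear()),
            op("remove(0)", s -> s.remove(0)),   // illegal when the list is empty
            op("get(3)",    s -> s.get(3)),      // illegal unless the list has 4+ elements
        };

        public static void main(String[] args) {
            Random random = new Random(0);
            // The pool holds only sequences that executed normally; they seed future tests.
            List<List<Op>> pool = new ArrayList<>();
            pool.add(new ArrayList<>());          // start from the empty sequence

            for (int i = 0; i < 50; i++) {
                // Step 1: extend a known-good sequence by one randomly chosen call.
                List<Op> candidate = new ArrayList<>(pool.get(random.nextInt(pool.size())));
                Op lastCall = OPS[random.nextInt(OPS.length)];
                candidate.add(lastCall);

                // Step 2: execute the candidate and classify the outcome.
                List<Integer> subject = new ArrayList<>();
                String verdict = "normal";
                try {
                    for (Op op : candidate) op.apply(subject);
                } catch (IndexOutOfBoundsException e) {
                    verdict = "illegal input";     // violated an implicit precondition: discard
                } catch (RuntimeException e) {
                    verdict = "possible failure";  // unexpected exception: report to the user
                }

                // Step 3 (the feedback): only normally executing sequences rejoin the pool,
                // so future tests are built from sequences already known to behave well.
                if (verdict.equals("normal")) pool.add(candidate);
                System.out.printf("test %2d (%d calls, ends with %s): %s%n",
                                  i, candidate.size(), lastCall.name(), verdict);
            }
        }
    }

The key point is the last step: sequences classified as illegal or failing are never extended, so the generation effort concentrates on legal, well-behaved call sequences.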

Feedback-directed test generation was first introduced in my ECOOP 2005 paper Eclat: Automatic generation and classification of test inputs (coauthored with Carlos Pacheco). The ICSE 2007 Randoop paper expands on the technique, contributes a more usable tool, and reports on more extensive experiments.

Before and since, many other test generation strategies have been proposed, including ones with desirable theoretical properties such as the ability to cover difficult-to-execute paths.  Some of these have seen commercial success.  Thanks to its scalability and simplicity, Randoop remains the standard benchmark against which other test generation tools are measured.  (Randoop is easy to use out of the box, but as with any tool, tuning the parameters described in its manual greatly improves its output; that is what a conscientious user or researcher does.)  It's great to see subsequent research surpassing Randoop.  For example, the GRT tool adds a few clever techniques to Randoop and outperforms all other test generation tools.

One reason the Randoop paper had impact was its innovative approach to test generation, which has since been adopted by other tools.  An equally important reason is a decade of improvements, bug fixes, and other support.  After my student Carlos Pacheco graduated and moved on to other interests, my research group and I have continued to maintain Randoop for its community of industrial developers and academics.  This enables them to use Randoop to find bugs and generate regression tests, or to extend Randoop to support their own research.  Together, the main Randoop paper and the tool have been cited 761 times in academic papers.

At ICSE 2007, I was awarded an ACM Distinguished Paper Award (a "best paper" award).  But the award wasn't for the Randoop paper!  Don't give up if others do not yet appreciate your work.  Time can change people's opinions about what research results are most important.  At the time, the program committee was more impressed with my paper "Refactoring for parameterizing Java classes" (coauthored with Adam Kieżun, Frank Tip, and Robert M. Fuhrer).  This paper shows how to infer Java generics, such as converting a Java program from declaring and using List to declaring and using List<T>.  The paper is notable because it solves both the parameterization problem and the instantiation problem: that is, it infers what type parameters a generic class should have, and it also changes clients to provide type arguments when using the class.
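
As a hypothetical illustration (my own example, not one taken from the paper), consider a raw container class: the parameterization problem is deciding what type parameter the class itself should get, and the instantiation problem is rewriting each client to supply a type argument.

    // Before refactoring: a raw container and a client that must cast.
    class Cell {
        private Object value;
        void set(Object v) { value = v; }
        Object get() { return value; }
    }

    class Client {
        String read(Cell c) {
            c.set("hello");
            return (String) c.get();   // unchecked cast, verified only at run time
        }
    }

    // After refactoring: parameterization adds the type parameter T to Cell,
    // and instantiation rewrites the client to use Cell<String>.
    class Cell<T> {
        private T value;
        void set(T v) { value = v; }
        T get() { return value; }
    }

    class Client {
        String read(Cell<String> c) {
            c.set("hello");
            return c.get();            // no cast needed; checked at compile time
        }
    }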

The ICSE 2017 MIP award is my second test-of-time award.  I also received the 2013 ACM SIGSOFT Impact Paper Award, which is awarded to "a highly influential paper presented at a SIGSOFT-sponsored or co-sponsored conference held at least 10 years prior".  The award was for my ICSE 1999 paper Dynamically discovering likely program invariants to support program evolution (coauthored with Jake Cockrell, William G. Griswold, and David Notkin).  The paper describes a machine-learning technique that observes program executions and outputs likely specifications.  This was the first paper that described the Daikon invariant detector.
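
As a hypothetical illustration of the idea (the class and the reported properties below are my own invention, not output from the paper), such a tool watches many executions of a class and reports properties that held on all of them, as sketched in the comments here:

    // A small class under observation.  The comments show the kinds of likely
    // invariants a dynamic invariant detector might report after observing many
    // executions; they are summaries of observed behavior, not proven facts.
    class Account {
        private int balance;          // likely object invariant:  this.balance >= 0

        int deposit(int amount) {     // likely precondition:      amount > 0
            balance += amount;
            return balance;           // likely postconditions:    return == orig(this.balance) + amount
        }                             //                           this.balance >= orig(this.balance)
    }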

As with Randoop, a reason for Daikon's influence is that my research group and I maintained it for over a decade, supporting a community of users.  I did so against the advice of senior faculty who told me to abandon it and focus on new publications.

I recommend my approach to other researchers who care about impact.  Make your research tools public!  Support them after you release them!  Doing so is a requirement of the scientific method, which depends on replicating and extending prior research.  If you are not willing to do this, others are right to suspect your results.  Another important reason is that maintenance work helps others in the scientific community do their own work.  If you need a selfish reason, it greatly increases your influence, as illustrated by these awards.
