Leveraging Light-Weight Analyses to Aid Software Maintenance

Fry, Zachary, Computer Science - School of Engineering and Applied Science, University of Virginia
Weimer, Westley, Computer Science, University of Virginia

While software systems have become a fundamental part of modern life, they
require maintenance to continually function properly and to adapt to
potential environment changes. Software maintenance,
a dominant cost in the software lifecycle, includes
both adding new functionality and fixing existing problems, or
"bugs," in a system. Software bugs cost the world's economy
billions of dollars annually in terms of system down-time and the
effort required to fix them.

This dissertation focuses specifically on corrective software
maintenance --- that is, the process of finding and fixing bugs.
Traditionally, managing bugs has been a largely manual process in which
developers treat each defect as a unique maintenance concern. The result
is a slow process and thus a high aggregate cost for finding and fixing bugs.
Previous work has shown that, in practice, bugs are often reported more
rapidly than companies can address them.

Recently, automated techniques have helped to ease the human burden
associated with maintenance activities. However, such techniques
often suffer from a few key drawbacks. This thesis argues that
automated maintenance tools often target narrowly scoped problems
rather than more general ones. Such tools favor maximizing local,
narrow success over wider applicability and potentially greater cost
benefit. Additionally, this dissertation provides evidence that
maintenance tools are traditionally evaluated in terms of functional
correctness, while more practical concerns like ease-of-use and
perceived relevance of results are often overlooked. When calculating
cost savings, some techniques fail to account for the introduction of
new workflow tasks while claiming to reduce the overall human burden.
The work in this dissertation aims to avoid these weaknesses by
providing fully automated, widely applicable techniques
that both reduce the cost of software maintenance and meet
relevant human-centric quality and usability standards.

This dissertation presents software maintenance techniques that reduce
the cost of both finding and fixing bugs, with an emphasis on
comprehensive, human-centric evaluation. The work in this thesis uses
lightweight analyses to leverage latent information inherent in
existing software artifacts. As a result, the associated techniques
are both scalable and widely applicable to existing systems. The
first of these techniques clusters closely related,
automatically generated defect reports to aid in the process of bug
triage and repair. This clustering approach is complemented by an
automatic program repair technique that generates and validates
candidate defect patches by making sweeping optimizations to a
state-of-the-art automatic bug fixing framework. To fully evaluate
these techniques, experiments are performed that show net cost savings
for both the clustering and program repair approaches while also
suggesting that actual human developers both agree with the resulting
defect report clusters and are able to understand and use
automatically generated patches.

The techniques described in this dissertation are designed to address
the three historically lacking properties noted above: generality,
usability, and human-centric efficacy. Notably, both presented
approaches apply to many types of defects and systems, suggesting they
are generally applicable as part of the maintenance process.
With the goal of comprehensive evaluation in mind, this thesis
provides evidence that humans both agree with the results of the
techniques and could feasibly use them in practice. These and other
results show that the techniques are usable, in terms of both
minimizing additional human effort via full automation and
providing understandable maintenance solutions that promote continued
system quality. By evaluating the associated techniques on programs
spanning different languages and domains that contain thousands of bug
reports and millions of lines of code, the results presented in this
dissertation show potential concrete cost savings with respect to
finding and fixing bugs. This work suggests the feasibility of
further automation in software maintenance and thus a greater
reduction in the associated human burdens.

PHD (Doctor of Philosophy)
All rights reserved (no additional license for public reuse)