Monitoring Software Quality Evolution for Defects

Hongyu Zhang, Tsinghua University

Sunghun Kim, Hong Kong University of Science and Technology



  1. Abstract
  2. Monitoring Quality Evolution Using the C-Chart
    1. Eclipse Search
    2. Gnome GnuCash
  3. Quality Evolution Patterns
    1. Downward Trend
    2. Upward Trend
    3. Impulse
    4. Hills
    5. Small Variations
    6. Roller Coaster
  4. Conclusion


 author = {Hongyu Zhang and Sunghun Kim},
 title = {Monitoring Software Quality Evolution for Defects},
 journal ={IEEE Software},
 volume = {27},
 issn = {0740-7459},
 year = {2010},
 pages = {58-64},
 doi = {},
 publisher = {IEEE Computer Society},
 address = {Los Alamitos, CA, USA}, 


The file can be downloaded from here.

Abstract

"Software evolution is the dynamic behavior of programming systems as they are maintained and enhanced over their lifetimes."[1]

Project teams should continually monitor and control software to ensure it follows desirable evolution paths.

Lehman's second law of software evolution, Increasing Complexity, states that as software evolves, growing complexity and increasing defects will lower stakeholder satisfaction unless project teams undertake the necessary work to maintain quality.

The authors use the c-chart, a quality control chart widely adopted in statistical process control (SPC) [2], to study the quality evolution of two well-known, large-scale open source software systems: Eclipse and Gnome.

Monitoring Quality Evolution using the C-Chart

Control charts can monitor and detect process changes. The study counted the confirmed component-level defects for each calendar month and plotted the data on a c-chart using the default 3σ control limits. The number of source code changes (the added and deleted LOC for each file) was also counted from the configuration management repositories.

The defect plots in the c-charts show that for constantly maintained and updated systems, quality evolution is complicated rather than monotonic. Furthermore, the processes went through both relatively stable and unstable periods.
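
As an illustration of how such limits are typically obtained, here is a minimal Python sketch. It assumes the standard c-chart formulas (center line equal to the mean count, limits at the mean plus or minus 3 times the square root of the mean, lower limit floored at zero). The function name `c_chart_limits` and the sample counts are hypothetical and are not taken from the paper.

```python
from math import sqrt


def c_chart_limits(counts, k=3.0):
    """Return (center, lcl, ucl) for a c-chart built from defect counts.

    Assumes the usual Poisson-based c-chart: the spread is k * sqrt(mean).
    """
    if not counts:
        raise ValueError("counts must not be empty")
    center = sum(counts) / len(counts)
    spread = k * sqrt(center)
    lcl = max(0.0, center - spread)  # a count cannot be negative
    ucl = center + spread
    return center, lcl, ucl


# Hypothetical monthly defect counts, for illustration only.
monthly_defects = [9, 12, 7, 15, 10, 31, 11, 8]
center, lcl, ucl = c_chart_limits(monthly_defects)

# Indices of months that fall outside the control limits.
out_of_control = [
    i for i, c in enumerate(monthly_defects) if c > ucl or c < lcl
]
```

With these made-up counts, only the sixth month (index 5) lies above the upper limit; all other months stay within the limits.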

Eclipse Search

The number of confirmed defects in the Eclipse search component is plotted from June 2002 to October 2007. The average is 12.21 per month, upper control limit (UCL) is 22.70 and lower control limit (LCL) is 1.73.
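
The reported limits are consistent with the standard c-chart formulas, assuming that is what the study used. A worked check from the reported average:

```latex
\begin{align*}
\bar{c} &= 12.21 \\
\mathrm{UCL} &= \bar{c} + 3\sqrt{\bar{c}} = 12.21 + 3\sqrt{12.21} \approx 12.21 + 10.48 = 22.69 \\
\mathrm{LCL} &= \bar{c} - 3\sqrt{\bar{c}} \approx 12.21 - 10.48 = 1.73
\end{align*}
```

The small difference from the reported UCL of 22.70 is presumably due to rounding of the average.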

The following phases can be observed:

  • From June 2002 to January 2004, relatively stable, with an average of 9.63 defects and 1535 lines modified per month.
  • From February to June 2004 (Eclipse 3.0 release), a sharp rise in defects, with 2207 lines modified per month. A major architectural change also occurred (adoption of the OSGi platform specification, so that new plugins can be installed without restarting Eclipse).
  • From July 2004 to June 2005, quality under control, with an average of 11.92 defects and 758 lines modified per month.
  • From July 2005 to June 2007, quality under control, except for a sudden rise in defects in February 2006. In that month and the preceding one, changes amounted to 2720 and 2021 LOC respectively. These large-scale changes could have caused the rise.

Gnome GnuCash

The number of confirmed defects in GnuCash is plotted from June 2002 to September 2009. It shows that GnuCash has experienced dramatic quality changes.

  • From June 2002 to February 2003, the number of defects tends to increase, exceeding the UCL from October 2002 to February 2003. The logs show many new features and intensive development, with 3330 LOC per month.
  • A stable version of GnuCash was released in February 2003, and the number of defects tends to decrease until December 2005.
  • From January 2006 to March 2007, the number of defects rises again, with most months close to or above the UCL. The logs show that an architectural change took place during this period (the move from GTK1 to GTK2), with many changes (3230 LOC changed per month), 9 unstable releases and 5 bug-fixing releases.
  • From April to July 2007, software quality did not appear to improve, probably because of the port of GnuCash to Windows. 6 unstable releases were published before the delivery in July 2007. The first GnuCash BugDay was held on the 21st of April; the project team evidently took QA actions in response to the quality concerns.
  • From July 2007, the software entered a long maintenance period, with 9 bug-fixing releases (v2.2.1 to v2.2.9) and 870.18 LOC per month. Software quality gradually improved.

Quality Evolution Patterns

From modeling a wide range of c-charts, the authors identified six common quality evolution patterns.

Downward Trend

This pattern represents a decreasing trend of defects in c-charts, which suggests that software quality tends to improve as it evolves.

Upward Trend

This pattern represents an increasing trend of defect numbers in c-charts, which suggests that quality is generally deteriorating as more defects are created with changes to the software. In such cases, the project team should immediately institute strict QA procedures (such as systematic testing and code review) to control software quality. The project team should also consider allocating more QA resources to the component.
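
This summary does not say how a trend was determined. One simple way to approximate it, offered here only as an assumption, is the sign of a least-squares slope fitted to the monthly counts against the month index. The function name `trend_slope` and the sample data are hypothetical.

```python
def trend_slope(counts):
    """Least-squares slope of counts against their index (0, 1, 2, ...).

    A positive value suggests an upward trend, a negative value a
    downward trend. This is an illustrative heuristic, not the
    procedure used in the paper.
    """
    n = len(counts)
    if n < 2:
        raise ValueError("need at least two data points")
    mean_x = (n - 1) / 2
    mean_y = sum(counts) / n
    sxy = sum((i - mean_x) * (c - mean_y) for i, c in enumerate(counts))
    sxx = sum((i - mean_x) ** 2 for i in range(n))
    return sxy / sxx


# Hypothetical monthly defect counts, for illustration only.
rising = [5, 8, 12, 15, 20]
falling = [20, 15, 12, 8, 5]
rising_slope = trend_slope(rising)
falling_slope = trend_slope(falling)
```

A slope alone ignores the control limits, so it would be read alongside the c-chart rather than in place of it.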


Impulse

This pattern represents a short, dramatic increase of defects in c-charts. Each impulse occurs beyond the UCL and contains one to three data points. Each impulse usually indicates a significant update in product features or a sudden change in organizational structures. In the observed cases, however, the project teams managed to accommodate the changes and successfully put software quality back on track.


Hills

This pattern represents a long-lasting high number of defects in c-charts, which occurs beyond the UCL and contains more than three data points. This pattern suggests that the software experienced serious issues for a long time. Although the project team eventually got it back under control, the long period of poor quality could have adversely affected the software's reputation. In such cases, the project team should identify the source of the problems and prevent them from recurring.
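
The two preceding patterns differ only in how many consecutive data points lie beyond the UCL: one to three for the short pattern, more than three for the long-lasting one. A minimal Python sketch of that distinction follows. The function name `classify_runs_above_ucl`, the labels, and the sample data are hypothetical, and the UCL is assumed to be computed beforehand.

```python
def classify_runs_above_ucl(counts, ucl):
    """Find runs of consecutive points above ucl and label each one.

    Returns a list of (start_index, length, label) tuples, where the
    label is "impulse" for runs of 1 to 3 points and "hill" for
    longer runs, following the thresholds given in the text.
    """
    runs = []
    start = None
    for i, c in enumerate(counts):
        if c > ucl:
            if start is None:
                start = i  # a new run begins here
        elif start is not None:
            runs.append((start, i - start))
            start = None
    if start is not None:  # a run that lasts until the final point
        runs.append((start, len(counts) - start))
    return [
        (s, n, "impulse" if n <= 3 else "hill") for s, n in runs
    ]


# Hypothetical monthly defect counts and UCL, for illustration only.
monthly_defects = [10, 30, 11, 9, 28, 29, 31, 27, 12]
patterns = classify_runs_above_ucl(monthly_defects, ucl=22.7)
```

Here the single point at index 1 is labeled an impulse, and the four consecutive points starting at index 4 are labeled a hill.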

Small Variations

This pattern represents small variations of defect numbers in c-charts. In this pattern, the numbers of defects are relatively consistent, which suggests that the software quality is apparently under control.

Roller Coaster

This pattern represents large variations of defect numbers in c-charts (close to or beyond the 6σ range), with many data points close to or outside the control limits. This pattern suggests that the quality is unstable. Better management and planning must be adopted to ensure high and consistent quality.


Conclusion

C-charts and patterns can help QA teams better monitor quality evolution over a long period of time.

Quality evolution patterns are also useful for prioritizing QA efforts in practice and for understanding the overall quality history. For example, a QA team could prioritize efforts for modules exhibiting roller coaster or upward trend patterns.

Control charts and patterns should be carefully interpreted in different contexts: stages of releases, degrees of changes, user activities, types of projects. We can't examine the control charts in isolation.

See also


  1. M. Lehman and L. Belady, Program Evolution: Processes of Software Change, Academic Press, 1985.
  2. E. Grant and R. Leavenworth, Statistical Quality Control, McGraw-Hill, 1998.