Maisqual Metrics

This page lists the metrics retrieved for the different analyses performed on projects.


Availability of metrics

The set of available metrics depends on the type of artefact (e.g. application, file, function), on the type of data set (evolution/weekly, releases, versions), and on the characteristics of the project (e.g. object-oriented or not).


Availability of metrics across the data set types.

                  Source Code              SCM               COM
                  Common   OO   Diff.      Total   Time      Time
Java Evolution      X       X     X          X       X         X
Java Releases       X       X     X          X
Java Versions       X       X                X
C Evolution         X             X          X       X         X
C Releases          X             X          X
C Versions          X                        X


Some metrics are only available in specific contexts, such as the number of classes for object-oriented code. The metrics available for each type of data set are summarised in the table above. Object-oriented metrics are CLAS (number of classes defined in the artefact) and DITM (depth of inheritance tree). Diff metrics are LADD, LMOD and LREM (number of lines added, modified and removed since the last analysis). Time metrics for SCM are SCM_*_1W, SCM_*_1M and SCM_*_3M. Total metrics for SCM are SCM_COMMITS_TOTAL, SCM_COMMITTERS_TOTAL, SCM_COMMITS_FILES_TOTAL and SCM_FIXES_TOTAL. Time metrics for communication are COM_*_1W, COM_*_1M and COM_*_3M.

Metrics defined at a lower level (e.g. function) can be aggregated to upper levels in a smart manner: as an example, the cyclomatic number at the file level is the sum of the cyclomatic numbers of its functions. The meaning of an upper-level metric shall be interpreted with this fact in mind, since aggregation may introduce a bias (also known as the ecological fallacy[1]). When needed, the smart manner used to aggregate information at upper levels is described hereafter.
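
For illustration, a minimal sketch of such an aggregation (hypothetical names, not the SQuORE implementation), summing the function-level cyclomatic numbers to obtain the file-level value:

    import java.util.List;

    class AggregationSketch {
        // File-level VG as the sum of the cyclomatic numbers of the functions
        // defined in the file (the "smart manner" mentioned above for VG).
        static int fileCyclomaticNumber(List<Integer> functionVg) {
            return functionVg.stream().mapToInt(Integer::intValue).sum();
        }
    }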

All data sets are structured in three files, corresponding to the different artefact types that were investigated: application, file and function.

Common source code metrics

These metrics are computed on the source code of projects by the SQuORE analyser. Releases data sets are extracted from the compressed source tarballs downloaded directly from the official location on the project web site. All other types of data sets are extracted from software configuration management repositories (CVS, Subversion or Git). The next table presents the code metrics that are available in all data sets, with the artefact levels at which they are available.


Availability of common source code metrics across artefact types.

Artefact counting metrics          Mnemo   Appli.   File   Func.
Number of files                    FILE      X
Number of functions                FUNC      X        X

Line counting metrics              Mnemo   Appli.   File   Func.
Lines of braces                    BRAC      X        X      X
Blank lines                        BLAN      X        X      X
Effective lines of code            ELOC      X        X      X
Source lines of code               SLOC      X        X      X
Line count                         LC        X        X      X
Mixed lines of code                MLOC      X        X      X
Comment lines of code              CLOC      X        X      X

Miscellaneous                      Mnemo   Appli.   File   Func.
Non-conformities count             NCC       X        X      X
Comment rate                       COMR      X        X      X
Number of statements               STAT      X        X      X

Control flow complexity metrics    Mnemo   Appli.   File   Func.
Maximum nesting level              NEST                      X
Number of paths                    NPAT                      X
Cyclomatic number                  VG        X        X      X
Control flow tokens                CFT       X        X      X

Halstead metrics                   Mnemo   Appli.   File   Func.
Total number of operators          TOPT                      X
Number of distinct operators       DOPT                      X
Total number of operands           TOPD                      X
Number of distinct operands        DOPD                      X


Artefact counting metrics

Artefact counting metrics include the number of files and number of functions.

  • The number of files (FILE) counts the number of source files in the project, i.e. files whose extension corresponds to the analysed language (.java for Java, .c and .h for C).
  • The number of functions (FUNC) sums up the number of methods or functions recursively defined in the artefact.
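
As a rough illustration of the FILE metric for a Java project (a sketch, not the actual analyser), one could count the files whose extension matches the analysed language:

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.util.stream.Stream;

    class FileCountSketch {
        // FILE for a Java project: number of regular files ending in .java.
        static long countJavaFiles(Path projectRoot) throws IOException {
            try (Stream<Path> paths = Files.walk(projectRoot)) {
                return paths.filter(Files::isRegularFile)
                            .filter(p -> p.toString().endsWith(".java"))
                            .count();
            }
        }
    }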


Line counting metrics

Line counting metrics offer a variety of means to grasp the size of code from different perspectives. They include STAT, SLOC, ELOC, CLOC, MLOC, and BRAC.

  • The number of statements (STAT) counts the total number of instructions. Examples of instructions include control-flow tokens, else and case clauses, and assignments.
  • Source lines of code (SLOC) is the number of non-blank, non-comment lines in the code.
  • Effective lines of code (ELOC) further excludes lines that contain only braces.
  • Comment lines of code (CLOC) counts the number of lines that include a comment in the artefact. If a line includes both code and comment, it is counted in the SLOC, CLOC and MLOC metrics.
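
The following hypothetical Java fragment illustrates how individual lines relate to these metrics (exact counting rules may differ slightly between tools):

    /** Returns the larger of its two arguments. */
    public static int max(int a, int b) {

        if (a > b) return a; // keep the larger value
        return b;
    }

For this fragment one would typically obtain LC = 6, BLAN = 1 (the empty line), CLOC = 2 (the Javadoc line and the mixed line), MLOC = 1 (the line holding both code and a comment), SLOC = 4 (all non-blank, non-comment-only lines), ELOC = 3 (the closing brace line is excluded) and BRAC = 1.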


Control flow complexity metrics

  • The maximum nesting level (NEST) counts the deepest level of nested code (including conditions and loops) in a function. Deeper nesting threatens the understandability of code and requires more test cases to cover the different branches. Practitioners usually consider that with three or more levels of nesting it becomes significantly harder for the human mind to apprehend how a function works.
  • The number of execution paths (NPAT) is an estimate of the possible execution paths in a function. Higher values require more test cases to exercise all the ways the function can execute depending on its parameters. An infinite number of execution paths typically indicates that some combination of parameters may cause an infinite loop during execution.
  • The cyclomatic number (VG), a measure borrowed from graph theory and introduced by McCabe[2], is the number of linearly independent paths that comprise the program. For good testability and maintainability, McCabe recommends that no program module (or function or method, in the case of Java) should exceed a cyclomatic number of 10. It is primarily defined at the function level and is summed up for higher levels of artefacts.
  • The number of control-flow tokens (CFT) counts the number of control-flow oriented operators (e.g. if, while, for, switch, throw, return, ternary operators, blocks of execution). else and case are typically considered part of their enclosing if and switch respectively, and are not counted.

The control flow graph of a function visually plots all paths available when executing it. Examples of control flow graphs are provided in figure 6.1; figures 6.1a and 6.1b show two Java examples from Ant (the code of these functions is reproduced in appendix D.1 page 267 for reference) and figure 6.1c shows a C example extracted from GCC. Control flows can however be a lot more complex, as exemplified in figure 6.1d for an industrial application.
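
As a hypothetical illustration of these metrics on a small Java function (counting conventions may vary slightly between tools):

    static String classify(int score) {
        if (score < 0) {                                      // if:      CFT, +1 to VG
            throw new IllegalArgumentException("negative");   // throw:   CFT
        }
        for (int i = 0; i < 3; i++) {                         // for:     CFT, +1 to VG
            score += i;
        }
        return score > 10 ? "high" : "low";                   // return:  CFT; ternary: CFT, +1 to VG
    }

Here VG = 4 (three decision points plus one), NEST = 1 (the bodies of the if and the for), and CFT = 5 (if, throw, for, return, ternary).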


Halstead metrics

Halstead proposed in his Elements of Software Science[3] a set of metrics to estimate important characteristics of a piece of software. He starts by defining four base measures:

  • the number of distinct operands (DOPD, or n_2),
  • the number of distinct operators (DOPT, or n_1),
  • the total number of operands (TOPD, or N_2), and
  • the total number of operators (TOPT, or N_1).

These four base measures are used to compute the following higher-level derived metrics:

  • program vocabulary: n = n_1 + n_2,
  • program length: N = N_1 + N_2,
  • program difficulty: D = \frac{n_1}{2} \times \frac{N_2}{n_2},
  • program volume: V = N \times \log_2(n),
  • estimated effort needed for program comprehension: E = D \times V,
  • estimated number of delivered bugs: B = \frac{E^{2/3}}{3000}.


In the data sets, only the four base measures are retrieved: DOPD, DOPT, TOPD, and TOPT. Derived measures are not provided in the data sets since they can all be computed from the provided base measures.
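
Should the derived measures be needed, they can be recomputed from the base measures along the lines of the following sketch (illustrative code, not part of the data sets):

    class HalsteadSketch {
        static double volume(int dopt, int dopd, int topt, int topd) {
            int n = dopt + dopd;                        // program vocabulary n = n1 + n2
            int N = topt + topd;                        // program length     N = N1 + N2
            return N * (Math.log(n) / Math.log(2));     // V = N * log2(n)
        }

        static double difficulty(int dopt, int dopd, int topd) {
            return (dopt / 2.0) * ((double) topd / dopd);   // D = (n1 / 2) * (N2 / n2)
        }

        static double effort(int dopt, int dopd, int topt, int topd) {
            return difficulty(dopt, dopd, topd) * volume(dopt, dopd, topt, topd);  // E = D * V
        }

        static double deliveredBugs(int dopt, int dopd, int topt, int topd) {
            return Math.pow(effort(dopt, dopd, topt, topd), 2.0 / 3.0) / 3000.0;   // B = E^(2/3) / 3000
        }
    }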


Rules-oriented measures

  • NCC is the number of non-conformities detected on an artefact. From the practices perspective, it sums the number of times each rule has been violated on the artefact (application, file or function).
  • The rate of acquired practices (ROKR) is the number of respected rules (i.e. rules with no violation detected on the artefact) divided by the total number of rules defined for the run. It shows the proportion of acquired practices with regard to the full rule set.
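
A minimal sketch of how NCC and ROKR relate (hypothetical names; the actual computation is done by the analyser): given, for every rule of the run, the number of violations detected on the artefact, NCC is the sum and ROKR the proportion of rules with zero violations.

    import java.util.Map;

    class RuleMetricsSketch {
        static int ncc(Map<String, Integer> violationsByRule) {
            return violationsByRule.values().stream().mapToInt(Integer::intValue).sum();
        }

        static double rokr(Map<String, Integer> violationsByRule) {
            long respected = violationsByRule.values().stream().filter(v -> v == 0).count();
            return (double) respected / violationsByRule.size();   // acquired rules / total rules
        }
    }

For instance, a run defining 150 rules with violations detected for 30 of them yields ROKR = 120 / 150 = 0.8.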

Specific source code metrics

Some source code metrics are only available in specific contexts: differential measures are available for the evolution (weekly) and releases data sets only, and object-oriented measures are, quite naturally, only available for object-oriented code.


Availability of specific source code metrics across artefact types.

Differential metrics                 Mnemo   Appli.   File   Func.
Lines added                          LADD      X        X      X
Lines modified                       LMOD      X        X      X
Lines removed                        LREM      X        X      X

Object-oriented metrics              Mnemo   Appli.   File   Func.
Maximum depth of inheritance tree    DITM      X
Number of classes                    CLAS      X        X      X
Rate of acquired rules               ROKR      X        X      X


Differential measures

Differential measures are only available for evolution and releases data sets. They quantify the number of lines added (LADD), modified (LMOD) or removed (LREM) since the last analysis, be it a week (for evolution data sets) or the variable delay between two releases (which ranges from one week to one year). They are computed using Perl's diff algorithm and follow the same semantics as the usual diff and patch vocabulary. They give an idea of the volume of changes (either bug fixes or new features) that occurred between two analyses. In the case of a large refactoring, or between major releases, a massive number of lines may be modified, whereas frequently released, more agile-like projects (e.g. Jenkins) typically show only small increments.
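
A simplified sketch of how such counts can be derived from two versions of a file (a classic LCS-based line diff; the actual computation relies on Perl's diff algorithm, and LMOD additionally requires pairing changed lines, which is omitted here):

    import java.util.List;

    class DiffSketch {
        // Returns { LADD, LREM } for the transition from oldLines to newLines.
        static int[] addedAndRemoved(List<String> oldLines, List<String> newLines) {
            int n = oldLines.size(), m = newLines.size();
            int[][] lcs = new int[n + 1][m + 1];
            for (int i = n - 1; i >= 0; i--)
                for (int j = m - 1; j >= 0; j--)
                    lcs[i][j] = oldLines.get(i).equals(newLines.get(j))
                            ? lcs[i + 1][j + 1] + 1
                            : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
            int common = lcs[0][0];               // lines kept unchanged between the two versions
            return new int[] { m - common,        // LADD: new lines not present before
                               n - common };      // LREM: old lines no longer present
        }
    }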


Object-oriented measures

Three measures are only available for object-oriented code: the number of classes (CLAS), the maximum depth of inheritance tree (DITM), and the above-mentioned rate of acquired rules (ROKR).

  • The number of classes (CLAS) sums up the number of classes defined in the artefact and its sub-defined artefacts. One file may include several classes, and in Java anonymous classes may be included in functions.
  • The maximum depth of inheritance tree (DITM) of a class within the inheritance hierarchy is defined as the maximum length from the considered class to the root of the class hierarchy tree and is measured by the number of ancestor classes. In cases involving multiple inheritance, the DITM is the maximum length from the node to the root of the tree[4]. It is available solely at the application level.

A deep inheritance tree makes the understanding of the object-oriented architecture difficult. Well-structured OO systems have a forest of classes rather than one large inheritance lattice. The deeper a class is within the hierarchy, the greater the number of methods it is likely to inherit, making it more complex to predict its behaviour and, therefore, more fault-prone[5]. However, the deeper a class sits in the tree, the greater the potential reuse of inherited methods[4].
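
A small hypothetical Java hierarchy illustrates the measure:

    class Vehicle { }                    // depth 0 (root of the hierarchy)
    class Car extends Vehicle { }        // depth 1
    class ElectricCar extends Car { }    // depth 2: ancestors are Car and Vehicle

For an application containing these classes, DITM = 2; whether implicit roots such as java.lang.Object are counted depends on the tool's conventions.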

The rate of acquired practices (ROKR, described in the section above) is another measure specific to object-oriented code, since it is computed relative to the full number of rules, including Java-related checks. For C projects only the SQuORE rules are checked, so the measure loses its meaning and is not generated.


Configuration Management metrics

Configuration management systems hold a wealth of meta-information about the modifications committed to the project repository. The following metrics are defined:


Availability of configuration management metrics across artefact types.

Configuration management metrics   Mnemo                 Appli.   File   Func.
Number of commits                  SCM_COMMITS             X        X
Number of fixes                    SCM_FIXES               X        X
Number of distinct committers      SCM_COMMITTERS          X        X
Number of files committed          SCM_COMMITTED_FILES     X


  • The number of commits (SCM_COMMITS) counts the commits registered by the software configuration management tool for the artefact on the repository (either the trunk or a branch). At the application level, commits can concern any type of artefact (e.g. code, documentation, or web site). Commits can be executed for many different purposes: e.g. adding a feature, fixing a bug, adding comments, refactoring, or even simply re-indenting code.
  • The number of fixes (SCM_FIXES) counts the number of fix-related commits, i.e. commits that include the fix, issue, problem or error keywords in their message. At the application level, all commits with these keywords in their message are considered up to the date of analysis. At the file level, it represents the number of fix-related revisions associated with the file. If a file is created while fixing code (i.e. its first revision is associated with a fix commit), the fix is not counted, since the file cannot be considered responsible for a problem that was detected before it existed.
  • The number of distinct committers (SCM_COMMITTERS) is the total number of different committers registered by the software configuration management tool on the artefact. On the one hand, having fewer committers enforces cohesion, makes it easier to keep coding and naming conventions respected, and allows easy communication and quick connections between developers. On the other hand, having a large number of committers means the project is active; it attracts more talented developers and more eyes to look at problems, and the project has better chances of being maintained over the years.

It should be noted that some practices may threaten the validity of this metric. As an example, occasional contributors may send their patches to official maintainers, who review them before integrating them into the repository. In such cases the commit is executed by the official committer, although the code was originally modified by a developer who remains anonymous (at least to us). Some core maintainers use a convention stating the name or identifier of the contributor, but there is no established or enforced usage of such conventions. Another point is that multiple online personas can cause a single individual to be represented as multiple people[6].

  • The number of files committed (SCM_COMMITTED_FILES) is the number of files associated with the commits registered by the software configuration management tool. This measure makes it possible to identify big commits, which usually imply big changes in the code.

To reflect recent activity on the repository, we retrieved measures both on a limited time basis and on a global basis: in the last week (e.g. SCM_COMMITS_1W), in the last month (e.g. SCM_COMMITS_1M), in the last three months (e.g. SCM_COMMITS_3M), and in total (e.g. SCM_COMMITS_TOTAL).
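
The following sketch (illustrative only, not the Maisqual implementation) shows how such counts can be derived from an already-extracted commit log, using the fix-keyword heuristic and the time windows described above:

    import java.time.Instant;
    import java.time.temporal.ChronoUnit;
    import java.util.List;
    import java.util.Set;

    class ScmMetricsSketch {
        // An already-extracted commit: date, author, message and files touched.
        record Commit(Instant date, String author, String message, Set<String> files) { }

        // Fix-related commit: the message contains one of the keywords listed above.
        static boolean isFix(Commit c) {
            String msg = c.message().toLowerCase();
            return msg.contains("fix") || msg.contains("issue")
                || msg.contains("problem") || msg.contains("error");
        }

        // Commits in the window ending at the analysis date: 7 days for SCM_COMMITS_1W,
        // 30 days for SCM_COMMITS_1M, 90 days for SCM_COMMITS_3M.
        static long commitsInWindow(List<Commit> log, Instant analysis, long days) {
            Instant from = analysis.minus(days, ChronoUnit.DAYS);
            return log.stream()
                      .filter(c -> !c.date().isBefore(from) && !c.date().isAfter(analysis))
                      .count();
        }

        // Global counts, e.g. SCM_FIXES_TOTAL and SCM_COMMITTERS_TOTAL.
        static long fixesTotal(List<Commit> log) {
            return log.stream().filter(ScmMetricsSketch::isFix).count();
        }

        static long committersTotal(List<Commit> log) {
            return log.stream().map(Commit::author).distinct().count();
        }
    }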


Communication metrics

Communication metrics capture a less usual facet of the project: people's activity and interactions during the elaboration of the product. Most software projects have two communication media: one targeted at the internal development of the product, for developers who actively contribute to the project by committing to the source repository, testing the product, or finding bugs (a.k.a. the developer mailing list); and one targeted at end users, for general help and good use of the product (a.k.a. the user mailing list).

The type of media varies across forges and projects: most of the time mailing lists are used, with a web interface such as MHonArc or mod_mbox. In some cases projects may also use forums (especially for user-oriented communication) or NNTP news servers, as the Eclipse foundation projects do. The variety of media and tools makes it difficult to be exhaustive; however, data providers can be written to map these to the common mbox format. We wrote connectors for mboxes, MHonArc, GMane and FUDForum (used by Eclipse).

We retrieve the following metrics on application artefacts:

  • The number of posts (COM_DEV_VOL, COM_USR_VOL) is the total number of mails posted on the mailing list during the considered period of time. All posts are counted, regardless of their depth (i.e. new posts or answers).
  • The number of distinct authors (COM_DEV_AUTH, COM_USR_AUTH) is the number of people having posted at least once on the mailing list during the considered period of time. Authors are counted once even if they posted multiple times, based on their email address.
  • The number of threads (COM_DEV_SUBJ, COM_USR_SUBJ) is the number of different subjects (i.e. a question and its responses) that have been posted on the mailing list during the considered period of time. Subjects that are replies to other subjects are not counted, even if the subject text is different.
  • The number of answers (COM_DEV_RESP_VOL, COM_USR_RESP_VOL) is the total number of replies to requests on the mailing list during the considered period of time. A message is considered an answer if it uses the In-Reply-To header field. The number of answers is often associated with the number of threads to compute a useful response ratio metric.
  • The median time to first reply (COM_DEV_RESP_TIME_MED, COM_USR_RESP_TIME_MED) is the median number of seconds between a question (the first post of a thread) and the first response (the second post of the thread) on the mailing list during the considered period of time.
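
A minimal sketch of these computations over already-parsed messages (hypothetical types; actual parsing of mbox archives is not shown):

    import java.time.Duration;
    import java.time.Instant;
    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    class ComMetricsSketch {
        // A parsed mailing-list message; inReplyTo is null for the first post of a thread.
        record Message(String id, String inReplyTo, String author, Instant date) { }

        static void report(List<Message> posts) {
            long volume  = posts.size();                                               // COM_*_VOL
            long authors = posts.stream().map(Message::author).distinct().count();     // COM_*_AUTH
            long threads = posts.stream().filter(p -> p.inReplyTo() == null).count();  // COM_*_SUBJ
            long answers = volume - threads;                                           // COM_*_RESP_VOL

            // Delay between each thread's first post and its earliest direct reply.
            Map<String, Instant> threadStart = new HashMap<>();
            posts.stream().filter(p -> p.inReplyTo() == null)
                 .forEach(p -> threadStart.put(p.id(), p.date()));
            Map<String, Long> firstReply = new HashMap<>();
            for (Message p : posts) {
                Instant start = p.inReplyTo() == null ? null : threadStart.get(p.inReplyTo());
                if (start == null) continue;                    // reply to a reply: not a first answer
                firstReply.merge(p.inReplyTo(),
                                 Duration.between(start, p.date()).getSeconds(), Math::min);
            }
            List<Long> delays = new ArrayList<>(firstReply.values());
            Collections.sort(delays);
            long median = delays.isEmpty() ? 0 : delays.get(delays.size() / 2);        // COM_*_RESP_TIME_MED (upper median)

            System.out.printf("posts=%d authors=%d threads=%d answers=%d medianReply=%ds%n",
                    volume, authors, threads, answers, median);
        }
    }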

As with configuration management metrics, we worked on temporal measures to produce values for the last week, last month, and last three months. Communication metrics are only available at the application level.


References

  1. Posnett, D., Filkov, V., & Devanbu, P. (2011). Ecological inference in empirical software engineering. In Proceedings of the 2011 26th IEEE/ACM International Conference on Automated Software Engineering (pp. 362–371). IEEE Computer Society. doi:10.1109/ASE.2011.6100074
  2. McCabe, T. (1976). A complexity measure. IEEE Transactions on Software Engineering, SE-2(4), 308–320.
  3. Halstead, M. H. (1977). Elements of Software Science. Elsevier Science Inc.
  4. Chidamber, S. R., & Kemerer, C. F. (1993). A Metrics Suite for Object Oriented Design.
  5. Gustafson, D. A., & Prasad, B. (1991). Properties of software measures. In Formal Aspects of Measurement, Proceedings of the BCS-FACS Workshop on Formal Aspects of Measurement, South Bank University, London, 5 May 1991 (pp. 179–193). Springer.
  6. Hemmati, H., Nadi, S., Baysal, O., Kononenko, O., Wang, W., Holmes, R., & Godfrey, M. W. (2013). The MSR Cookbook. In 10th International Workshop on Mining Software Repositories (pp. 343–352).