Maisqual Metrics

From Maisqual Wiki

Revision as of 21:16, 27 February 2014 by Bbaldassari (Talk | contribs)

This page lists the metrics retrieved for the different analyses performed on projects.

Availability of metrics

The set of available metrics depends on the type of artefact (e.g. application, file, function) and data set (weekly, releases, version), and on the characteristics of the project (e.g. object-oriented).

Common OO Diff. Total Time Time
Java Evolution X X X X X X
Java Releases X X X X
Java Versions X X X
C Evolution X X X X X
C Releases X X X
C Versions X

Source code metrics

We retrieve the following metrics on function artefacts:

  • CLOC
  • DOPD
  • DOPT
  • GOTO
  • SLOC
  • NEST
  • NOP
  • NPAT
  • TOPD
  • TOPT
  • VG


Configuration Management metrics

Application level

We retrieve the following metrics on application artefacts:

  • SCM_COMMITS: number of commits.
  • SCM_COMMITS_FILES: number of files associated with commits.
  • SCM_COMMITTERS: number of distinct committers.
  • SCM_FIXES: number of fix-related commits, i.e. commits whose message contains any of the keywords fix, issue, problem or error.
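
The keyword rule behind SCM_FIXES can be sketched as follows. This is a minimal illustration, not the project's actual implementation: the names FIX_KEYWORDS, is_fix_commit and count_fixes are hypothetical, and the case-insensitive substring match is an assumption (it also matches unrelated words such as "prefix").

```python
import re

# Keywords listed in the SCM_FIXES definition.
FIX_KEYWORDS = ("fix", "issue", "problem", "error")

# Assumption: case-insensitive substring match, so "Fixed" and "errors" count too.
FIX_PATTERN = re.compile("|".join(FIX_KEYWORDS), re.IGNORECASE)


def is_fix_commit(message: str) -> bool:
    """Return True if the commit message contains any fix-related keyword."""
    return FIX_PATTERN.search(message) is not None


def count_fixes(messages) -> int:
    """Count fix-related commits in an iterable of commit messages."""
    return sum(1 for message in messages if is_fix_commit(message))
```

For example, count_fixes applied to the messages "fix typo", "Update docs", "Resolve issue #12" and "refactor" counts two fix-related commits.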

Each metric is computed over four periods: the whole project history (TOTAL), the last week (1W), the last month (1M), and the last three months (3M).

Variable names are:

  • SCM_COMMITS_1W SCM_COMMITS_1M SCM_COMMITS_3M SCM_COMMITS_TOTAL
  • SCM_COMMITS_FILES_1W SCM_COMMITS_FILES_1M SCM_COMMITS_FILES_3M SCM_COMMITS_FILES_TOTAL
  • SCM_COMMITTERS_1W SCM_COMMITTERS_1M SCM_COMMITTERS_3M SCM_COMMITTERS_TOTAL
  • SCM_FIXES_1W SCM_FIXES_1M SCM_FIXES_3M SCM_FIXES_TOTAL
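
As an illustration of how these variables relate to a commit log, the sketch below computes all of them for a given analysis date. It rests on several assumptions that the page does not state: the Commit record and its field names are hypothetical, the periods are taken as 7, 30 and 90 days, and SCM_COMMITS_FILES counts every (commit, file) association.

```python
from datetime import datetime, timedelta
from typing import NamedTuple


class Commit(NamedTuple):
    # Hypothetical record; field names are illustrative only.
    author: str
    date: datetime
    message: str
    files: tuple


# Assumed period lengths: 1W = 7 days, 1M = 30 days, 3M = 90 days.
WINDOWS = {
    "1W": timedelta(days=7),
    "1M": timedelta(days=30),
    "3M": timedelta(days=90),
    "TOTAL": None,  # whole history up to the analysis date
}
FIX_KEYWORDS = ("fix", "issue", "problem", "error")


def scm_metrics(commits, now):
    """Compute application-level SCM metrics for each period suffix."""
    result = {}
    for suffix, span in WINDOWS.items():
        selected = [
            c for c in commits
            if c.date <= now and (span is None or c.date > now - span)
        ]
        result[f"SCM_COMMITS_{suffix}"] = len(selected)
        # Assumption: a file touched by two commits is counted twice.
        result[f"SCM_COMMITS_FILES_{suffix}"] = sum(len(c.files) for c in selected)
        result[f"SCM_COMMITTERS_{suffix}"] = len({c.author for c in selected})
        result[f"SCM_FIXES_{suffix}"] = sum(
            1 for c in selected
            if any(k in c.message.lower() for k in FIX_KEYWORDS)
        )
    return result
```

Keeping the period suffixes in a single table means that adding another period would only require one more entry.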

File level

We retrieve the following metrics on file artefacts:

  • SCM_COMMITS: number of commits for the artefact.
  • SCM_COMMITTERS: number of distinct committers for the artefact.
  • SCM_FIXES: number of fix-related commits for the artefact, i.e. commits whose message contains any of the keywords fix, issue, problem or error.

Each metric is computed over four periods: the whole history of the artefact (TOTAL), the last week (1W), the last month (1M), and the last three months (3M).

Variable names are:

  • SCM_COMMITS_1W SCM_COMMITS_1M SCM_COMMITS_3M SCM_COMMITS_TOTAL
  • SCM_COMMITTERS_1W SCM_COMMITTERS_1M SCM_COMMITTERS_3M SCM_COMMITTERS_TOTAL
  • SCM_FIXES_1W SCM_FIXES_1M SCM_FIXES_3M SCM_FIXES_TOTAL
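
At file level, the same counts are attached to each file that a commit touches. The sketch below shows one possible way to derive the TOTAL variables per file. The (author, message, files) tuple shape and the function name file_scm_totals are hypothetical, and the keyword test is a simple case-insensitive substring match.

```python
from collections import defaultdict

FIX_KEYWORDS = ("fix", "issue", "problem", "error")


def file_scm_totals(commits):
    """Compute per-file SCM_*_TOTAL metrics.

    `commits` is an iterable of (author, message, files) tuples;
    this shape is an assumption made for the example.
    """
    touched = defaultdict(list)
    for author, message, files in commits:
        # set() guards against a path listed twice in the same commit.
        for path in set(files):
            touched[path].append((author, message))

    metrics = {}
    for path, entries in touched.items():
        metrics[path] = {
            "SCM_COMMITS_TOTAL": len(entries),
            "SCM_COMMITTERS_TOTAL": len({author for author, _ in entries}),
            "SCM_FIXES_TOTAL": sum(
                1 for _, message in entries
                if any(k in message.lower() for k in FIX_KEYWORDS)
            ),
        }
    return metrics
```

The 1W, 1M and 3M variants would follow by filtering the commits on their date before grouping them.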


Communication metrics

Communication metrics show a part of the project that is seldom measured: people’s activity and interactions during the elaboration of the product. Most software projects have two communication media: one targeted at the internal development of the product, for developers who actively contribute to the project by committing to the source repository, testing the product, or finding bugs (the developer mailing list); and one targeted at end-users, for general help and good use of the product (the user mailing list).

The type of media varies across forges and projects: most of the time mailing lists are used, with a web interface like MHonArc or mod_mbox. In some cases, projects also use forums (especially for user-oriented communication) or NNTP news servers, as the Eclipse foundation projects do. The variety of media and tools makes it difficult to be exhaustive; however, data providers can be written to map these to the common mbox format. We wrote connectors for mboxes, MHonArc, GMane and FUDForum (used by Eclipse).
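
Once a source has been mapped to mbox, the headers needed for the metrics can be read with standard tooling. The sketch below uses Python's standard mailbox module and is not one of the connectors mentioned above. The function name read_posts and the dictionary keys are hypothetical, replies are assumed to carry the standard In-Reply-To header, and every message is assumed to have a valid Date header.

```python
import mailbox
from email.utils import parseaddr, parsedate_to_datetime


def read_posts(mbox_path):
    """Yield one dict per message of an mbox file, keeping only the needed headers."""
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for msg in box:
            yield {
                "id": msg["Message-ID"],
                # None for the first post of a thread.
                "parent": msg["In-Reply-To"],
                # Lower-cased address, so an author is identified by email.
                "author": parseaddr(msg["From"] or "")[1].lower(),
                "date": parsedate_to_datetime(msg["Date"]),
                "subject": msg["Subject"],
            }
    finally:
        box.close()
```

Reducing every source to the same small record keeps the metric computation independent of the original medium.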

We retrieve the following metrics on application artefacts:

  • The number of posts (COM_DEV_VOL, COM_USR_VOL) is the total number of mails posted on the mailing list during the considered period of time. All posts are counted, regardless of their depth (i.e. new posts or answers).
  • The number of distinct authors (COM_DEV_AUTH, COM_USR_AUTH) is the number of people having posted at least once on the mailing list during the considered period of time. Authors are counted once even if they posted multiple times, based on their email address.
  • The number of threads (COM_DEV_SUBJ, COM_USR_SUBJ) is the number of different subjects (i.e. a question and its responses) that have been posted on the mailing list during the considered period of time. Subjects that are replies to other subjects are not counted, even if the subject text is different.
  • The number of answers (COM_DEV_RESP_VOL, COM_USR_RESP_VOL) is the total number of replies to requests on the mailing list during the considered period of time. A message is considered an answer if it carries the In-Reply-To header field. The number of answers is often combined with the number of threads to compute the useful response ratio metric.
  • The median time to first reply (COM_DEV_RESP_TIME_MED, COM_USR_RESP_TIME_MED) is the median, over the threads of the considered period of time, of the number of seconds between a question (first post of a thread) and its first response (second post of the thread) on the mailing list.
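
The five definitions above can be sketched over a list of posts as follows. This is an illustration under stated assumptions rather than the actual data provider: the post dictionaries (keys id, parent, author, date) and the function name communication_metrics are hypothetical, and a post is treated as a reply when its parent is set.

```python
from datetime import datetime
from statistics import median


def communication_metrics(posts, prefix="COM_DEV"):
    """Compute the five communication metrics for one mailing list and one period.

    Each post is a dict with hypothetical keys: id, parent (None for a
    thread starter), author (email address) and date (datetime).
    """
    starters = {p["id"]: p for p in posts if p["parent"] is None}
    replies = [p for p in posts if p["parent"] is not None]

    # Earliest direct reply to each thread starter. The second post of a
    # thread can only answer the first one, so direct replies are enough.
    first_reply = {}
    for reply in replies:
        root = reply["parent"]
        if root in starters:
            if root not in first_reply or reply["date"] < first_reply[root]:
                first_reply[root] = reply["date"]

    delays = [
        (first_reply[i] - starters[i]["date"]).total_seconds()
        for i in first_reply
    ]
    return {
        f"{prefix}_VOL": len(posts),
        f"{prefix}_AUTH": len({p["author"].lower() for p in posts}),
        f"{prefix}_SUBJ": len(starters),
        f"{prefix}_RESP_VOL": len(replies),
        # None when no thread received a reply in the period.
        f"{prefix}_RESP_TIME_MED": median(delays) if delays else None,
    }
```

Passing prefix="COM_USR" gives the user mailing list variants, and threads without any reply do not contribute to the median.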

As with configuration management metrics, these measures are computed for the last week, last month, and last three months. Communication metrics are only available at the application level.

Variable names are:

  • COM_DEV_AUTH_1W COM_DEV_AUTH_1M COM_DEV_AUTH_3M
  • COM_DEV_RESP_TIME_MED_1W COM_DEV_RESP_TIME_MED_1M COM_DEV_RESP_TIME_MED_3M
  • COM_DEV_RESP_VOL_1W COM_DEV_RESP_VOL_1M COM_DEV_RESP_VOL_3M
  • COM_DEV_SUBJ_1W COM_DEV_SUBJ_1M COM_DEV_SUBJ_3M
  • COM_DEV_VOL_1W COM_DEV_VOL_1M COM_DEV_VOL_3M
  • COM_USR_AUTH_1W COM_USR_AUTH_1M COM_USR_AUTH_3M
  • COM_USR_RESP_TIME_MED_1W COM_USR_RESP_TIME_MED_1M COM_USR_RESP_TIME_MED_3M
  • COM_USR_RESP_VOL_1W COM_USR_RESP_VOL_1M COM_USR_RESP_VOL_3M
  • COM_USR_SUBJ_1W COM_USR_SUBJ_1M COM_USR_SUBJ_3M
  • COM_USR_VOL_1W COM_USR_VOL_1M COM_USR_VOL_3M