Data To Mine

From Maisqual Private Wiki

This page summarises the data that can be mined to gather information on the development process.


Intent

We want to gather information on a software project and monitor its evolution. At the moment, we only target releases: from one release to the next, we identify which practices and quality attributes have changed.

From the data flow point of view, we have the following scheme:

Project 1.0:
  • Practices
    • P1(t1.0)
    • P2(t1.0)
    • P3(t1.0)
    • P4(t1.0)
  • Product Quality Attributes
    • QPd1(t1.0)
    • QPd2(t1.0)
    • QPd3(t1.0)
    • QPd4(t1.0)
  • Process Quality Attributes
    • QPc1(t1.0)
    • QPc2(t1.0)
    • QPc3(t1.0)
    • QPc4(t1.0)
  • Charisma Quality Attributes
    • QC1(t1.0)
    • QC2(t1.0)
    • QC3(t1.0)
    • QC4(t1.0)
Project 1.1:
  • Practices
    • P1(t1.1)
    • P2(t1.1)
    • P3(t1.1)
    • P4(t1.1)
  • Product Quality Attributes
    • QPd1(t1.1)
    • QPd2(t1.1)
    • QPd3(t1.1)
    • QPd4(t1.1)
  • Process Quality Attributes
    • QPc1(t1.1)
    • QPc2(t1.1)
    • QPc3(t1.1)
    • QPc4(t1.1)
  • Charisma Quality Attributes
    • QC1(t1.1)
    • QC2(t1.1)
    • QC3(t1.1)
    • QC4(t1.1)
Project 1.2:
  • Practices
    • P1(t1.2)
    • P2(t1.2)
    • P3(t1.2)
    • P4(t1.2)
  • Product Quality Attributes
    • QPd1(t1.2)
    • QPd2(t1.2)
    • QPd3(t1.2)
    • QPd4(t1.2)
  • Process Quality Attributes
    • QPc1(t1.2)
    • QPc2(t1.2)
    • QPc3(t1.2)
    • QPc4(t1.2)
  • Charisma Quality Attributes
    • QC1(t1.2)
    • QC2(t1.2)
    • QC3(t1.2)
    • QC4(t1.2)
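
The scheme above can be sketched as a small data structure: one snapshot per release, holding the four groups of measures, plus a helper that reports what changed between two releases. This is only an illustration; the class name `ReleaseSnapshot`, the function `diff_releases`, and the use of plain numeric values for every measure are assumptions, not part of the Maisqual model.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ReleaseSnapshot:
    """Measures collected for one release (hypothetical structure)."""
    version: str
    practices: Dict[str, float] = field(default_factory=dict)  # P1..Pn
    product: Dict[str, float] = field(default_factory=dict)    # QPd1..QPdn
    process: Dict[str, float] = field(default_factory=dict)    # QPc1..QPcn
    charisma: Dict[str, float] = field(default_factory=dict)   # QC1..QCn


CATEGORIES = ("practices", "product", "process", "charisma")


def diff_releases(old: ReleaseSnapshot, new: ReleaseSnapshot) -> Dict[str, Dict[str, float]]:
    """Return, per category, the change of each measure present in both releases.

    Measures whose value did not change are omitted, as are empty categories.
    """
    changes: Dict[str, Dict[str, float]] = {}
    for category in CATEGORIES:
        before = getattr(old, category)
        after = getattr(new, category)
        delta = {
            name: after[name] - before[name]
            for name in before
            if name in after and after[name] != before[name]
        }
        if delta:
            changes[category] = delta
    return changes
```

For example, comparing a 1.0 snapshot with a 1.1 snapshot returns only the practices and quality attributes whose values differ, which is the release-to-release view described in the intent.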

From there, we distinguish two types of data mining: identification of practices and evaluation of the overall quality of the project.

Practices identification

We know little about this for now.

Practices are mainly identified through:

  • Mining patterns in the project's history (defects, commits, mailing list records, etc.).
  • Mining patterns in the source code (when tests were set up, refactoring, etc.).
  • Surveys, which often bring information that would be difficult, if not impossible, to mine from the repositories.
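
As an illustration of mining a pattern in the project's history, the sketch below looks for the moment tests were set up: the date of the earliest commit that touches a path containing the word "test". The record shape `(date, paths)` and the function name `first_test_commit` are assumptions made for the example, and the substring match is a deliberately naive heuristic (a file named `latest.c` would also match).

```python
from datetime import date
from typing import Iterable, List, Optional, Tuple

# Hypothetical commit record: (commit date, list of paths touched).
Commit = Tuple[date, List[str]]


def first_test_commit(commits: Iterable[Commit], marker: str = "test") -> Optional[date]:
    """Return the date of the earliest commit touching a path containing `marker`.

    The match is case-insensitive. Returns None if no commit matches.
    """
    dates = [
        day
        for day, paths in commits
        if any(marker in path.lower() for path in paths)
    ]
    return min(dates) if dates else None
```

In practice the commit records would come from the version control system's log; how they are extracted is left out here.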

Quality Evaluation

We want to establish pragmatic measurement of the quality attributes defined in the Maisqual Quality Model.

[Figure: Maisqual Quality Model architecture]

Product Quality

Data mining on the product is described on a separate page: Data_to_Mine: Product.

Process Performance

Data mining on the process is described on a separate page: Data_to_Mine: Process.

Charisma

Charisma attributes may be measured from the following data:


  • Enhancements Integration.
  • Bug Tracking responsiveness.
  • Average response time to questions.


  • Number of mails/posts/etc. exchanged.
  • Number of downloads.
  • Is there a user communication medium?
  • Average time to answer mail on user media.
  • Number of mails exchanged on user media.


  • Publications: magazines, conferences, citations.
  • Votes on Freshmeat, ohloh, etc.
  • Number of results from search engines.
  • Number of links to the project web site.
  • Number of distinct registered users.
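
Several of the measures above are averages over timestamps, for instance the average response time to questions or to mail on user media. A minimal sketch of that computation follows, assuming each thread has been reduced to a pair (time of the question, time of the first reply or None); the record shape and the function name `average_response_time` are hypothetical, and skipping unanswered threads is one possible convention among others.

```python
from datetime import datetime, timedelta
from typing import Iterable, Optional, Tuple

# Hypothetical thread record: (question posted, first reply or None if unanswered).
Thread = Tuple[datetime, Optional[datetime]]


def average_response_time(threads: Iterable[Thread]) -> Optional[timedelta]:
    """Mean delay between a question and its first reply.

    Unanswered threads are skipped. Returns None if no thread was answered.
    """
    delays = [reply - asked for asked, reply in threads if reply is not None]
    if not delays:
        return None
    return sum(delays, timedelta()) / len(delays)
```

Because unanswered threads are ignored, this figure is best read together with the share of questions that received no answer at all.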
