Talk:Data To Mine
From Maisqual Private Wiki
We have established the following list of types of data that can be used for data mining in Software Engineering.
Techniques for mining software repositories are used for two purposes:
- Assessing product and process quality through metrics. This is mainly achieved on:
  - The source code, for product quality.
  - The tools, for process quality.
- Identifying practices. This is mainly achieved through:
  - Mining patterns in the project's history (defects, commits, mailing list records, etc.).
  - Mining patterns in the source code (when tests were set up, refactoring, etc.).
  - Surveys, which often bring information that would be difficult, if not impossible, to mine from the repositories.
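As an illustration of mining patterns in a project's history, here is a minimal sketch that counts commits per author and per month. It assumes the commit log has been exported as pipe-separated lines, for instance with `git log --pretty=format:"%h|%an|%ad" --date=short`. The sample data and the function names are hypothetical and only meant to show the idea; author names containing a `|` would need a different separator.

```python
from collections import Counter

# Hypothetical sample, in the shape produced by:
#   git log --pretty=format:"%h|%an|%ad" --date=short
SAMPLE_LOG = """\
a1b2c3d|alice|2012-03-01
b2c3d4e|bob|2012-03-01
c3d4e5f|alice|2012-03-02
d4e5f6a|alice|2012-04-10
"""


def parse_log(log_text):
    """Yield (commit_hash, author, date) tuples from pipe-separated log lines."""
    for line in log_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        commit_hash, author, date = line.split("|", 2)
        yield commit_hash, author, date


def commits_per_author(log_text):
    """Count how many commits each author made."""
    return Counter(author for _, author, _ in parse_log(log_text))


def commits_per_month(log_text):
    """Count commits per calendar month (YYYY-MM prefix of the date)."""
    return Counter(date[:7] for _, _, date in parse_log(log_text))


if __name__ == "__main__":
    print(commits_per_author(SAMPLE_LOG))
    print(commits_per_month(SAMPLE_LOG))
```

The same parsing step could feed other history-based indicators (for example activity per file or per release) once the corresponding fields are added to the log format.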
 Product information
Product information includes:
- Source code
- Dynamic execution traces
All these have been gathered in Data to Mine: Product.
 Process information
Process information includes:
- Configuration Management
- Change Management
- Release Management
- Mailing Lists, Forums & Communication
All these have been gathered in Data to Mine: Process.
See also the Project Release Survey for a checklist template to apply at every project release.
 User satisfaction
User satisfaction, which is one of the main quality criteria, is deliberately kept apart from the product and process information, since it may derive from both.
Community web sites holding surveys:
- popularity contests
- number of (pertinent) results in search engines