Live Data Retrieval Process

From Maisqual Private Wiki

This page describes the process of data retrieval, extraction and treatment for the Live Project Analysis.

It should be used as a template for defining Jenkins projects for analysis.[1]

Architecturally, each main step is implemented by one Perl script; see the Maisqual Perl Scripts article for more information.


Process overview

The process is executed weekly.

It is divided into the following parts:

Data retrieval:

  • Get raw data from data sources: SVN, Bugzilla, etc.
  • Build survey.

Project processing:

  • Project pre-processing: run any data provider: Checkstyle, PMD, etc.
  • Project processing: build project, run tests, run SQuORE analysis.
  • Project post-processing: extract and consolidate data issued from project processing.

Data processing:

  • Data pre-processing: gather, import, and format data.
  • Data processing: run R scripts, plot data.

Data publishing:

  • Data post-processing and publishing.

Check the Maisqual Server article for more information about the different locations used on the file system.
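
The ordering of the parts above can be sketched as a small runner. This is only an illustration in Python: the real steps are Perl scripts, and the stage identifiers, the `run_pipeline` function, and the stop-at-first-failure behaviour are assumptions made for the example, not a description of the actual scripts.

```python
from typing import Callable, Dict, List

# Stage names follow the process overview; the identifiers are hypothetical.
STAGES: List[str] = [
    "data_retrieval",
    "project_preprocessing",
    "project_processing",
    "project_postprocessing",
    "data_preprocessing",
    "data_processing",
    "data_publishing",
]


def run_pipeline(steps: Dict[str, Callable[[], bool]]) -> List[str]:
    """Run each stage in order and stop at the first one that reports failure.

    Returns the names of the stages that completed successfully.
    """
    completed: List[str] = []
    for name in STAGES:
        step = steps.get(name)
        if step is None:
            raise KeyError(f"no step registered for stage {name!r}")
        if not step():
            break  # later stages depend on earlier ones, so do not continue
        completed.append(name)
    return completed
```

In a weekly Jenkins job, each callable would typically wrap the invocation of the corresponding script and return whether it exited cleanly.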


Data Retrieval

Also known as Data Mining on Software Repositories.

We gather the following data:

  • Update SVN sources for trunk or branch.
  • Get Configuration Management metadata.
      # COMMITS,COMMITERS,FILES_MODIFIED
      13,3,48
      # BUGS_OPEN,BUGS_WORKING,BUGS_CLOSED
      13,10,41
  • Get Bugzilla reports: number of CRs opened on specific releases, in all statuses.
      # APP_VERSION,BUGS_OPEN,BUGS_WORKING,BUGS_CLOSED
      Ant_1.1,13,10,41
      Ant_1.2,10,19,37
      Ant_1.3,16,11,25
  • Get mail information.
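
As an illustration of how the per-version bug sample above could be consumed downstream, here is a minimal Python sketch. It assumes that the first line is a header prefixed with `#` and that every column except `APP_VERSION` holds an integer count; the function name `parse_bug_report` is hypothetical, and the actual pipeline uses Perl and R for this work.

```python
import csv
import io
from typing import List


def parse_bug_report(text: str) -> List[dict]:
    """Parse a CSV sample whose first non-empty line is a '#'-prefixed header."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines or not lines[0].startswith("#"):
        raise ValueError("expected a '#'-prefixed header line")
    header = [name.strip() for name in lines[0].lstrip("#").split(",")]
    rows = []
    for record in csv.reader(io.StringIO("\n".join(lines[1:]))):
        row = {}
        for name, value in zip(header, record):
            # The version label stays a string; every count becomes an int.
            row[name] = value if name == "APP_VERSION" else int(value)
        rows.append(row)
    return rows


SAMPLE = """\
# APP_VERSION,BUGS_OPEN,BUGS_WORKING,BUGS_CLOSED
Ant_1.1,13,10,41
Ant_1.2,10,19,37
Ant_1.3,16,11,25
"""

rows = parse_bug_report(SAMPLE)
total_open = sum(row["BUGS_OPEN"] for row in rows)  # 13 + 10 + 16 = 39
```
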

We also collect some manually entered data about the build context; see Project Release Survey.


Project Processing

Project pre-processing

On Java projects:

  • Run Checkstyle,
  • Run PMD/FindBugs.

Project processing

  • Build Project
  • Run tests
  • Run SQuORE Analysis

Project post-processing

  • Retrieve and format performance data:
    • Time and resources for the build,
        # Time (secs),Max_Memory (MB)
        357,652
        330,648
    • Time and resources for the tests.
  • Extract analysis data:
    • Extract metrics from SQuORE,
    • Time and resources for the SQuORE analysis.
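
One possible way to collect the time and memory figures listed above is sketched below in Python. It is an assumption for illustration only: the `measure` and `format_row` helpers are hypothetical, the approach relies on the Unix-only `resource` module, and the source does not say how the Maisqual scripts actually take these measurements.

```python
import resource  # Unix only
import subprocess
import sys
import time
from typing import List, Tuple


def measure(command: List[str]) -> Tuple[float, float]:
    """Run a command and return (elapsed seconds, peak child memory in megabytes).

    The memory figure is the peak resident set size over all child processes
    this interpreter has waited for so far, not only the last command.
    """
    start = time.perf_counter()
    subprocess.run(command, check=True)
    elapsed = time.perf_counter() - start
    peak = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return elapsed, peak / divisor


def format_row(elapsed: float, memory_mb: float) -> str:
    """Format one measurement as a 'seconds,megabytes' CSV row."""
    return f"{round(elapsed)},{round(memory_mb)}"
```

For example, `format_row(*measure(["ant", "build"]))` would yield one row in the same shape as the build sample, where `ant build` is only a placeholder for whatever command builds the project. Running each measured step in a fresh interpreter keeps the memory figure specific to that step.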


Data Processing

Data pre-processing

  • Import data from project post-processing

Data processing

Run R scripts to extract meaningful data.

Data post-processing

We aim to produce the following trend charts:

  • Bug-related data:
    • Overall evolution of bug statuses
    • Per-version evolution of bug statuses
  • Build-related data:
    • Evolution of build time and resources
    • Evolution of the number of warnings
  • Test-related data:
    • Evolution of testing time
    • Evolution of failed tests/coverage (if available)
  • Product-related data:
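
The "evolution" charts listed above plot how a measure changes from one run to the next. A minimal Python sketch of that computation follows, assuming each sample is a (build time in seconds, peak memory) pair and that the two rows of the build sample come from consecutive runs, which the source does not state. The `deltas` helper is hypothetical; in the actual process this work is done by R scripts.

```python
from typing import List, Tuple


def deltas(samples: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Return the change in each measure between consecutive samples."""
    return [
        (current[0] - previous[0], current[1] - previous[1])
        for previous, current in zip(samples, samples[1:])
    ]


# Build sample rows: (time in seconds, peak memory).
builds = [(357, 652), (330, 648)]
changes = deltas(builds)  # [(-27, -4)]: the later build was faster and used less memory
```
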

We aim to produce the following snapshot charts for the current build:

  • Display of metrics


Data publishing

We want this information to be (sorted by priority):


References

  1. Following maisqual:12 Steps to Useful Software Metrics, we automated the whole process of data processing.