From Maisqual Private Wiki

This is the monthly checkpoint summarising the work that has been accomplished recently. Please feel free to comment.


Timesheet

04/05/2011    Ant plugin update
05/05/2011    CruiseControl Setup
06/05/2011    CruiseControl Setup + Meeting with Christophe (thesis)
11/05/2011    Update Wiki + Organisation
12/05/2011    Update Wiki + Organisation + SQuORE CC XSL
13/05/2011    Update Wiki + Organisation
18/05/2011    Update Wiki: added metrics and papers categories, worked on presentation material
19/05/2011    Readings + svn_tools
20/05/2011    Readings + svn_tools
24/05/2011    School enrolment + Readings
25/05/2011    Discussed tools for Maisqual + School enrolment
26/05/2011    Installed SQuORE for metrics + Getting measures + Update Wiki
27/05/2011    Requirements for SVN Data Providers + Update Wiki
31/05/2011    Requirements for API + Doc Introduction to Maisqual

Organisation

Communication is one of the key expectations of SQuORING. To that end, a website, maisqual.squoring.com, has been set up, with a public wiki available at maisqual.squoring.com/wiki.

The public wiki has the following categories defined:

A private wiki has also been set up, for all non-public matters: organisation, on-going research work, milestones. It is available, with authentication, at maisqual.squoring.com/privatewiki.

Readings

We have found many papers about "Data Mining Software Engineering Data".

The following articles have been read and summarised:

Summaries of other papers will follow soon.

Tools and Projects

We have started to set up a process for gathering metrics from two kinds of sources: SVN repositories and source code (through SQuORE). The projects selected so far are:

  • SQuORE trunk HEAD
  • The Linux Kernel
  • Apache Ant
  • CruiseControl

Research, data mining, first steps

The readings and discussions we had during this month have led us to the following:

  • We now have a better idea of which data to investigate for data mining, and which types of algorithms can be used on it. This has been summarised in the Data_To_Mine page.
  • The metrics we will start with are the most basic and best known: SLOC for size, McCabe/Halstead for complexity. Many studies show that these metrics are often the best (and simplest) bet.
  • Some tests have been made on retrieving information and metrics from remote repositories (SVN). They have been recorded on the private wiki page CustomTools:SVN_Analysis.
  • Some developments are required from SQuORING, namely:
    • A new Data Provider working on remote repositories instead of local directories. For now, the very same metrics that are already computed by SQuORE will be enough. Specifications have been written for that Data Provider.
    • A new API to export data. The data is currently stored in a database, but is not easy to extract in a parseable format. Specifications have been written for that API.
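
To illustrate the kind of information that can be retrieved from an SVN repository, here is a minimal sketch in Python. It is not the tooling recorded on CustomTools:SVN_Analysis: it simply assumes that the history has been exported with `svn log --xml -v` and counts commits per author and changes per path. The sample log, the author names and the function name `summarise_svn_log` are made up for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample in the format produced by `svn log --xml -v`.
# Real data would come from running that command on a repository URL.
SAMPLE_LOG = """<log>
<logentry revision="3">
<author>alice</author>
<date>2011-05-20T10:15:00.000000Z</date>
<paths>
<path action="M" kind="file">/trunk/src/Main.java</path>
<path action="A" kind="file">/trunk/src/Util.java</path>
</paths>
<msg>Extract helper class</msg>
</logentry>
<logentry revision="2">
<author>bob</author>
<date>2011-05-19T09:00:00.000000Z</date>
<paths>
<path action="M" kind="file">/trunk/src/Main.java</path>
</paths>
<msg>Fix build</msg>
</logentry>
<logentry revision="1">
<author>alice</author>
<date>2011-05-18T14:30:00.000000Z</date>
<paths>
<path action="A" kind="file">/trunk/src/Main.java</path>
</paths>
<msg>Initial import</msg>
</logentry>
</log>"""


def summarise_svn_log(xml_text):
    """Return (commits per author, changes per path) for an SVN XML log."""
    root = ET.fromstring(xml_text)
    commits_per_author = Counter()
    changes_per_path = Counter()
    for entry in root.iter("logentry"):
        # Some revisions have no author element, hence the default value.
        author = entry.findtext("author", default="(unknown)")
        commits_per_author[author] += 1
        # Each <path> element is one file or directory touched by the commit.
        for path in entry.iter("path"):
            changes_per_path[path.text] += 1
    return commits_per_author, changes_per_path


if __name__ == "__main__":
    authors, paths = summarise_svn_log(SAMPLE_LOG)
    for author, count in authors.most_common():
        print(author, count)
    for path, count in paths.most_common():
        print(path, count)
```

Counts like these (commits per author, changes per file) are only one possible starting point; the same loop could collect dates or commit messages if those turn out to be more useful for mining.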

Next Steps

Better defining the Thesis goals

The main goal of this project is to improve the quality of processes and products.

  • For processes, this is achieved through:
    • Best practices investigation, e.g. "Peer review is good for reliability, and we can prove it".
    • Good advice, e.g. "The shortest way from this state of quality to the next state I want is to refactor this module" or "Set up a continuous integration framework for better build stability".
    • Help for estimating work, e.g. "this bug may take approx. 2 man-days to resolve" or "the next iteration backlog should be completed in 2 weeks".
    • Help for management decisions, e.g. "who should this bug be assigned to?" or "the target release won't be reached because of the number of bugs".
  • For products, this is achieved through:
    • Good advice, e.g. add comments to this module for readability,
    • Proposing patterns for specific purposes: e.g. refactor this module with the Singleton design pattern,
    • Bug-finding techniques: this mainly concerns the reliability characteristic of quality, but data mining can help a lot here.

For both of these, the measured result should be the quality of the products: a good process leads to good products.

Metrics are probably the way to go for quality assessment, which means we have to know which metrics are relevant for which quality characteristic (and be able to prove it).

What's next

Here are the items to be addressed in the near future:

  • Attend Philippe Preux's data mining course.
  • Read more articles and papers, and write summaries.
  • Collect data, and try to visualise it in R.