Mining SVN

From Maisqual Private Wiki

This article describes the data we extract from SVN and how we extract it.


SVN Log

SVN activity can be retrieved through the following command:

svn log -v --xml /path/to/repository

We typically run it on the trunk or on a branch and redirect the output to a file:

svn log -v --xml http://svn.apache.org/repos/asf/ant/core/trunk > svn_log_trunk.xml
svn log -v --xml http://svn.apache.org/repos/asf/ant/core/branches/ANT_16_BRANCH > svn_log_1.6.xml
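
To give an idea of what these XML files contain, here is a minimal Python sketch that reads the log entries. The sample data, the SAMPLE_LOG name and the read_log_entries helper are hypothetical and only illustrate the layout of svn log -v --xml output; this is not the code of our Perl scripts.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample mimicking the layout of `svn log -v --xml` output.
SAMPLE_LOG = """<log>
<logentry revision="2">
<author>alice</author>
<date>2000-01-14T09:30:00.000000Z</date>
<paths>
<path action="M">/trunk/build.xml</path>
<path action="A">/trunk/src/Main.java</path>
</paths>
<msg>Second commit</msg>
</logentry>
<logentry revision="1">
<author>bob</author>
<date>2000-01-13T18:05:12.000000Z</date>
<paths>
<path action="A">/trunk/build.xml</path>
</paths>
<msg>Initial import</msg>
</logentry>
</log>"""


def read_log_entries(xml_text):
    """Return one dict per <logentry> element found in the XML log."""
    entries = []
    for entry in ET.fromstring(xml_text).iter("logentry"):
        entries.append({
            "revision": int(entry.get("revision")),
            # Some commits carry no author; fall back to an empty string.
            "author": entry.findtext("author", default=""),
            "date": entry.findtext("date", default=""),
            # With -v, each changed path is listed under <paths>.
            "paths": [p.text for p in entry.findall("paths/path")],
        })
    return entries


for e in read_log_entries(SAMPLE_LOG):
    print(e["revision"], e["author"], e["date"][:10], len(e["paths"]))
```

For a real log, the file written by the commands above (for example svn_log_trunk.xml) could be read with open(...).read() and passed to the same function.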


Commits

The parse_svn_commits_per_*.pl scripts then turn the XML log into a more practical CSV file.

The columns are the same for every script; only the time granularity of the rows depends on the script used. We can generate:

  • commits per day with parse_svn_commits_per_day.pl
#date,commits,files
2000-01-13,6,91
2000-01-14,2,12
2000-01-21,1,2
2000-01-23,3,11
  • commits per month with parse_svn_commits_per_month.pl (each row is dated the first day of the month)
#date,commits,files
2000-01-01,33,173
2000-02-01,45,161
2000-03-01,19,55

These files can then be loaded into R for further manipulation and plotting.
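
The aggregation behind these CSV files can be sketched in Python as follows. This is an illustration under assumptions, not the actual Perl scripts: the commits_csv function and the sample log are hypothetical, and we assume here that the files column is the total number of changed paths over all commits of the period.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample mimicking the layout of `svn log -v --xml` output.
SAMPLE_LOG = """<log>
<logentry revision="3">
<author>alice</author>
<date>2000-02-02T11:00:00.000000Z</date>
<paths><path action="M">/trunk/a.txt</path></paths>
</logentry>
<logentry revision="2">
<author>bob</author>
<date>2000-01-13T15:00:00.000000Z</date>
<paths><path action="M">/trunk/a.txt</path><path action="A">/trunk/b.txt</path></paths>
</logentry>
<logentry revision="1">
<author>alice</author>
<date>2000-01-13T09:00:00.000000Z</date>
<paths><path action="A">/trunk/a.txt</path></paths>
</logentry>
</log>"""


def commits_csv(xml_text, period="month"):
    """Build '#date,commits,files' rows grouped by day or by month."""
    totals = defaultdict(lambda: [0, 0])  # key -> [commits, files]
    for entry in ET.fromstring(xml_text).iter("logentry"):
        # SVN dates are UTC timestamps; the first 10 characters are YYYY-MM-DD.
        day = entry.findtext("date", default="")[:10]
        if not day:
            continue
        # Monthly rows are keyed by the first day of the month.
        key = day if period == "day" else day[:7] + "-01"
        totals[key][0] += 1
        # Assumption: "files" counts every changed path of every commit.
        totals[key][1] += len(entry.findall("paths/path"))
    rows = ["#date,commits,files"]
    rows += [f"{key},{totals[key][0]},{totals[key][1]}" for key in sorted(totals)]
    return "\n".join(rows)


print(commits_csv(SAMPLE_LOG, "day"))
print(commits_csv(SAMPLE_LOG, "month"))
```

If the real scripts count distinct files rather than changed paths, the inner counter would have to be replaced by a set of paths per period.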


Developers

The parse_svn_users_per_month.pl script creates a CSV file with the number of distinct developers per month, based on the svn log information.

The CSV file looks like this:

#date,developers
2000-01-01,5
2000-02-01,4
2000-03-01,8
2000-04-01,4
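
Counting distinct developers per month can be sketched in Python as below. The developers_csv function and the sample log are hypothetical, and we assume that a developer is identified by the author field of each log entry; the Perl script may apply different rules.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample mimicking the layout of `svn log --xml` output.
SAMPLE_LOG = """<log>
<logentry revision="4">
<author>alice</author>
<date>2000-02-10T08:00:00.000000Z</date>
</logentry>
<logentry revision="3">
<author>alice</author>
<date>2000-02-02T11:00:00.000000Z</date>
</logentry>
<logentry revision="2">
<author>bob</author>
<date>2000-01-13T15:00:00.000000Z</date>
</logentry>
<logentry revision="1">
<author>alice</author>
<date>2000-01-13T09:00:00.000000Z</date>
</logentry>
</log>"""


def developers_csv(xml_text):
    """Build '#date,developers' rows with distinct authors per month."""
    authors = defaultdict(set)  # month key -> set of author names
    for entry in ET.fromstring(xml_text).iter("logentry"):
        month = entry.findtext("date", default="")[:7]  # YYYY-MM
        author = entry.findtext("author")
        # Skip entries without a usable date or author.
        if len(month) == 7 and author:
            authors[month + "-01"].add(author)
    rows = ["#date,developers"]
    rows += [f"{key},{len(authors[key])}" for key in sorted(authors)]
    return "\n".join(rows)


print(developers_csv(SAMPLE_LOG))
```

Using a set per month means that a developer who commits several times in the same month is counted once.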