BugMiner: Mining Source Code Repositories for Bug Information

The revision history of a software system conveys important information about how and why the system evolved over time. It can also tell us which parts of the system are coupled by common changes.

Inspired by a series of research papers from Saarland University in Saarbrücken, Germany, I started working last year on a software tool that analyzes source code repositories.

Features

This tool, named BugMiner for now, can currently answer the following questions:

  • Which source code files, classes, methods, or functions are commonly changed at the same time?

Imagine two source code files, A and B, that are almost always changed at the same time. If a developer now wants to check in changes to only one of these files, a warning (from either the IDE or the SCM) might be useful.
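
To make the idea concrete, here is a minimal sketch in Python (not BugMiner's actual PHP code). It assumes the revision history has already been reduced to a list of commits, each a list of changed file names. The function names, the support and confidence thresholds, and the sample history are all hypothetical choices for illustration.

```python
from collections import Counter
from itertools import combinations


def co_change_counts(commits):
    """Count how often each file changes, and how often each pair changes together."""
    file_counts = Counter()
    pair_counts = Counter()
    for files in commits:
        unique = sorted(set(files))          # ignore duplicates within one commit
        file_counts.update(unique)
        pair_counts.update(combinations(unique, 2))
    return file_counts, pair_counts


def missing_companions(changed, file_counts, pair_counts,
                       min_confidence=0.8, min_support=3):
    """Return (changed_file, missing_file, confidence) for likely forgotten files.

    Confidence is the share of commits touching changed_file that also
    touched missing_file. Both thresholds are assumed values.
    """
    changed = set(changed)
    warnings = []
    for (a, b), together in pair_counts.items():
        if together < min_support:
            continue
        for src, dst in ((a, b), (b, a)):
            if src in changed and dst not in changed:
                confidence = together / file_counts[src]
                if confidence >= min_confidence:
                    warnings.append((src, dst, confidence))
    return sorted(warnings)


# Hypothetical history: A.php and B.php always change together.
history = [
    ["A.php", "B.php"],
    ["A.php", "B.php", "C.php"],
    ["A.php", "B.php"],
    ["C.php"],
    ["A.php", "B.php"],
]
file_counts, pair_counts = co_change_counts(history)
warnings = missing_companions(["A.php"], file_counts, pair_counts)
```

With this sample history, checking in A.php alone would produce a warning about B.php, while checking in both files together would produce none.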

  • Which source code files, classes, methods, or functions are the most bug-prone ones?

Possibly correlated with other software metrics such as cyclomatic complexity, this information helps to identify areas of the code base that require more tests.
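
One simple way to approximate bug-proneness can be sketched in Python as follows. The sketch assumes that bug-fixing commits can be recognized by keywords in their commit messages. That heuristic, the keyword pattern, and the sample history are assumptions made for illustration and are not necessarily how BugMiner classifies commits.

```python
import re
from collections import Counter

# Assumed heuristic: a commit is a bug fix if its message mentions one of these words.
FIX_PATTERN = re.compile(r"\b(fix(es|ed)?|bug|defect)\b", re.IGNORECASE)


def bug_fix_counts(commits):
    """Count, per file, the number of bug-fixing commits that touched it."""
    counts = Counter()
    for message, files in commits:
        if FIX_PATTERN.search(message):
            counts.update(set(files))
    return counts


def rank_by_bug_fixes(commits):
    """Return (file, fix_count) pairs, most bug-prone first, ties broken by name."""
    counts = bug_fix_counts(commits)
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical history of (commit message, changed files).
history = [
    ("Fix off-by-one in parser", ["Parser.php"]),
    ("Add Subversion backend", ["Svn.php", "Backend.php"]),
    ("Fixed bug #12 in parser and backend", ["Parser.php", "Backend.php"]),
    ("Refactor parser", ["Parser.php"]),
]
ranking = rank_by_bug_fixes(history)
```

The resulting ranking could then be set side by side with a metric such as cyclomatic complexity to decide where additional tests are most needed.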

Implementation

The tool is developed in PHP. It features a modular architecture that allows for different SCM backends (currently only Subversion is supported) and parser backends (currently only PHP is supported).
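
The separation into SCM backends and parser backends might look roughly like the following Python sketch. It illustrates the general pattern only, not BugMiner's actual PHP interfaces. Every class and method name here is hypothetical, the in-memory backend stands in for a real Subversion backend, and the regex-based "parser" is deliberately naive.

```python
import re
from abc import ABC, abstractmethod


class ScmBackend(ABC):
    """Hypothetical interface: supplies revisions from a repository."""

    @abstractmethod
    def revisions(self):
        """Yield (revision_id, message, {path: source_text}) tuples."""


class ParserBackend(ABC):
    """Hypothetical interface: extracts code units from source text."""

    @abstractmethod
    def units(self, source_text):
        """Return the names of the functions defined in source_text."""


class InMemoryScm(ScmBackend):
    """Stand-in backend that serves revisions from a list."""

    def __init__(self, revisions):
        self._revisions = list(revisions)

    def revisions(self):
        yield from self._revisions


class NaivePhpParser(ParserBackend):
    """Finds 'function name' declarations with a regex; not a real PHP parser."""

    PATTERN = re.compile(r"\bfunction\s+([A-Za-z_]\w*)")

    def units(self, source_text):
        return self.PATTERN.findall(source_text)


def units_per_revision(scm, parser):
    """Map each revision id to the sorted function names found in its files."""
    result = {}
    for revision_id, _message, files in scm.revisions():
        names = set()
        for text in files.values():
            names.update(parser.units(text))
        result[revision_id] = sorted(names)
    return result


scm = InMemoryScm([
    (1, "Initial import", {"a.php": "<?php function foo() {} function bar() {}"}),
])
units = units_per_revision(scm, NaivePhpParser())
```

Because the analysis code depends only on the two interfaces, supporting another SCM or another programming language would mean adding one new backend class rather than changing the analysis itself.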

Subversion Repository

svn co svn://svn.phpunit.de/phpunit/bugminer/trunk bugminer

Browse this repository in Trac or FishEye.