MSR 2009: 6th IEEE Working Conference on Mining Software Repositories

Call for Papers (PDF) (Updated 20081107)

May 16-17, 2009

Co-located with ICSE 2009,
IEEE International Conference on Software Engineering

General Chair

Katsuro Inoue
Osaka University, Japan

Program Co-chairs

Michael W. Godfrey
University of Waterloo, Canada

Jim Whitehead
University of California, Santa Cruz, USA

Challenge Chair

Christian Bird
UC Davis, USA

Program Committee

Giuliano Antoniol
(École Polytechnique de Montréal, Canada)
Andrew Begel
(Microsoft Corp., USA)
Geoff Clemm
(IBM / Rational, USA)
Stephan Diehl
(U. of Trier, Germany)
Massimiliano di Penta
(U. of Sannio, Italy)
Harald Gall
(U. of Zurich, Switzerland)
Daniel Germán
(U. of Victoria, Canada)
Tudor Girba
(U. of Bern, Switzerland)
Jesus Gonzalez-Barahona
(U. Rey Juan Carlos, Spain)
Ahmed Hassan
(Queen's U., Canada)
Ric Holt
(U. of Waterloo, Canada)
Huzefa Kagdi
(Missouri U. of Science and Technology, USA)
Tohru Kikuno
(Osaka U., Japan)
Miryung Kim
(U. of Texas (Austin), USA)
Sung Kim
(Hong Kong University of Science and Technology, China)
Michele Lanza
(U. of Lugano, Switzerland)
Ben Liblit
(U. of Wisconsin (Madison), USA)
Jonathan Maletic
(Kent State U., USA)
Andrian Marcus
(Wayne State U., USA)
Audris Mockus
(Avaya Labs, USA)
Nachiappan Nagappan
(Microsoft Corp., USA)
Dewayne Perry
(U. of Texas (Austin), USA)
Martin Pinzger
(U. of Zurich, Switzerland)
Lori Pollock
(U. of Delaware, USA)
Martin Robillard
(McGill U., Canada)
Gregorio Robles
(U. Rey Juan Carlos, Spain)
Jelber Sayyad Shirabad
(Ottawa U., Canada)
Walt Scacchi
(U. of California (Irvine), USA)
Tao Xie
(North Carolina State U., USA)
Andy Zaidman
(Delft Technical U., Netherlands)
Thomas Zimmermann
(U. of Calgary, Canada)

Poster Chair

Miryung Kim
University of Texas (Austin), USA

Local Arrangement Chair

Dirk Beyer
Simon Fraser University, Canada

Web Chair

Abram Hindle
University of Waterloo, Canada


Co-located with ICSE 2009,
Vancouver, Canada

Steering Committee

Ahmed E. Hassan
Queen's University, Canada
Audris Mockus
Avaya, USA
Ric Holt
University of Waterloo, Canada
Katsuro Inoue
Osaka University, Japan
Stephan Diehl
University of Trier, Germany
Harald Gall
University of Zurich, Switzerland
Michele Lanza
University of Lugano, Switzerland

MSR 2010 is up and running! Check it out! Submit papers!

Check out these great MSR Caricatures!

Conference banquet dinner details are right here!

Be sure to check out the IBM Jazz Research Reception

The MSR Programme has been updated!

Keynote speaker biographies and abstracts updated!

Second Keynote talk and speaker announced!

Challenge Report Deadline extended to Monday March 16th!

Keynote talk and speaker announced!


Software repositories such as source control systems, archived communications between project personnel, and defect tracking systems are used to help manage the progress of software projects. Software practitioners and researchers are recognizing the benefits of mining this information to support the maintenance of software systems, improve software design/reuse, and empirically validate novel ideas and techniques. Research is now proceeding to uncover the ways in which mining these repositories can help to understand software development and software evolution, to support predictions about software development, and to exploit this knowledge concretely in planning future development.

The goal of this two-day working conference is to advance the science and practice of software engineering via the analysis of data stored in software repositories.

We solicit poster papers (4 pages) and research papers (10 pages). Poster papers should discuss controversial issues in the field, or describe interesting or thought-provoking ideas that are not yet fully developed. Authors of accepted poster papers will present their ideas in poster form during a poster session at the conference, and in a short lightning talk. Full research papers are expected to describe new research results and to have a higher degree of technical rigor than poster papers. Authors of accepted full papers will present their ideas in a research talk at the conference. Paper submissions must be formatted according to ICSE guidelines. A selection of the best research papers will be invited for consideration in a special issue of the Springer journal Empirical Software Engineering.


Papers may address any of the conference's general themes, including but not limited to the following:

  • Models for social and development processes that occur in large software projects
  • Prediction of future software qualities via analysis of software repositories
  • Models of software project evolution based on historical repository data
  • Prediction, characterization, and classification of software defects based on analysis of software repositories
  • Techniques to model reliability and defect occurrences
  • Search-based software engineering, including search techniques to assist developers in finding suitable components and code fragments for reuse, and software search engines
  • Analysis of change patterns to assist in future development
  • Visualization techniques and models of mined data
  • Techniques, tools, and interchange formats for capturing new forms of data for storage in software repositories, such as effort data, fine-grain changes, and refactoring
  • Approaches, applications, and tools for software repository mining
  • Quality aspects and guidelines to ensure quality results in mining
  • Meta-models, exchange formats, and infrastructure tools to facilitate the sharing of extracted data and to encourage reuse and repeatability
  • Case studies on extracting data from repositories of large long-lived projects
  • Methods of integrating mined data from various historical sources

MSR Challenge

We invite researchers to demonstrate the usefulness of their mining tools on the source code repositories, bug data, and mailing list archives of the GNOME desktop suite by participating in the two MSR Challenge tracks:
  1. General. Discover interesting facts about the history of the GNOME desktop suite. Results should be reported as 4-page submissions, to be included in the proceedings as challenge papers.
  2. Prediction. We challenge you to predict the code growth in core GNOME projects, measured in source lines of code, from February 1st to April 30th, 2009. You may provide a one-page description of the rationale behind your prediction. Wild guesses are also welcome and will put "real" miners under pressure.
The winners of both tracks will receive an award. Click here for a more detailed description of the challenge.

Important Dates

Abstracts (research/poster papers): 3 March 2009
Submission (research/poster papers): 7 March 2009
Challenge #1, submission of reports: 16 March 2009
Challenge #2, submission of predictions: 7 February 2009
Notifications sent to authors: 3 April 2009
Camera-ready copy (papers / challenge #1 reports): 22 April 2009
Conference date: 16-17 May 2009

Click here to submit MSR 2009 abstracts, papers, and posters.
Click here to submit MSR 2009 challenge submissions.
Papers must be in the IEEE CS proceedings style - Two Column Format.

The proceedings for MSR-09 will be published electronically in the IEEE Digital Library. Additionally, attendees of MSR-09 will be able to download an electronic version during the conference.

MSR 2009 Keynote Presentations

We are proud to announce that we have excellent keynote presentations lined up for MSR 2009.

MSR 2009 First Keynote Presentation

Success Factors of Business Intelligence
Michael McAllister
Director of academic research centres (ARC) at SAP Business Objects


Business Intelligence (BI) has proven to be a competitive advantage for organizations, allowing them to better measure, manage, and optimize their operations. It has provided the means to improve data-driven decision making and to harmonize an organization's strategy with its everyday operations. The early success of BI arose from providing semantic-level access to heterogeneous data sources beyond organizations' information technology departments. Retrospective and predictive analytical components have since increased the value of BI to organizations. In this talk, we will discuss success factors and influences for BI that have arisen by making information available across an organization and will open a discussion on some of the near-term and long-term BI challenges.


Mike McAllister is director of academic research centres (ARC) at SAP Business Objects, where he is responsible for creating, leading, managing and contributing to research partnerships and projects with academia across North America on topics related to BI. He completed his PhD in Computer Science at UBC in 1999, then joined Dalhousie University where he is an Associate Professor. He was also Associate Dean for Computer Science in 2007-2008 before taking a leave to join SAP to accelerate the company's investment in research with academia.

MSR 2009 Second Keynote Presentation

This keynote will be shared with ICPC 2009 and will take place on the Sunday morning of MSR.

A Brief History of Software - from Bell Labs to Microsoft Research
Thomas Ball
Principal Researcher, Microsoft Research


In the mid 1990s, I was (tangentially) part of an effort in Bell Labs called the "Code Decay" project. The hypothesis of this project was that over time code becomes fragile (more difficult to change without introducing problems), and that this process of decay could be empirically validated. This effort awakened me to the power of combining statistical expertise with software engineering expertise to address pressing problems of software production in a statistically valid manner. I will revisit some of the work we did in the Code Decay project at Bell Labs and then turn to what has been happening in this area in Microsoft in the last five years. In particular, I will trace how we have progressed from studying the data produced by product teams to validate hypotheses, to being actively involved with the product groups in creating and evaluating new tools and techniques for empirically-based software production.


Thomas Ball is Principal Researcher at Microsoft Research, where he manages the Software Reliability Research group. Tom received a Ph.D. from the University of Wisconsin-Madison in 1993, was with Bell Labs from 1993 to 1999, and has been at Microsoft Research since 1999. He is one of the originators of the SLAM project, a software model checking engine for C that forms the basis of the Static Driver Verifier tool. Tom's interests range from program analysis, model checking, testing and automated theorem proving to the problems of defining and measuring software quality.