MSR Challenge Data
This page contains data mined from the GNOME projects that should be helpful to researchers planning to submit to the challenge track. Feel free to use it for other academic purposes as well. This is by no means the only source of data: researchers are encouraged to mine any other data about these projects that they can find. If you would like to contribute data to this page, or you have problems accessing or understanding the data, please contact me (Christian Bird, cabird AT ucdavis DOT edu).
Repository Logs
Easily parsable XML logs for the 500+ GNOME projects hosted at http://svn.gnome.org/svn/ can be downloaded at:
http://2009.msrconf.org/challenge/gnome_data/gnome_svn_xmllogs.tar.bz2
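As a starting point for working with these logs, here is a minimal sketch that counts commits per author. It assumes the files follow the layout produced by `svn log --xml` (a `log` root containing `logentry` elements with `author`, `date`, and `msg` children); the archive's actual layout has not been checked here, and the sample data and the function name `commits_per_author` are made up for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Tiny made-up sample in the layout that `svn log --xml` produces.
# The real log files in the archive are assumed, not confirmed, to match it.
SAMPLE = """<?xml version="1.0"?>
<log>
<logentry revision="3">
<author>alice</author>
<date>2008-01-03T09:30:00.000000Z</date>
<msg>Update translations</msg>
</logentry>
<logentry revision="2">
<author>bob</author>
<date>2008-01-02T17:12:00.000000Z</date>
<msg>Fix crash on startup</msg>
</logentry>
<logentry revision="1">
<author>alice</author>
<date>2008-01-01T08:00:00.000000Z</date>
<msg>Initial import</msg>
</logentry>
</log>"""


def commits_per_author(root):
    """Count <logentry> elements per author under a parsed <log> root."""
    counts = Counter()
    for entry in root.iter("logentry"):
        # A log entry may lack an <author> element, so fall back to a label.
        author = entry.findtext("author", default="(no author)")
        counts[author] += 1
    return counts


counts = commits_per_author(ET.fromstring(SAMPLE))
for author, n in counts.most_common():
    print(author, n)
```

For a file extracted from the archive, `ET.parse(path).getroot()` can be passed to the same function in place of `ET.fromstring(SAMPLE)`.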
Bug Database
An XML dump of the GNOME bug database at http://bugzilla.gnome.org/ can be downloaded at:
http://2009.msrconf.org/challenge/gnome_data/gnome_bugzilla.xml.bz2
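Because the dump is a single bzip2-compressed XML file, it may be more practical to stream it than to load it whole. The sketch below tallies bugs by status while decompressing on the fly. It assumes each bug is a `bug` element with a `bug_status` child, as in Bugzilla's usual XML export; the element names in this particular dump should be checked before relying on it, and the function names and sample data are hypothetical.

```python
import bz2
import io
import xml.etree.ElementTree as ET
from collections import Counter

# Made-up sample; the element names follow Bugzilla's usual XML export
# and are an assumption about the actual dump.
SAMPLE = b"""<bugzilla>
<bug><bug_id>1</bug_id><bug_status>RESOLVED</bug_status></bug>
<bug><bug_id>2</bug_id><bug_status>NEW</bug_status></bug>
<bug><bug_id>3</bug_id><bug_status>RESOLVED</bug_status></bug>
</bugzilla>"""


def count_bug_statuses(stream):
    """Tally <bug_status> values from a binary XML stream, one bug at a time."""
    counts = Counter()
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag == "bug":
            counts[elem.findtext("bug_status", default="(unknown)")] += 1
            # Drop the finished bug's children and text to limit memory growth.
            elem.clear()
    return counts


def count_statuses_in_dump(path):
    """Same tally, reading directly from a .bz2 file without unpacking it."""
    with bz2.open(path, "rb") as stream:
        return count_bug_statuses(stream)


sample_counts = count_bug_statuses(io.BytesIO(SAMPLE))
print(sample_counts)
```

Calling `count_statuses_in_dump` with the path of the downloaded file runs the same tally over the full dump.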
Mirrored Repositories
The complete svn repositories for the 69 Prediction Challenge projects have been mirrored in their entirety. Each archive can be untarred on a local machine and accessed with standard svn clients.
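Once an archive is untarred, a standard svn client can read the repository through a `file://` URL. The following sketch builds such a URL from a local directory and assembles an `svn log` command for it. The directory name is a placeholder, the helper names are hypothetical, and the example assumes the `svn` command-line client is installed.

```python
import subprocess
from pathlib import Path


def svn_log_command(repo_dir, limit=5):
    """Build an `svn log` command for a repository unpacked at repo_dir."""
    # svn clients reach a local repository through a file:// URL.
    url = Path(repo_dir).resolve().as_uri()
    return ["svn", "log", "--xml", "--verbose", "--limit", str(limit), url]


def run_svn_log(repo_dir, limit=5):
    """Run the command and return its XML output (requires the svn client)."""
    result = subprocess.run(
        svn_log_command(repo_dir, limit),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# "unpacked/some-project" is a placeholder for wherever the archive was untarred.
command = svn_log_command("unpacked/some-project")
print(" ".join(command))
```

Only the command is built here; `run_svn_log` would execute it against a real unpacked repository.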
Repository Statistics
The following data is available for the 69 projects used in the Prediction Challenge. We gratefully acknowledge the help of the FLOSSMetrics project at http://www.flossmetrics.org for the use of this data. Due to problems with email harvesters and spammers, we are not posting the mbox archives of the available mailing lists. However, we are working on a way to make these archives available to academic researchers. Please contact me (cabird AT ucdavis DOT edu) if you would like the actual mbox archives of the projects, and we will get them to you privately.
Links to the sources of the raw data can be found on the status page for each project. Links to the GNOME hosted mailing lists and their archives in mbox format can be found on the status pages as well.
Regarding the database schemas, two documents may be useful to participants. The first is the database specification [1], which contains the schemas of the databases (with all the relations, etc.) and explains the meaning of each field of each table. Of the databases described, only the SVN (CVSAnalY) and mailing list archive (MLStats) databases are relevant for participants. The second document [2] contains a brief description of the database (and of its status as of December 2007) and of the website that hosts the data.
[1] http://flossmetrics.org/sections/deliverables/docs/deliverables/WP3/D3.1-Database_Specification.pdf
[2] http://flossmetrics.org/sections/deliverables/docs/deliverables/WP3/D3.2-Database.pdf
Project name | Status page | Database dumps page | Latest SVN db dump | Latest SVN log file | Latest Mailing List Stats db dump |
alacarte | | | | | |
bug-buddy | | | | | |
deskbar-applet | | | | | |
eel | | | | | |
ekiga | | | | | |
eog | | | | | |
epiphany | | | | | |
evince | | | | | |
evolution | | | | | |
evolution-data-server | | | | | |
evolution-exchange | | | | | |
evolution-webcal | | | | | |
fast-user-switch-applet | | | | | |
file-roller | | | | | |
gcalctool | | | | | |
gconf-editor | | | | | |
gdm | | | | | |
gedit | | | | | |
gnome-applets | | | | | |
gnome-backgrounds | | | | | |
gnome-control-center | | | | | |
gnome-desktop | | | | | |
gnome-doc-utils | | | | | |
gnome-games | | | | | |
gnome-icon-theme | | | | | |
gnome-keyring | | | | | |
gnome-keyring-manager | | | | | |
gnome-mag | | | | | |
gnome-media | | | | | |
gnome-menus | | | | | |
gnome-netstatus | | | | | |
gnome-nettool | | | | | |
gnome-panel | | | | | |
gnome-power-manager | | | | | |
gnome-screensaver | | | | | |
gnome-session | | | | | |
gnome-speech | | | | | |
gnome-system-monitor | | | | | |
gnome-system-tools | | | | | |
gnome-terminal | | | | | |
gnome-themes | | | | | |
gnome-user-docs | | | | | |
gnome-utils | | | | | |
gnome-volume-manager | | | | | |
gok | | | | | |
gtk-engines | | | | | |
gtkhtml | | | | | |
gtksourceview | | | | | |
gucharmap | | | | | |
libgail-gnome | | | | | |
libgnomekbd | | | | | |
libgtop | | | | | |
liboobs | | | | | |
librsvg | | | | | |
libsoup | | | | | |
libwnck | | | | | |
metacity | | | | | |
nautilus | | | | | |
nautilus-cd-burner | | | | | |
orca | | | | | |
seahorse | | | | | |
sound-juicer | | | | | |
tomboy | | | | | |
totem | | | | | |
vino | | | | | |
vte | | | | | |
yelp | | | | | |
zenity | | | | | |