Call for Papers
The International Conference on Mining Software Repositories (MSR) has hosted a mining challenge since 2006. With this challenge, we call upon everyone interested to apply their tools to a common data set and thereby bring research and industry closer together. The challenge dares researchers and practitioners to put their mining tools and approaches to the test.
This year, the challenge is on Enriched Event Streams, a public data set for empirical studies on in-IDE activities of software developers. The data set contains over 11M events that correspond to 15K hours of working time of 81 developers. We have collected the events using FeedBaG, a general-purpose interaction tracker for Visual Studio that is built as a plugin on top of the JetBrains ReSharper framework. FeedBaG captures all commands invoked in the IDE, together with additional context information, and stores them in an Enriched Event Stream that provides a holistic picture of the in-IDE development process.
Enriched Event Streams can help answer, for example, the following research questions:
- Which IDE commands do developers use?
- How are test cases executed?
- Does refactoring lead to more failed tests?
- How do developers navigate the code base?
- What kind of changes do developers revert?
How to Participate in the Challenge
- Familiarize yourself with the CARET platform and the data set.
- Study the preprint of our challenge proposal, which contains details about the data set.
- Study our challenge website to get to know the platform and the data set. Try out our tutorials.
- Check out our discussion of different application scenarios for which we have already used CARET.
- Access and analyze the Enriched Event Streams data set (download the newest dataset).
- Use our mailing list to ask questions about the dataset and follow the KaVE project on Twitter for updates.
- Report your findings in a four-page document.
- Submit your report on or before February 5, 2018.
- If your report is accepted, present your results at MSR 2018!
Challenge Data
Our March 1, 2017 release contains 11M interaction events that have been uploaded by a diverse group of 81 developers (developers that contributed fewer than 2,500 events have already been filtered out). Out of these developers, 43 come from industry, three are researchers, five are students, and six are hobby programmers. Twenty-four participants did not provide this (optional) information about their position. The data covers a total of 1,527 aggregated days and was collected over eleven months, but not all developers participated for the entire time. On average, each developer provided 136K events (median 54K) that have been collected over 18.9 days (median 10 days) and that represent 185 hours of active work (median 48 hours). In total, the data set aggregates 15K hours of development work.
Enriched Event Streams provide detailed context information about code completion, test execution, and source-code evolution. The data set contains detailed data about 200K usages of the code completion, including a snapshot of the surrounding source code, as well as 3.6K test executions. An average user provided 2.5K usages of the code completion (median 640) and 44 test executions.
We provide an API for both Java and C# that allows reading the data, and we have created examples in both languages to help you get started. Technically, the data set stores a JSON representation of the collected events, so it can also be read and processed using other languages.
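For orientation, here is a minimal sketch in plain Java that bypasses our APIs and reads the raw JSON representation directly. It assumes a hypothetical local archive name (events.zip), assumes that each archive entry holds one JSON-serialized event, and requires Gson (2.8.6 or newer) on the classpath; it makes no assumptions about the event schema and simply counts the entries it can parse.

import com.google.gson.JsonElement;
import com.google.gson.JsonParser;

import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.Enumeration;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

public class CountEvents {
    public static void main(String[] args) throws Exception {
        int numEvents = 0;
        // "events.zip" is a placeholder path; adjust it to wherever you store the download.
        try (ZipFile zip = new ZipFile("events.zip")) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                // Assumption: every .json entry contains exactly one serialized event.
                if (entry.isDirectory() || !entry.getName().endsWith(".json")) {
                    continue;
                }
                try (InputStreamReader reader = new InputStreamReader(
                        zip.getInputStream(entry), StandardCharsets.UTF_8)) {
                    // Parse into a generic JSON tree; no event schema is assumed here.
                    JsonElement event = JsonParser.parseReader(reader);
                    if (event.isJsonObject()) {
                        numEvents++;
                    }
                }
            }
        }
        System.out.println("events read: " + numEvents);
    }
}

For actual analyses, prefer the provided Java or C# API and the accompanying examples, which take care of the archive layout and event deserialization for you.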
If you used the Enriched Event Streams data set, please cite our challenge proposal:
@inproceedings{msr18challenge,
title={Enriched Event Streams: A General Dataset for Empirical Studies on In-IDE Activities of Software Developers},
author={Proksch, Sebastian and Amann, Sven and Nadi, Sarah},
year={2018},
booktitle={Proceedings of the 15th Working Conference on Mining Software Repositories},
preprint={http://www.st.informatik.tu-darmstadt.de/artifacts/msr18-challenge/MSR-Challenge-Proposal.pdf}
}
Challenge Report
The challenge report should describe the results of your work by providing an introduction to the problem you address and why it is worth studying, the version of the data set you used, the approach and tools you used, your results and their implications, and conclusions. Make sure your report highlights the contributions and the importance of your work. We appreciate submissions that make reproducing their results easy, for example by providing (possibly external) replication instructions and open-sourcing additionally created tools.
Challenge reports must be at most 4 pages long and must conform at time of submission to the MSR 2018 Format and Submission Guidelines. Similar to the main track, the Challenge reports will undergo a light-weight double-blind review process. Therefore, the submitted paper must not reveal the authors’ identities. In particular, the names, organizations, and number of authors must not be present, and a reasonable effort should be made to blind externally available material. The identifying information may be re-added, in case of acceptance, in the camera-ready paper.
Submission
Submit your challenge report (maximum 4 pages) to EasyChair on or before February 5, 2018. Please submit your challenge reports to the “Mining Challenge Track”. Papers submitted for consideration should not have been published elsewhere and should not be under review or submitted for review elsewhere during the duration of consideration. ACM plagiarism policies and procedures shall be followed for cases of double submission.
Submissions should follow ACM formatting guidelines and should be submitted using the EasyChair link.
Upon notification of acceptance, all authors of accepted papers will be asked to complete an ACM Copyright form and will receive further instructions for preparing their camera ready versions. At least one author of each paper is expected to present the results at the MSR 2018 conference. All accepted contributions will be published in the conference electronic proceedings.
Important Dates
- Papers Due: 23:59 AoE, February 5, 2018
- Author Notification: 23:59 AoE, March 2, 2018
- Camera Ready: 23:59 AoE, March 16, 2018
Organization
Program Committee Chairs
- Sebastian Proksch, University of Zurich, Switzerland
- Sven Amann, Technische Universität Darmstadt, Germany
- Sarah Nadi, University of Alberta, Canada