CROP: Linking Code Reviews to Source Code Changes
Code review has been widely adopted by both industrial and open source software development communities. Research in code review is highly dependent on real-world data. Although researchers have attempted to provide code review datasets, no existing dataset links code reviews with complete versions of the system’s code base, mainly because reviewed versions are not kept in the system’s version control repository. We therefore present CROP, the Code Review Open Platform, the first curated code review repository that links review data with isolated complete versions (snapshots) of the source code at the time of review. CROP currently provides data for 8 software systems, 48,975 reviews, and 112,617 patches, including versions of the systems that are inaccessible in the systems’ original repositories. Moreover, CROP is extensible, and it will be continuously curated and extended.
Tue 29 May. Times are displayed in time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna
11:00 - 11:06 | Short-paper | 50K-C: A dataset of compilable, and compiled, Java projects | Data Showcase | Pedro Martins (University of California at Irvine, USA), Crista Lopes (University of California Irvine), Rohan Achar
11:06 - 11:12 | Short-paper | JBench: A Dataset of Data Races for Concurrency Testing | Data Showcase | Jian Gao (School of Software, Tsinghua University), Xin Yang, Yu Jiang, Han Liu, Weiliang Ying, Xian Zhang
11:12 - 11:18 | Short-paper | Bugs.jar: A Large-scale, Diverse Dataset of Real-world Java Bugs | Data Showcase | Ripon Saha, Yingjun Lyu (University of Southern California), Wing Lam (University of Illinois at Urbana-Champaign), Hiroaki Yoshida (Fujitsu Laboratories of America, Inc.), Mukul Prasad (Fujitsu Laboratories of America)
11:18 - 11:24 | Short-paper | A Gold Standard for Emotion Annotation in Stack Overflow | Data Showcase | Nicole Novielli (University of Bari), Fabio Calefato (University of Bari), Filippo Lanubile (University of Bari)
11:24 - 11:30 | Short-paper | Vulinoss: A Dataset of Security Vulnerabilities in Open-source Systems | Data Showcase | Antonios Gkortzis (Athens University of Economics and Business), Dimitris Mitropoulos, Diomidis Spinellis (Athens University of Economics and Business)
11:30 - 11:36 | Short-paper | A Dataset of Duplicate Pull-requests in GitHub | Data Showcase | Zhixing Li (College of Computer, National University of Defense Technology, Changsha, China), Yue Yu (National University of Defense Technology), Gang Yin (National University of Defense Technology), Tao Wang (National University of Defense Technology), Huaimin Wang
11:36 - 11:42 | Short-paper | Structured Information on State and Evolution of Dockerfiles on GitHub | Data Showcase
11:42 - 11:48 | Short-paper | A Graph-based Dataset of Commit History of Real-World Android apps | Data Showcase | Franz-Xaver Geiger, Ivano Malavolta (Vrije Universiteit Amsterdam), Luca Pascarella (Delft University of Technology), Fabio Palomba, Dario Di Nucci (Vrije Universiteit Brussel), Alberto Bacchelli (University of Zurich)
11:48 - 11:54 | Short-paper | Public Git Archive: a Big Code dataset for all | Data Showcase
11:54 - 12:00 | Short-paper | Word Embeddings for the Software Engineering Domain | Data Showcase | Vasiliki Efstathiou (Athens University of Economics and Business), Christos Chatzilenas, Diomidis Spinellis (Athens University of Economics and Business)
12:00 - 12:06 | Short-paper | npm-miner: An Infrastructure for Measuring the Quality of the npm Registry | Data Showcase | Kyriakos Chatzidimitriou (Aristotle University of Thessaloniki), Michail Papamichail, Themistoklis Diamantopoulos (Electrical and Computer Engineering Dept, Aristotle University of Thessaloniki), Michail Tsapanos, Andreas Symeonidis
12:06 - 12:12 | Short-paper | CROP: Linking Code Reviews to Source Code Changes | Data Showcase | Matheus Paixao (University College London), Jens Krinke (University College London), DongGyun Han (University College London), Mark Harman (Facebook and University College London)
12:12 - 12:18 | Short-paper | Developer Interaction Traces backed by IDE Screen Recordings from Think-aloud Sessions | Data Showcase | Aiko Yamashita (Oslo Metropolitan University), Fabio Petrillo (Concordia University), Foutse Khomh (Polytechnique Montréal), Yann-Gaël Guéhéneuc (Concordia University and Polytechnique Montréal)
12:18 - 12:24 | Short-paper | A Multi-level Dataset of Linux Kernel Patchwork | Data Showcase
12:24 - 12:30 | Short-paper | Documented Unix Facilities Over 48 Years | Data Showcase