MSR 2018
Mon 28 - Tue 29 May 2018 Gothenburg, Sweden
co-located with ICSE 2018
Tue 29 May 2018 12:08 - 12:15 at E4 room - Machine Learning for SE Chair(s): Alexander Serebrenik

Natural language processing (NLP) is gaining popularity in software engineering. To apply NLP correctly, we must pre-process textual information to separate natural language from other content, such as log messages, that is often part of software engineering communication. We present a simple approach for classifying whether a textual input is natural language or not. Although our NLoN package relies on only 11 language features and character tri-grams, it achieves an area under the ROC curve between 0.976 and 0.987 on three different data sources, using Lasso regression from Glmnet as the learner and two human raters to provide ground truth. Cross-source prediction performance is lower and fluctuates more, with the best areas under the ROC curve ranging from 0.913 to 0.980. Compared with prior work, our approach offers similar performance but is considerably more lightweight, making it easier to apply in software engineering text mining pipelines. Our source code and data are provided as an R package for further improvement.
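
To make the feature idea in the abstract concrete, here is a minimal sketch of how character tri-grams and a handful of surface features could be extracted from a line of text. It is written in Python for illustration only and is not the NLoN package's R code. The function names and the five surface features shown are hypothetical stand-ins, since the abstract does not list the package's 11 features, and the Lasso learner is left out.

```python
from collections import Counter
import re


def char_trigrams(text: str) -> Counter:
    """Count overlapping character tri-grams in the lower-cased text."""
    s = text.lower()
    # Strings shorter than three characters yield an empty counter.
    return Counter(s[i:i + 3] for i in range(len(s) - 2))


def surface_features(text: str) -> dict:
    """Illustrative hand-crafted features (hypothetical, not the package's 11)."""
    n = max(len(text), 1)  # avoid division by zero on empty input
    return {
        "ratio_caps": sum(c.isupper() for c in text) / n,
        "ratio_special": sum(
            not c.isalnum() and not c.isspace() for c in text
        ) / n,
        "n_words": len(text.split()),
        "ends_with_punct": float(text.rstrip().endswith((".", "!", "?"))),
        "has_code_chars": float(bool(re.search(r"[{};=<>]", text))),
    }


def featurize(text: str) -> dict:
    """Merge surface features and tri-gram counts into one sparse feature dict."""
    feats = surface_features(text)
    for gram, count in char_trigrams(text).items():
        feats["tri:" + gram] = float(count)
    return feats
```

A feature dictionary like this could then be passed to any L1-regularised (Lasso-style) classifier, so that most tri-gram weights shrink to zero and only the informative ones remain.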

Tue 29 May
Times are displayed in time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna

11:00 - 12:30
Machine Learning for SE (Technical Papers) at E4 room
Chair(s): Alexander Serebrenik (Eindhoven University of Technology)
11:00
17m
Full-paper
Data-Driven Search-based Software Engineering
Technical Papers
A: Vivek Nair, A: Amritanshu Agrawal (North Carolina State University), A: Jianfeng Chen, A: Wei Fu, A: George Mathew, A: Tim Menzies (North Carolina State University), A: Leandro Minku, A: Markus Wagner, A: Zhe Yu
11:17
17m
Full-paper
The Open-Closed Principle of Modern Machine Learning Frameworks
Technical Papers
A: Houssem Ben Braiek, A: Foutse Khomh (Polytechnique Montréal), A: Bram Adams (MCIS, École Polytechnique de Montréal)
11:34
17m
Full-paper
A Benchmark Study on Sentiment Analysis for Software Engineering Research
Technical Papers
A: Nicole Novielli (University of Bari), A: Daniela Girardi, A: Filippo Lanubile (University of Bari)
11:51
17m
Full-paper
A Deep Learning Approach to Identifying Source Code in Images and Video
Technical Papers
A: Jordan Ott, A: Abigail Atchison (Chapman University), A: Paul Harnack, A: Adrienne Bergh, A: Erik Linstead (Chapman University)
12:08
7m
Short-paper
Natural Language or Not (NLoN) - package for Software Engineering Text Analysis Pipeline
Technical Papers
A: Mika Mäntylä (University of Oulu), A: Fabio Calefato (University of Bari), A: Maëlick Claes
12:15
15m
Other
Discussion phase
Technical Papers