dc.description.abstract |
An important problem in software development is identifying defects in a given source file and, when
defects are found, determining their properties. Assessing the defectiveness of source code matters because of
its implications for software development and maintenance costs.
We present a novel system to estimate the presence of defects in source code and detect attributes of the
possible defects, such as their severity. The salient elements of our system are: (i) a dataset of newly
introduced source code metrics, called PROgramming CONstruct (PROCON) metrics, and (ii) a novel Machine
Learning (ML)-based system, called Defect Estimator for Source Code (DESCo), that uses the PROCON
dataset to predict defectiveness in a given scenario. The dataset was created by processing 30,400+ source
files written in four popular programming languages, viz., C, C++, Java, and Python.
The results of our experiments show that the DESCo system outperforms one of the state-of-the-art methods
with an improvement of 44.9%. To verify the correctness of our system, we compared the performance of
12 different ML algorithms with 50+ different combinations of their key parameters. Our system achieves the
best results with the SVM technique, with a mean accuracy of 80.8%. |
en_US |