Abstract:
Code review is an effective method for estimating defectiveness in source code. However, existing methods either depend
on experts or are inefficient. In this paper, we improve the performance
(in terms of speed and memory usage) of our existing code review
assisting tool, CRUSO. The central idea of the approach is to estimate the defectiveness of an input source code using the
defectiveness scores of similar code fragments present in various
StackOverflow (SO) posts.
The significant contributions of our paper are (i) SOpostsDB: a
dataset containing Paragraph Vector Algorithm (PVA) vectors and the SO post information, and
(ii) CRUSO-P: a code review assisting system based on PVA models trained on SOpostsDB. Given an input source code, CRUSO-P labels it as one of {Likely to be defective, Unlikely to be
defective, Unpredictable}. To develop CRUSO-P, we processed
more than 3 million SO posts and 188,200+ GitHub source files. CRUSO-P is
designed to work with source code written in the popular programming languages {C, C#, Java, JavaScript, and Python}.
CRUSO-P outperforms CRUSO, improving response time by 97.82% and
reducing storage by 99.15%. CRUSO-P achieves
its highest mean accuracy, 99.6%, when tested with the C
programming language, an improvement of 5.6%
over the existing method.
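
The central idea summarized above, labeling an input from the defectiveness scores of similar code fragments, can be illustrated with a minimal sketch. It is not the CRUSO-P implementation. The function name `label_code`, the cosine similarity measure, the similarity-weighted average, both thresholds, and the convention that a higher score means more defective are all assumptions made for illustration. The vectors are toy values standing in for PVA vectors.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def label_code(query_vec, posts, min_similarity=0.8, score_threshold=0.0):
    """Label a query vector using the scores of sufficiently similar posts.

    posts: list of (vector, defectiveness_score) pairs (hypothetical layout).
    min_similarity and score_threshold are illustrative values, not taken
    from the paper; min_similarity is assumed to be positive.
    """
    # Keep only posts whose vectors are close enough to the query.
    similar = []
    for vec, score in posts:
        sim = cosine(query_vec, vec)
        if sim >= min_similarity:
            similar.append((sim, score))

    # No similar fragment found: the sketch declines to predict.
    if not similar:
        return "Unpredictable"

    # Similarity-weighted mean of the neighbours' defectiveness scores.
    total_sim = sum(sim for sim, _ in similar)
    weighted = sum(sim * score for sim, score in similar) / total_sim

    if weighted > score_threshold:
        return "Likely to be defective"
    return "Unlikely to be defective"


# Toy stand-ins for SOpostsDB entries: (vector, defectiveness score).
toy_posts = [
    ([1.0, 0.0], 1.0),
    ([0.9, 0.1], 0.5),
    ([0.0, 1.0], -1.0),
]

print(label_code([1.0, 0.05], toy_posts))
print(label_code([0.0, 1.0], toy_posts))
print(label_code([-1.0, -1.0], toy_posts))
```

In this sketch the third label arises only when no stored fragment passes the similarity cutoff. Whether CRUSO-P decides "Unpredictable" in the same way is not stated in the abstract.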