STAKE Lab: SofTware And Knowledge Engineering Lab


The main mission of the SofTware And Knowledge Engineering (STAKE) Lab is to advance the state of the art in software engineering and knowledge engineering, to improve users' productivity, and to develop new methods and tools for supporting decision makers. The current range of research and development topics pursued at the STAKE Lab includes (but is not limited to) the following:

decision support systems in the context of development and evolution of large software systems;
machine learning based applications in healthcare systems;
image analysis and understanding;
AR/VR based interactive environments;
blockchain and distributed ledger technology to support industrial processes.

People

Faculty

FOUNDER

Complete name

Prof. Rocco Oliveto

Academic Role

Full Professor

Personal Homepage

List of publications

dblp

Short Bio

Rocco Oliveto is a Full Professor at the University of Molise (Italy), where he is also the Chair of the Computer Science Bachelor and Master programs. He has co-authored over 150 papers on topics related to software traceability, software maintenance and evolution, search-based software engineering, and empirical software engineering. His activities span various international software engineering research communities. He has served as an organizing and program committee member of several international conferences in the field of software engineering. He was program co-chair of ICPC 2015, TEFSE 2015 and 2009, SCAM 2014, and WCRE 2013 and 2012. He is also a co-founder and the CEO of datasound, a spin-off of the University of Molise aiming to efficiently exploit the priceless heritage that can be extracted from big data analysis.

CO-DIRECTOR

Complete name

Prof. Remo Pareschi

Academic Role

Associate Professor

Personal Homepage

List of publications

dblp

Short Bio

Remo Pareschi is an Associate Professor of Computer Science (tenured 2009) at the Department of Biosciences and Territory, University of Molise (Italy), where he lectures on Artificial Intelligence, Database Management Systems, Web Intelligence, and IT Management. His current research focuses on defining and implementing a computational framework that combines symbolic and statistical approaches from artificial intelligence and natural language processing to carry out complex tasks in three domains: the coordination and integration of diverse content and knowledge sources, social computing, and computer-supported creativity. This research program evolves and complements the one he carried out during the late eighties and the nineties, which aimed at using formal logic as the basis for a declarative approach to natural language understanding and to the programming of distributed systems.

CO-DIRECTOR

Complete name

Prof. Stefano Ricciardi

Academic Role

Assistant Professor

Personal Homepage

List of publications

dblp

Short Bio

Stefano Ricciardi received the BSc degree in Computer Science, the MSc degree in Informatics, and the PhD degree from the University of Salerno. He was co-founder and owner of a videogame development team focused on 3D sports simulations. He is currently an Assistant Professor at the Department of Biosciences of the University of Molise. His main research interests include biometrics, virtual and augmented reality, haptic systems, and human-computer interaction. He is a member of IEEE and GIRPR/IAPR and has authored or co-authored about seventy research papers, including international journal articles, book chapters, and conference proceedings. He serves as an external expert for the Research Executive Agency of the European Commission.

MEMBER

Complete name

Prof. Simone Scalabrino

Academic Role

Research Fellow

Personal Homepage

List of publications

dblp

Short Bio

Simone Scalabrino is a research fellow at the University of Molise, Italy. He received (magna cum laude) a Master's Degree in Computer Science from the University of Salerno (Italy) in 2015, advised by Andrea De Lucia, and the Ph.D. degree in Computer Science from the University of Molise in 2019, advised by Rocco Oliveto. His research interests include software quality, testing and security. He received three ACM SIGSOFT Distinguished Paper awards, at ICPC 2016, ASE 2017, and MSR 2019. He served as Program Co-Chair of the ERA-Track at ICPC 2022, Co-Organizer of the AeSIR Workshop (co-located with ASE 2021), and as Local Arrangement Co-Chair for SANER 2018. He has served as a program committee member for several Software Engineering conferences such as ASE 2021 and ICPC 2022, and he regularly serves as a reviewer for several international journals (e.g., TSE, TOSEM, EMSE, JSS). He is co-founder and CSO of datasound, a spin-off of the University of Molise.

Post-Doc

Complete name

Dr. Gennaro Laudato

Main Research Interests

Health Informatics, Biometrics, Machine Learning, DSP

Personal Homepage

List of publications

dblp

Short Bio

Gennaro Laudato is currently a Post-Doctoral Researcher at the University of Molise and an IT specialist at INPS. He received a PhD (with Excellent mark) in Computer Science and Data Science from the University of Molise (Italy) in April 2021, advised by Prof. Rocco Oliveto. He also received (magna cum laude) a Master's Degree in Electronic Engineering for Automation and Telecommunications from the University of Sannio (Italy) in 2017, advised by Prof. Massimiliano Di Penta, and a Bachelor's Degree in Computer Science Engineering from the University of Sannio in 2013, advised by Prof. Luca De Vito. For two years he worked as a QA Engineer at Octo Telematics in Rome. Afterwards, he taught Computer Science in high schools. He was the recipient of a best paper award at HEALTHINF 2020. He has served as a reviewer for the journal Measurement (Elsevier) and for the IEEE I2MTC and USBEREIT conferences.

Ph.D. Students

Complete name

Valentina Piantadosi

Thesis Topic

Code Readability

Personal Homepage

List of publications

dblp

Short Bio

Valentina Piantadosi was born in Isernia (Italy) on November 1st, 1993. She received (magna cum laude) a Master's Degree in Software System Security from the University of Molise (Italy) in 2018, defending a thesis on Software Reliability and Testing, advised by Prof. Rocco Oliveto. She received a Bachelor's Degree in Computer Science from the University of Molise in 2016, defending a thesis on Software Refactoring, advised by Prof. Rocco Oliveto. She is currently a Ph.D. student at the Department of Biosciences and Territory of the University of Molise, advised by Prof. Rocco Oliveto. Her research interests include Vulnerability Detection, Testing, and Machine Learning.

Complete name

Giovanni Rosa

Thesis Topic

Continuous Integration and Continuous Deployment

Personal Homepage

List of publications

dblp

Short Bio

Giovanni Rosa received the bachelor’s degree in Computer Science from the University of Molise, defending a thesis entitled “Are Developers Good in Code Review?”, advised by Prof. Rocco Oliveto and co-advised by Prof. Jens Krinke from University College London. Afterwards, he received the master’s degree in Software Systems Security, also from the University of Molise, defending a thesis entitled “Evaluating SZZ Implementations Through a Developer-informed Oracle”, advised by Prof. Rocco Oliveto and co-advised by Prof. Gabriele Bavota from the Università della Svizzera Italiana and Dr. Simone Scalabrino from the University of Molise. During the master’s degree, he also obtained a scholarship to work on a research project about machine learning techniques for the automatic analysis of biomedical data.

Complete name

Emanuela Guglielmi

Thesis Topic

Recommender systems for video game developers

Personal Homepage

List of publications

dblp

Short Bio

Emanuela Guglielmi was born in Campobasso (Italy) on March 31st, 1997. She received a Master's Degree in Software Systems Security from the University of Molise (Italy) in 2021, defending a thesis on Software Reliability and Testing entitled “Generative Grammars and Deep Learning for Testing Voice User Interfaces”, advised by Prof. Rocco Oliveto and Mr. Giovanni Rosa. She is currently a Ph.D. student at the Department of Biosciences and Territory of the University of Molise, advised by Prof. Simone Scalabrino. Her research interests include automated testing and recommender systems for complex systems (e.g., virtual assistants and video games).

Students

Complete name

Jonathan Simeone

Degree

Bachelor's Degree

Graduation Year

2019

Currently

Master's Student

Complete name

Federico Zappone

Degree

Bachelor's Degree

Graduation Year

2020

Currently

Master's Student

Alumni


Dr. Salvatore Geremia
Research Fellow
University of Sannio
Italy


Dr. Fabio Palomba
Assistant Professor
University of Salerno
Italy

External Collaborators


Prof. Francesca Arcelli Fontana
University of Milan
Italy


Prof. Gabriele Bavota
Università della Svizzera Italiana
Switzerland


Prof. Andrea De Lucia
University of Salerno
Italy


Prof. Massimiliano Di Penta
University of Sannio
Italy


Prof. Sonia Haiduc
Florida State University
USA


Prof. Michele Lanza
Università della Svizzera Italiana
Switzerland


Prof. Mario Linares-Vásquez
Universidad de los Andes
Colombia


Prof. Andrian Marcus
The University of Texas at Dallas
USA


Dr. Fiammetta Marulli
University of Naples "Federico II"
Italy


Prof. Laura Moreno
Colorado State University
USA


Prof. Michele Nappi
University of Salerno
Italy


Prof. Denys Poshyvanyk
College of William and Mary
USA

Publications

  • Listening to the Crowd for the Release Planning of Mobile Apps.

    Simone Scalabrino, Gabriele Bavota, Barbara Russo, Massimiliano Di Penta, Rocco Oliveto
    IEEE Transactions on Software Engineering (Volume: 45, Issue: 1, January 1 2019)

  • Toward a Smell-Aware Bug Prediction Model.

    Fabio Palomba, Marco Zanoni, Francesca Arcelli Fontana, Andrea De Lucia, Rocco Oliveto
    IEEE Transactions on Software Engineering (Volume: 45, Issue: 2, February 1 2019)

  • Automatic Identification and Classification of Software Development Video Tutorial Fragments.

    Luca Ponzanelli, Gabriele Bavota, Andrea Mocci, Massimiliano Di Penta, Sonia Haiduc, Barbara Russo, Michele Lanza
    IEEE Transactions on Software Engineering (Volume: 45, Issue: 5, May 1 2019)

  • Sentiment analysis for software engineering: how far can we go?

    Bin Lin, Fiorella Zampetti, Gabriele Bavota, Massimiliano Di Penta, Michele Lanza, Rocco Oliveto
    2018 40th International Conference on Software Engineering (ICSE)

  • A Developer Centered Bug Prediction Model

    Dario Di Nucci, Fabio Palomba, Giuseppe De Rosa, Gabriele Bavota, Rocco Oliveto, Andrea De Lucia
    IEEE Transactions on Software Engineering (Volume: 44, Issue: 1, January 1 2018)

  • The Scent of a Smell: An Extensive Comparison Between Textual and Structural Smells

    Fabio Palomba, Annibale Panichella, Andy Zaidman, Rocco Oliveto, Andrea De Lucia
    IEEE Transactions on Software Engineering (Volume: 44, Issue: 10, October 1 2018)

  • Supporting software developers with a holistic recommender system.

    Luca Ponzanelli, Simone Scalabrino, Gabriele Bavota, Andrea Mocci, Rocco Oliveto, Massimiliano Di Penta, Michele Lanza
    2017 39th International Conference on Software Engineering (ICSE)

  • ARENA: An Approach for the Automated Generation of Release Notes.

    Laura Moreno, Gabriele Bavota, Massimiliano Di Penta, Rocco Oliveto, Andrian Marcus, Gerardo Canfora
    IEEE Transactions on Software Engineering (Volume: 43, Issue: 2, February 1 2017)

  • When and Why Your Code Starts to Smell Bad (and Whether the Smells Go Away).

    Michele Tufano, Fabio Palomba, Gabriele Bavota, Rocco Oliveto, Massimiliano Di Penta, Andrea De Lucia, Denys Poshyvanyk
    IEEE Transactions on Software Engineering (Volume: 43, Issue: 11, November 1 2017)

  • Release planning of mobile apps based on user reviews

    Lorenzo Villarroel, Gabriele Bavota, Barbara Russo, Rocco Oliveto, Massimiliano Di Penta
    2016 38th International Conference on Software Engineering (ICSE)

  • Too long; didn’t watch!: extracting relevant fragments from software development video tutorials

    Luca Ponzanelli, Gabriele Bavota, Andrea Mocci, Massimiliano Di Penta, Rocco Oliveto, Mir Anamul Hasan, Barbara Russo, Sonia Haiduc, Michele Lanza
    2016 38th International Conference on Software Engineering (ICSE)

  • When and Why Your Code Starts to Smell Bad.

    Michele Tufano, Fabio Palomba, Gabriele Bavota, Rocco Oliveto, Massimiliano Di Penta, Andrea De Lucia, Denys Poshyvanyk
    2015 37th International Conference on Software Engineering (ICSE)

  • How Can I Use This Method?

    Laura Moreno, Gabriele Bavota, Massimiliano Di Penta, Rocco Oliveto, Andrian Marcus
    2015 37th International Conference on Software Engineering (ICSE)

  • Improving Multi-Objective Test Case Selection by Injecting Diversity in Genetic Algorithms.

    Annibale Panichella, Rocco Oliveto, Massimiliano Di Penta, Andrea De Lucia
    IEEE Transactions on Software Engineering (Volume: 41, Issue: 4, April 1 2015)

  • The Impact of API Change- and Fault-Proneness on the User Ratings of Android Apps

    Gabriele Bavota, Mario Linares Vásquez, Carlos Eduardo Bernal-Cárdenas, Massimiliano Di Penta, Rocco Oliveto, Denys Poshyvanyk
    IEEE Transactions on Software Engineering (Volume: 41, Issue: 4, April 1 2015)

  • Mining Version Histories for Detecting Code Smells.

    Fabio Palomba, Gabriele Bavota, Massimiliano Di Penta, Rocco Oliveto, Denys Poshyvanyk, Andrea De Lucia
    IEEE Transactions on Software Engineering (Volume: 41, Issue: 5, May 1 2015)

  • REPENT: Analyzing the Nature of Identifier Renamings.

    Venera Arnaoudova, Laleh Mousavi Eshkevari, Massimiliano Di Penta, Rocco Oliveto, Giuliano Antoniol, Yann-Gaël Guéhéneuc
    IEEE Transactions on Software Engineering (Volume: 41, Issue: 5, May 1 2015)

  • Methodbook: Recommending Move Method Refactorings via Relational Topic Models.

    Gabriele Bavota, Rocco Oliveto, Malcom Gethers, Denys Poshyvanyk, Andrea De Lucia
    IEEE Transactions on Software Engineering (Volume: 40, Issue: 7, July 1 2014)

  • How to effectively use topic models for software engineering tasks? An approach based on genetic algorithms.

    Annibale Panichella, Bogdan Dit, Rocco Oliveto, Massimiliano Di Penta, Denys Poshyvanyk, Andrea De Lucia
    2013 35th International Conference on Software Engineering (ICSE)

  • Improving Source Code Lexicon via Traceability and Information Retrieval.

    Andrea De Lucia, Massimiliano Di Penta, Rocco Oliveto
    IEEE Transactions on Software Engineering (Volume: 37, Issue: 2, February 1 2011)


Projects

OCELOT

OCELOT (Optimal Coverage sEarch-based tooL for sOftware Testing) is a new test suite generation tool for C programs implemented in Java. Unlike previous tools for C programs, OCELOT automatically detects the input types of a given C function without requiring any specification of parameters. In addition, the tool handles the different data types of C, including structs and pointers and it is able to produce test suites based on the Check unit testing framework.

ARIES

ARIES (Automated Refactoring In EclipSe). During software evolution, change is the rule rather than the exception. Unfortunately, such changes are usually performed by developers who, due to strict deadlines, do not have enough time to make sure that every change conforms to OOP guidelines, such as minimizing coupling and maximizing cohesion of classes. Such careless design solutions often lead to design antipatterns, which negatively impact the quality of a software system. The presence of antipatterns makes the maintenance of a software system difficult and expensive (due to the effort required to comprehend the source code) and dangerous (since empirical studies showed that classes with low quality are more error-prone than other classes). Refactoring operations are needed to remove such antipatterns from the source code. However, identifying such operations is not trivial and can be time-consuming. The ARIES project aims at supporting several refactoring operations in Eclipse, such as Extract Class, Extract Package, Move Method, and Move Class.

ATTICUS

ATTICUS (Ambient-intelligent Tele-monitoring and Telemetry for Incepting & Catering over hUman Sustainability) is a tele-service and remote monitoring system for ambient-assisted living based on the analysis of vital and behavioural parameters. ATTICUS finds fertile ground in all contexts where there is a small number of digitized services and home help for the citizen, and for those categories of individuals who tend to be more at risk, such as elderly people or people with disabilities. The aim of the ATTICUS project is to develop an intelligent hardware/software system that can constantly monitor an individual and report anomalies affecting both the health status (through the analysis of vital parameters) and the behaviour, detected through the monitoring and analysis of the movements that the person performs in carrying out his/her activities. The core component is a Smart Wearable: a t-shirt made of innovative fabrics, embedding a data acquisition system (integrated into the fabric) that can measure vital parameters. The electronic device is capable of analysing user movements both at home and outdoors, of processing and storing the acquired data locally and, whenever possible, of transmitting them in real time via a wireless connection to a home station (ambient intelligence device) or a monitoring station.

CLAP

CLAP (Crowd Listener for releAse Planning) is a tool designed to support developers in timely addressing several kinds of problems that users report in their reviews on app markets. CLAP provides a web-interface through which developers can easily handle the reviews. As a first step, the tool automatically labels each review as a bug report, a feature request, a performance-, security-, energy-, usability-related issue, or "other" (i.e., non-informative review). Then, to reduce the cost of manually reading all the reviews, it clusters the ones that regard the same issue and it provides some keywords for each cluster. Finally, it prioritizes the clusters, showing in red the critical issues that should be addressed when planning the subsequent app release.
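The three steps described above (label, cluster, prioritize) can be illustrated with a toy sketch. This is not CLAP's implementation: the keyword lists, the stop-word set, the Jaccard-similarity threshold, and the size-based ranking are all hypothetical stand-ins chosen only to make the pipeline shape concrete.

```python
from collections import Counter

# Hypothetical keyword lists; the classifier CLAP actually uses is not described here.
CATEGORY_KEYWORDS = {
    "bug report": {"crash", "crashes", "bug", "error", "freezes"},
    "feature request": {"add", "wish", "please", "feature"},
    "performance": {"slow", "lag", "laggy"},
    "security": {"password", "privacy", "hacked"},
    "energy": {"battery", "drain", "drains"},
    "usability": {"confusing", "unintuitive"},
}
# Hypothetical stop words ignored when comparing reviews.
STOPWORDS = {"a", "an", "the", "i", "it", "is", "to", "and", "when", "my", "app"}


def tokenize(text):
    """Lowercase the text and strip surrounding punctuation from each word."""
    words = (w.strip(".,!?;:").lower() for w in text.split())
    return [w for w in words if w]


def label_review(text):
    """Step 1: return the first category whose keywords appear, else 'other'."""
    tokens = set(tokenize(text))
    for category, keywords in CATEGORY_KEYWORDS.items():
        if tokens & keywords:
            return category
    return "other"


def jaccard(a, b):
    """Jaccard similarity between two token sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_reviews(reviews, threshold=0.2):
    """Step 2: greedily group same-label reviews whose tokens overlap enough."""
    clusters = []
    for text in reviews:
        label = label_review(text)
        tokens = set(tokenize(text)) - STOPWORDS
        for cluster in clusters:
            if cluster["label"] == label and jaccard(tokens, cluster["tokens"]) >= threshold:
                cluster["reviews"].append(text)
                cluster["tokens"] |= tokens
                break
        else:
            clusters.append({"label": label, "reviews": [text], "tokens": set(tokens)})
    return clusters


def cluster_keywords(cluster, k=3):
    """Return up to k of the most frequent non-stop-word tokens in a cluster."""
    counts = Counter(
        t for review in cluster["reviews"] for t in tokenize(review) if t not in STOPWORDS
    )
    return [token for token, _ in counts.most_common(k)]


def prioritize(clusters):
    """Step 3: drop non-informative clusters and rank the rest by size."""
    informative = [c for c in clusters if c["label"] != "other"]
    return sorted(informative, key=lambda c: len(c["reviews"]), reverse=True)


if __name__ == "__main__":
    sample = [
        "App crashes when I open the camera",
        "Crashes when I open the camera app",
        "Battery drains fast",
        "Please add dark mode",
        "Great app",
    ]
    for c in prioritize(cluster_reviews(sample)):
        print(c["label"], len(c["reviews"]), cluster_keywords(c))
```

Ranking by cluster size is only a placeholder for the prioritization step; any scoring function over clusters could be substituted in `prioritize`.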

Awards

  • ICPC 2020
    Most Influential Paper (MIP) Award
    On the Equivalence of Information Retrieval Methods for Automated Traceability Link Recovery: A Ten-Year Retrospective
    R. Oliveto, M. Gethers, D. Poshyvanyk, A. De Lucia

    28th International Conference on Program Comprehension

    At ICPC 2010 we presented an empirical study to statistically analyze the equivalence of several traceability recovery methods based on Information Retrieval (IR) techniques. We experimented with the Vector Space Model (VSM), Latent Semantic Indexing (LSI), the Jensen-Shannon (JS) method, and Latent Dirichlet Allocation (LDA). Unlike previous empirical studies, we did not compare the different IR-based traceability recovery methods using only the usual precision and recall metrics. We introduced some metrics to analyze the overlap of the sets of candidate links recovered by each method. We also based our analysis on Principal Component Analysis (PCA) to analyze the orthogonality of the methods. The results showed that while the accuracy of LDA was lower than that of previously used methods, LDA was able to capture some information missed by the other IR methods. In contrast, JS, VSM, and LSI were almost equivalent. This paved the way to a possible integration of IR-based traceability recovery methods. Our paper was one of the first to experiment with LDA for traceability recovery. Also, the overlap metrics and PCA have been used later to compare and possibly integrate different recommendation approaches, not only for traceability recovery but also for other reverse engineering and software maintenance tasks, such as code smell detection, design pattern detection, and bug prediction.

  • MSR 2019
    ACM SIGSOFT Distinguished Paper Award
    Data-Driven Solutions to Detect API Compatibility Issues in Android: An Empirical Study
    S. Scalabrino, G. Bavota, M. Linares-Vasquez, M. Lanza, R. Oliveto

    16th International Conference on Mining Software Repositories

    Android apps are inextricably linked to the official Android APIs. Such a strong form of dependency implies that changes introduced in new versions of the Android APIs can severely impact the apps’ code, for example because of deprecated or removed APIs. In reaction to those changes, mobile app developers are expected to adapt their code and avoid compatibility issues. To support developers, approaches have been proposed to automatically identify API compatibility issues in Android apps. The state-of-the-art approach, named CID, is a data-driven solution learning how to detect those issues by analyzing the changes in the history of Android APIs (“API side” learning). While it can successfully identify compatibility issues, it cannot recommend coding solutions. We devised an alternative data-driven approach, named ACRYL. ACRYL learns from changes implemented in other apps in response to API changes (“client side” learning). This allows not only to detect compatibility issues, but also to suggest a fix. When empirically comparing the two tools, we found that there is no clear winner, since the two approaches are highly complementary, in that they identify almost disjointed sets of API compatibility issues. Our results point to the future possibility of combining the two approaches, trying to learn detection/fixing rules on both the API and the client side.

  • ASE 2017
    ACM SIGSOFT Distinguished Paper Award
    Automatically Assessing Code Understandability: How Far Are We?
    S. Scalabrino, G. Bavota, C. Vendome, M. Linares-Vasquez, D. Poshyvanyk, R. Oliveto

    32nd IEEE/ACM International Conference on Automated Software Engineering

    Program understanding plays a pivotal role in software maintenance and evolution: a deep understanding of code is the stepping stone for most software-related activities, such as bug fixing or testing. Being able to measure the understandability of a piece of code might help in estimating the effort required for a maintenance activity, in comparing the quality of alternative implementations, or even in predicting bugs. Unfortunately, there are no existing metrics specifically designed to assess the understandability of a given code snippet. In this paper, we perform a first step in this direction, by studying the extent to which several types of metrics computed on code, documentation, and developers correlate with code understandability. To perform such an investigation we ran a study with 46 participants who were asked to understand eight code snippets each. We collected a total of 324 evaluations aiming at assessing the perceived understandability, the actual level of understanding, and the time needed to understand a code snippet. Our results demonstrate that none of the (existing and new) metrics we considered is able to capture code understandability, not even the ones assumed to assess quality attributes strongly related with it, such as code readability and complexity.

  • ICPC 2016
    ACM SIGSOFT Distinguished Paper Award
    Improving Code Readability Models with Textual Features
    S. Scalabrino, M. Linares-Vasquez, D. Poshyvanyk, R. Oliveto

    24th International Conference on Program Comprehension

    Code reading is one of the most frequent activities in software maintenance; before implementing changes, it is necessary to fully understand source code often written by other developers. Thus, readability is a crucial aspect of source code that might significantly influence program comprehension effort. In general, models used to estimate software readability take into account only structural aspects of source code, e.g., line length and a number of comments. However, code is a particular form of text; therefore, a code readability model should not ignore the textual aspects of source code encapsulated in identifiers and comments. In this paper, we propose a set of textual features that could be used to measure code readability. We evaluated the proposed textual features on 600 code snippets manually evaluated (in terms of readability) by 5K+ people. The results show that the proposed features complement classic structural features when predicting readability judgments. Consequently, a code readability model based on a richer set of features, including the ones proposed in this paper, achieves a significantly better accuracy as compared to all the state-of-the-art readability models.

  • ICSE 2015
    ACM SIGSOFT Distinguished Paper Award
    When and Why Your Code Starts to Smell Bad
    M. Tufano, F. Palomba, G. Bavota, R. Oliveto, M. Di Penta, A. De Lucia, and D. Poshyvanyk

    37th International Conference on Software Engineering

    In past and recent years, the issues related to managing technical debt received significant attention by researchers from both industry and academia. There are several factors that contribute to technical debt. One of these is represented by code bad smells, i.e. symptoms of poor design and implementation choices. While the repercussions of smells on code quality have been empirically assessed, there is still only anecdotal evidence on when and why bad smells are introduced. To fill this gap, we conducted a large empirical study over the change history of 200 open source projects from different software ecosystems and investigated when bad smells are introduced by developers, and the circumstances and reasons behind their introduction. Our study required the development of a strategy to identify smell-introducing commits, the mining of over 0.5M commits, and the manual analysis of 9,164 of them (i.e. those identified as smell-introducing). Our findings mostly contradict common wisdom stating that smells are being introduced during evolutionary tasks. In the light of our results, we also call for the need to develop a new generation of recommendation systems aimed at properly planning smell refactoring activities.

  • ESEC/FSE 2015
    ACM SIGSOFT Distinguished Paper Award
    Optimizing Energy Consumption of GUIs in Android Apps: A Multi-objective Approach
    M. Linares-Vasquez, G. Bavota, C. Bernal-Cardenas, R. Oliveto, M. Di Penta, D. Poshyvanyk

    10th Joint Meeting of the European Software Engineering Conference and the 23rd ACM SIGSOFT Symposium on the Foundations of Software Engineering

    The wide diffusion of mobile devices has motivated research towards optimizing energy consumption of software systems - including apps - targeting such devices. Besides efforts aimed at dealing with various kinds of energy bugs, the adoption of Organic Light-Emitting Diode (OLED) screens has motivated research towards reducing energy consumption by choosing an appropriate color palette. Whilst past research in this area aimed at optimizing energy while keeping an acceptable level of contrast, this paper proposes an approach, named GEMMA (Gui Energy Multi-objective optiMization for Android apps), for generating color palettes using a multi-objective optimization technique, which produces color solutions optimizing energy consumption and contrast while using consistent colors with respect to the original color palette. An empirical evaluation that we performed on 25 Android apps demonstrates not only significant improvements in terms of the three different objectives, but also confirmed that in most cases users still perceived the choices of colors as attractive. Finally, for several apps we interviewed the original developers, who in some cases expressed the intent to adopt the proposed choice of color palette, whereas in other cases pointed out directions for future improvements.

  • ASE 2013
    ACM SIGSOFT Distinguished Paper Award
    Detecting Bad Smells in Source Code Using Change History Information
    F. Palomba, G. Bavota, M. Di Penta, R. Oliveto, A. De Lucia, D. Poshyvanyk

    28th IEEE/ACM International Conference on Automated Software Engineering

    Code smells represent symptoms of poor implementation choices. Previous studies found that these smells make source code more difficult to maintain, possibly also increasing its fault-proneness. There are several approaches that identify smells based on code analysis techniques. However, we observe that many code smells are intrinsically characterized by how code elements change over time. Thus, relying solely on structural information may not be sufficient to detect all the smells accurately. We propose an approach to detect five different code smells, namely Divergent Change, Shotgun Surgery, Parallel Inheritance, Blob, and Feature Envy, by exploiting change history information mined from versioning systems. We applied our approach, coined HIST (Historical Information for Smell deTection), to eight software projects written in Java, and wherever possible compared it with existing state-of-the-art smell detectors based on source code analysis. The results indicate that HIST's precision ranges between 61% and 80%, and its recall ranges between 61% and 100%. More importantly, the results confirm that HIST is able to identify code smells that cannot be identified through approaches solely based on code analysis.

  • SCAM 2012
    Best Paper Award
    When does a Refactoring Induce Bugs? An Empirical Study
    G. Bavota, B. De Carluccio, A. De Lucia, M. Di Penta, R. Oliveto, O. Strollo

    12th IEEE International Working Conference on Source Code Analysis and Manipulation

    Refactorings are, as defined by Fowler, behavior-preserving source code transformations. Their main purpose is to improve maintainability or comprehensibility, or also to reduce the code footprint if needed. In principle, refactorings are defined as simple operations so that they are "unlikely to go wrong" and introduce faults. In practice, refactoring activities can carry risks, like any other change. This paper reports an empirical study carried out on three Java software systems, namely Apache Ant, Xerces, and ArgoUML, aimed at investigating to what extent refactoring activities induce faults. Specifically, we automatically detect (and then manually validate) 15,008 refactoring operations (of 52 different kinds) using an existing tool (Ref-Finder). Then, we use the SZZ algorithm to determine whether it is likely that refactorings induced a fault. Results indicate that, while some kinds of refactorings are unlikely to be harmful, others, such as refactorings involving hierarchies (e.g., pull up method), tend to induce faults very frequently. This suggests more accurate code inspection or testing activities when such specific refactorings are performed.

  • ICPC 2011
    Best Paper Award
    Improving IR-based Traceability Recovery Using Smoothing Filters
    A. De Lucia, M. Di Penta, R. Oliveto, A. Panichella, S. Panichella

    19th International Conference on Program Comprehension

    Information Retrieval methods have been largely adopted to identify traceability links based on the textual similarity of software artifacts. However, noise due to word usage in software artifacts might negatively affect the recovery accuracy. We propose the use of smoothing filters to reduce the effect of noise in software artifacts and improve the performances of traceability recovery methods. An empirical evaluation performed on two repositories indicates that the usage of a smoothing filter is able to significantly improve the performances of Vector Space Model and Latent Semantic Indexing. Such a result suggests that other than being used for traceability recovery the proposed filter can be used to improve performances of various other software engineering approaches based on textual analysis.

  • ICSM ERA 2010
    Best Paper Award
    Physical and Conceptual Identifier Dispersion: Measures and Relation to Fault Proneness
    V. Arnaoudova, L. Eshkevari, R. Oliveto, Y.-G. Guéhéneuc, G. Antoniol

    26th IEEE International Conference on Software Maintenance - ERA Track

    Poorly-chosen identifiers have been reported in the literature as misleading and increasing the program comprehension effort. Identifiers are composed of terms, which can be dictionary words, acronyms, contractions, or simple strings. We conjecture that the use of identical terms in different contexts may increase the risk of faults. We investigate our conjecture using a measure combining term entropy and term context coverage to study whether certain terms increase the odds ratios of methods to be fault-prone. Entropy measures the physical dispersion of terms in a program: the higher the entropy, the more scattered across the program the terms. Context coverage measures the conceptual dispersion of terms: the higher their context coverage, the more unrelated the methods using them. We compute term entropy and context coverage of terms extracted from identifiers in Rhino 1.4R3 and ArgoUML 0.16. We show statistically that methods containing terms with high entropy and context coverage are more fault-prone than others.


Contact

Contact us