Intro

STAKE Lab
SofTware And Knowledge Engineering Lab


The SofTware And Knowledge Engineering (STAKE) Lab at the University of Molise is dedicated to advancing the state of the art in software engineering and knowledge engineering.
The STAKE Lab also aims to bring together the different research activities on software engineering and knowledge engineering carried out at the University of Molise, in order to exploit synergies among its members and disseminate the most relevant research results.

The main goal of the STAKE Lab is to improve users' productivity and to develop new methods and tools for supporting decision makers. Specifically, the STAKE Lab is interested in the definition of recommendation systems and decision support systems for different activities related to:

development and maintenance of large software systems;
management of projects and resources in large organizations;
optimization of production processes.

Want to discuss a new project?

Your message was sent successfully! We will be in touch as soon as we can.

Oops... Sorry :-(

Something went wrong. Please try refreshing and submitting the form again.

People

Faculty

DIRECTOR

Complete name

Prof. Rocco Oliveto

Phone number

+39 0874 404159

Academic Role

Associate Professor

Short Bio

Rocco Oliveto is Associate Professor at the University of Molise (Italy), where he is also the Chair of the Computer Science program and the Director of the Laboratory of Computer Science and Scientific Computation (CSSC Lab). He co-authored over 100 papers on topics related to software traceability, software maintenance and evolution, search-based software engineering, and empirical software engineering. His activities span various international software engineering research communities. He has served on the organizing and program committees of several international conferences in the field of software engineering. He was program co-chair of ICPC 2015, TEFSE 2015 and 2009, SCAM 2014, and WCRE 2013 and 2012. He was also a keynote speaker at MUD 2012.

Complete name

Prof. Remo Pareschi

Academic Role

Associate Professor

Short Bio

Remo Pareschi is Associate Professor of Computer Science (tenured 2009) at the Department of Biosciences and Territory, University of Molise (Italy), with lecturing activities on Artificial Intelligence, Database Management Systems, Web Intelligence, and IT Management. His current research focus is the definition and implementation of a computational framework that, by combining symbolic and statistical approaches from artificial intelligence and natural language processing, allows the execution of complex tasks in the coordination and integration of diverse content/knowledge sources, in social computing, and in computer-supported creativity. This research program evolves and complements the one he carried out during the late eighties and the nineties, aimed at using formal logic as the basis for a declarative approach to natural language understanding and to the programming of distributed systems.

Complete name

Prof. Stefano Ricciardi

Phone number

---

Academic Role

Assistant Professor

Short Bio

Stefano Ricciardi received the BSc degree in Computer Science, the MSc degree in Informatics, and the PhD degree from the University of Salerno. He has been co-founder/owner of a videogame development team focused on 3D sports simulations. He is currently an Assistant Professor at the Department of Biosciences of the University of Molise. His main research interests include biometrics, virtual and augmented reality, haptic systems, and human-computer interaction. He is a member of IEEE and GIRPR/IAPR and has authored/co-authored about seventy research papers, including international journal articles, book chapters, and conference proceedings. He serves as an external expert for the Research Executive Agency of the European Commission.

Ph.D. Students

Complete name

Fabio Palomba

Phone number

+39 089 969381

Thesis Topic

Software Quality/Refactoring

Short Bio

Fabio Palomba was born in Naples (Italy) on August 3rd, 1989. He received (magna cum laude) the Master's Degree in Computer Science from the University of Salerno (Italy) in 2013, defending a thesis on Software Quality, advised by Prof. Andrea De Lucia and Dr. Gabriele Bavota. He received his Bachelor's Degree (cum laude) from the University of Molise, with a thesis on Software Systems Refactoring, proposed by Prof. Rocco Oliveto.
He is currently a PhD student at the Department of Management & Information Technology of the University of Salerno, advised by Prof. Andrea De Lucia and Prof. Rocco Oliveto. His research interests include software evolution and maintenance, mining software repositories, and empirical software engineering.

Complete name

Simone Scalabrino

Phone number

+39 0874 404116

Thesis Topic

Testing/Security

Short Bio

Simone Scalabrino was born in Campobasso (Italy) on March 28th, 1991. He received (magna cum laude) a Master's Degree in Computer Science from the University of Salerno (Italy) in 2015, defending a thesis on Search-Based Software Testing, advised by Prof. Andrea De Lucia. He received a Bachelor's Degree (magna cum laude) from the University of Molise in 2013, defending a thesis on source code readability, advised by Prof. Rocco Oliveto and Prof. Denys Poshyvanyk.
He is currently a Ph.D. student at the Department of Biosciences and Territory of the University of Molise, advised by Prof. Rocco Oliveto. His research interests include software security, testing, and quality.

Complete name

Salvatore Geremia

Phone number

+39 0874 404116

Thesis Topic

Software Quality

Short Bio

Salvatore Geremia was born in Campobasso (Italy) on November 4th, 1990. He is a Ph.D. student at the Department of Biosciences and Territory of the University of Molise, advised by Prof. Massimiliano Di Penta. In 2016 he received (magna cum laude) a Master's Degree in Computer Science from the University of Salerno (Italy), defending a thesis on software defect prediction, advised by Prof. Andrea De Lucia. In 2014 he received a Bachelor's Degree (magna cum laude) from the University of Molise, defending a thesis on software effort estimation, advised by Prof. Rocco Oliveto.

Students

Complete name

Stefano Dalla Palma

Degree

---

Graduation Year

---

Currently

Bachelor's Student

Complete name

Ilaria La Torre

Degree

Bachelor's Degree

Graduation Year

2016

Currently

Master's Student

Complete name

Valentina Piantadosi

Degree

Bachelor's Degree

Graduation Year

2016

Currently

Master's Student

External Collaborators


Prof. Francesca Arcelli Fontana
University of Milano-Bicocca
Italy


Prof. Gabriele Bavota
Università della Svizzera Italiana
Switzerland


Prof. Andrea De Lucia
Università degli Studi di Salerno
Italy


Prof. Massimiliano Di Penta
University of Sannio
Italy


Prof. Sonia Haiduc
Florida State University
USA


Prof. Michele Lanza
Università della Svizzera Italiana
Switzerland


Prof. Mario Linares-Vásquez
Universidad de los Andes
Colombia


Prof. Andrian Marcus
The University of Texas at Dallas
USA


Dr. Fiammetta Marulli
Università Federico II di Napoli
Italy


Prof. Laura Moreno
Colorado State University
USA


Dr. Luca Ponzanelli
Università della Svizzera Italiana
Switzerland


Prof. Denys Poshyvanyk
College of William and Mary
USA


Publications

Latest publications

2016

  • Using cohesion and coupling for software remodularization: is it enough?
    I. Candela, G. Bavota, B. Russo and R. Oliveto [To appear]

In Transactions on Software Engineering and Methodology.
    ACM Press. 2016

[ bib ]

    Abstract

    Refactoring, and in particular, remodularization operations can be performed to repair the design of a software system and remove the erosion caused by software evolution. Various approaches have been proposed to support developers during the remodularization of a software system. Most of these approaches are based on the underlying assumption that developers pursue an optimal balance between quality metrics—such as cohesion and coupling—when modularizing the classes of their systems. Thus, a remodularization recommender proposes a solution that implicitly provides a (near) optimal balance between such quality metrics. However, there is still a lack of empirical evidence that such a balance is the desideratum by developers. This paper aims at bridging this gap by analyzing both objectively and subjectively the aforementioned phenomenon. Specifically, we present the results of (i) a large study analyzing the modularization quality, in terms of package cohesion and coupling, of 100 open source systems, and (ii) a survey conducted with 34 developers aimed at understanding the driving factors they consider when performing modularization tasks. The results achieved have been used to distill a set of lessons learned that might be considered to design more effective remodularization recommenders.

  • Turning the IDE into a self-confident programming assistant
L. Ponzanelli, G. Bavota, M. Di Penta, R. Oliveto and M. Lanza [To appear]

    In Empirical Software Engineering Journal.
    Springer Press. 2016

[ bib ]

    Abstract

    Developers often require knowledge beyond the one they possess, which boils down to asking co-workers for help or consulting additional sources of information, such as Application Programming Interfaces (API) documentation, forums, and Q&A websites. However, it requires time and energy to formulate one's problem, peruse and process the results. We propose a novel approach that, given a context in the Integrated Development Environment (IDE), automatically retrieves pertinent discussions from StackOverflow, evaluates their relevance using a multi-faceted ranking model, and, if a given confidence threshold is surpassed, notifies the developer. We have implemented our approach in Prompter, an Eclipse plug-in. Prompter was evaluated in two empirical studies. The first study was aimed at evaluating Prompter's ranking model and involved 33 participants. The second study was conducted with 12 participants and aimed at evaluating Prompter's usefulness when supporting developers during development and maintenance tasks. Since Prompter uses "volatile information" crawled from the web, we also replicated Study I after one year to assess the impact of such a "volatility" on recommenders like Prompter. Our results indicate that (i) Prompter recommendations were positively evaluated in 74% of the cases on average, (ii) Prompter significantly helps developers to improve the correctness of their tasks by 24% on average, but also (iii) 78% of the provided recommendations are ``volatile" and can change at one year of distance. While Prompter revealed to be effective, our studies also point out issues when building recommenders based on information available on online forums.

  • Automatic test case generation: what if test code quality matters?
F. Palomba, A. Panichella, A. Zaidman, R. Oliveto and A. De Lucia

    In Proceedings of the 25th International Symposium on Software Testing and Analysis, ISSTA 2016, Saarbrücken, Germany, July 18-20, 2016
    pp. 130–141, 2016

[ bib | preprint | DOI ]

    Abstract

    Test case generation tools that optimize code coverage have been extensively investigated. Recently, researchers have suggested to add other non-coverage criteria, such as memory consumption or readability, to increase the practical usefulness of generated tests. In this paper, we observe that test code quality metrics, and test cohesion and coupling in particular, are valuable candidates as additional criteria. Indeed, tests with low cohesion and/or high coupling have been shown to have a negative impact on future maintenance activities. In an exploratory investigation we show that most generated tests are indeed affected by poor test code quality. For this reason, we incorporate cohesion and coupling metrics into the main loop of search-based algorithm for test case generation. Through an empirical study we show that our approach is not only able to generate tests that are more cohesive and less coupled, but can (i) increase branch coverage up to 10% when enough time is given to the search and (ii) result in statistically shorter tests.

  • Improving code readability models with textual features
S. Scalabrino, M. Linares-Vásquez, D. Poshyvanyk and R. Oliveto

    In 24th IEEE International Conference on Program Comprehension, ICPC 2016, Austin, TX, USA, May 16-17, 2016
    pp. 1–10, 2016

[ bib | preprint | DOI ]

    Abstract

    Code reading is one of the most frequent activities in software maintenance; before implementing changes, it is necessary to fully understand source code often written by other developers. Thus, readability is a crucial aspect of source code that might significantly influence program comprehension effort. In general, models used to estimate software readability take into account only structural aspects of source code, e.g., line length and a number of comments. However, code is a particular form of text; therefore, a code readability model should not ignore the textual aspects of source code encapsulated in identifiers and comments. In this paper, we propose a set of textual features that could be used to measure code readability. We evaluated the proposed textual features on 600 code snippets manually evaluated (in terms of readability) by 5K+ people. The results show that the proposed features complement classic structural features when predicting readability judgments. Consequently, a code readability model based on a richer set of features, including the ones proposed in this paper, achieves a significantly better accuracy as compared to all the state-of-the-art readability models.

  • A textual-based technique for Smell Detection
F. Palomba, A. Panichella, A. De Lucia, R. Oliveto and A. Zaidman

    In 24th IEEE International Conference on Program Comprehension, ICPC 2016, Austin, TX, USA, May 16-17, 2016
    pp. 1–10, 2016

[ bib | preprint | DOI ]

    Abstract

    In this paper, we present TACO (Textual Analysis for Code Smell Detection), a technique that exploits textual analysis to detect a family of smells of different nature and different levels of granularity. We run TACO on 10 open source projects, comparing its performance with existing smell detectors purely based on structural information extracted from code components. The analysis of the results indicates that TACO’s precision ranges between 67% and 77%, while its recall ranges between 72% and 84%. Also, TACO often outperforms alternative structural approaches confirming, once again, the usefulness of information that can be derived from the textual part of code components.

  • On the diffusion of test smells in automatically generated test code: an empirical study
F. Palomba, D. Di Nucci, A. Panichella, R. Oliveto and A. De Lucia

    In Proceedings of the 9th International Workshop on Search-Based Software Testing, SBST@ICSE 2016, Austin, Texas, USA, May 14-22, 2016
    pp. 5–14, 2016

[ bib | preprint | DOI ]

    Abstract

    The role of software testing in the software development process is widely recognized as a key activity for successful projects. This is the reason why in the last decade several automatic unit test generation tools have been proposed, focusing particularly on high code coverage. Despite the effort spent by the research community, there is still a lack of empirical investigation aimed at analyzing the characteristics of the produced test code. Indeed, while some studies inspected the effectiveness and the usability of these tools in practice, it is still unknown whether test code is maintainable. In this paper, we conducted a large scale empirical study in order to analyze the diffusion of bad design solutions, namely test smells, in automatically generated unit test classes. Results of the study show the high diffusion of test smells as well as the frequent co-occurrence of different types of design problems. Finally we found that all test smells have strong positive correlation with structural characteristics of the systems such as size or number of classes.

  • Too long; didn’t watch!: extracting relevant fragments from software development video tutorials
L. Ponzanelli, G. Bavota, A. Mocci, M. Di Penta, R. Oliveto, M. Hasan, B. Russo, S. Haiduc and M. Lanza

    In Proceedings of the 38th International Conference on Software Engineering, ICSE 2016, Austin, TX, USA, May 14-22, 2016
    pp. 261–272, 2016

[ bib | preprint | DOI ]

    Abstract

    When facing difficulties solving a task at hand, and knowledgeable colleagues are not available, developers resort to offline and online resources, e.g. official documentation, third-party tutorials, mailing lists, and Q&A websites. These, however, need to be found, read, and understood, which takes its toll in terms of time and mental energy. A more immediate and accessible resource are video tutorials found on the web, which in recent years have seen a steep increase in popularity. Nonetheless, videos are an intrinsically noisy data source, and finding the right piece of information might be even more cumbersome than using the previously mentioned resources. We present CodeTube, an approach which mines video tutorials found on the web, and enables developers to query their contents. The video tutorials are processed and split into coherent fragments, to return only fragments related to the query. As an added benefit, the relevant video fragments are complemented with information from additional sources, such as Stack Overflow discussions. The results of two studies to assess CodeTube indicate that video tutorials - if appropriately processed - represent a useful, yet still under-utilized source of information for software development.

  • Release planning of mobile apps based on user reviews
L. Villarroel, G. Bavota, B. Russo, R. Oliveto and M. Di Penta

    In Proceedings of the 38th International Conference on Software Engineering, ICSE 2016, Austin, TX, USA, May 14-22, 2016
    pp. 14–24, 2016

[ bib | preprint | DOI ]

    Abstract

    Developers have to constantly improve their apps by fixing critical bugs and implementing the most desired features in order to gain shares in the continuously increasing and competitive market of mobile apps. A precious source of information to plan such activities is represented by reviews left by users on the app store. However, in order to exploit such information developers need to manually analyze such reviews. This is something not doable if, as frequently happens, the app receives hundreds of reviews per day. In this paper we introduce CLAP (Crowd Listener for releAse Planning), a thorough solution to (i) categorize user reviews based on the information they carry (e.g., bug reporting), (ii) cluster together related reviews (e.g., all reviews reporting the same bug), and (iii) automatically prioritize the clusters of reviews to be implemented when planning the subsequent app release. We evaluated all the steps behind CLAP, showing its high accuracy in categorizing and clustering reviews and the meaningfulness of the recommended prioritizations. Also, given the availability of CLAP as a working tool, we assessed its practical applicability in industrial environments.

  • CodeTube: extracting relevant fragments from software development video tutorials
L. Ponzanelli, G. Bavota, A. Mocci, M. Di Penta, R. Oliveto, B. Russo, S. Haiduc and M. Lanza

    In Proceedings of the 38th International Conference on Software Engineering, ICSE 2016, Austin, TX, USA, May 14-22, 2016 - Companion Volume
    pp. 645–648, 2016

[ bib | preprint | DOI ]

    Abstract

    Nowadays developers heavily rely on sources of informal documentation, including Q&A forums, slides, or video tutorials, the latter being particularly useful to provide introductory notions for a piece of technology. The current practice is that developers have to browse sources individually, which in the case of video tutorials is cumbersome, as they are lengthy and cannot be searched based on their contents. We present CodeTube, a Web-based recommender system that analyzes the contents of video tutorials and is able to provide, given a query, cohesive and self-contained video fragments, along with links to relevant Stack Overflow discussions. CodeTube relies on a combination of textual analysis and image processing applied on video tutorial frames and speech transcripts to split videos into cohesive fragments, index them and identify related Stack Overflow discussions.

  • The Internet of Speaking Things and Its Applications to Cultural Heritage
    F. Marulli, R. Pareschi and D. Baldacci

    In International Conference on Internet of Things and Big Data, Rome, April 2016
    2016

[ bib | preprint | DOI ]

    Abstract

    The initial driver for the development of an Internet of Things (IoT) was to provide an infrastructure capable of turning anything into a sensor that acquires and pours data into the cloud, where they can be aggregated with other data and analysed to extract decision-supportive information. The validity of this initial motivation still stands. However, going in the opposite direction is at least as useful and exciting, by exploiting Internet to make things communicate and speak, thus complementing their capability to sense and listen. In this work we present applications of IoT aimed to support the Cultural Heritage environments, but also suitable for Tourism and Smart Urban environments, that advance the available user-experience based on smart devices via the interaction with speaking things. In the first place we describe a system architecture for speaking things, comprehensive of the basic communication protocols for carrying information to the user as well as of higher-level functionalities for content generation and dialogue management. We then show how this architecture is applied to make artworks speak to people. Finally, we introduce speaking holograms as a yet more advanced and interactive application.


Projects

OCELOT

OCELOT (Optimal Coverage sEarch-based tooL for sOftware Testing) is a new test suite generation tool for C programs implemented in Java. Unlike previous tools for C programs, OCELOT automatically detects the input types of a given C function without requiring any specification of parameters. In addition, the tool handles the different data types of C, including structs and pointers, and it is able to produce test suites based on the Check unit testing framework.
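OCELOT's actual implementation is not reproduced here, but the core idea of search-based test generation can be illustrated with a deliberately simplified sketch (in Python for brevity, with a hypothetical function under test; the real tool targets C functions and emits Check-based test suites): sample candidate inputs and keep only those that cover a branch outcome not yet exercised.

```python
import random

def classify(a, b):
    # Hypothetical function under test, with two decision points.
    if a > b:
        return "greater"
    if a == b:
        return "equal"
    return "less"

def covered_branches(a, b):
    # Record which branch outcomes (decision, taken?) an input exercises.
    branches = {("a>b", a > b)}
    if not a > b:
        branches.add(("a==b", a == b))
    return branches

def random_search(budget=1000, seed=42):
    # Toy search loop: sample random inputs, keep an input only if it
    # covers a new branch outcome, stop once all 4 outcomes are covered.
    rng = random.Random(seed)
    suite, covered = [], set()
    for _ in range(budget):
        a, b = rng.randint(-10, 10), rng.randint(-10, 10)
        new = covered_branches(a, b) - covered
        if new:
            suite.append((a, b))
            covered |= new
        if len(covered) == 4:
            break
    return suite, covered
```

A real search-based generator such as OCELOT uses guided metaheuristics rather than pure random sampling, but the fitness signal (newly covered branches) plays the same role as the `new` set above.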

ARIES

ARIES (Automated Refactoring In EclipSe). During software evolution, change is the rule rather than the exception. Unfortunately, such changes are usually performed by developers who, due to strict deadlines, do not have enough time to make sure that every change conforms to OOP guidelines, such as minimizing coupling and maximizing cohesion of classes. Such careless design solutions often lead to design antipatterns, which negatively impact the quality of a software system, making its maintenance difficult (due to the effort required to comprehend the source code) and dangerous (since empirical studies showed that classes with low quality are more error-prone than other classes). Refactoring operations are needed to remove such antipatterns from the source code. However, the identification of such operations is not trivial and might be time-consuming. The ARIES project aims at supporting several refactoring operations in Eclipse, such as Extract Class, Extract Package, Move Method, and Move Class.
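Refactoring recommenders of this kind typically rely on class-level cohesion metrics to flag candidates. As an illustration only (this is a textbook LCOM-style measure and invented example data, not ARIES's actual internals), low cohesion can be estimated from how little a class's methods share its attributes:

```python
from itertools import combinations

def lcom(methods):
    """LCOM-style lack of cohesion: count method pairs sharing no
    attribute (P) versus pairs sharing at least one (Q), and return
    max(P - Q, 0). Higher values suggest an Extract Class candidate.

    methods: dict mapping method name -> set of attributes it accesses.
    """
    p = q = 0
    for m1, m2 in combinations(methods.values(), 2):
        if m1 & m2:
            q += 1  # pair shares at least one attribute
        else:
            p += 1  # disjoint pair: a cohesion "miss"
    return max(p - q, 0)

# A cohesive class: all methods touch overlapping attributes.
cohesive = {"getX": {"x"}, "setX": {"x"}, "scale": {"x", "factor"}}

# A "blob": three unrelated responsibilities in one class.
blob = {"parse": {"buffer"}, "render": {"canvas"}, "log": {"file"}}
```

Here `lcom(cohesive)` is 0 while `lcom(blob)` is 3, so the blob would be flagged as a candidate for an Extract Class refactoring.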

Awards


  • ICPC 2016
    ACM SIGSOFT Distinguished Paper Award
    Improving Code Readability Models with Textual Features
    S. Scalabrino, M. Linares-Vasquez, D. Poshyvanyk, R. Oliveto

    24th International Conference on Program Comprehension

    Code reading is one of the most frequent activities in software maintenance; before implementing changes, it is necessary to fully understand source code often written by other developers. Thus, readability is a crucial aspect of source code that might significantly influence program comprehension effort. In general, models used to estimate software readability take into account only structural aspects of source code, e.g., line length and a number of comments. However, code is a particular form of text; therefore, a code readability model should not ignore the textual aspects of source code encapsulated in identifiers and comments. In this paper, we propose a set of textual features that could be used to measure code readability. We evaluated the proposed textual features on 600 code snippets manually evaluated (in terms of readability) by 5K+ people. The results show that the proposed features complement classic structural features when predicting readability judgments. Consequently, a code readability model based on a richer set of features, including the ones proposed in this paper, achieves a significantly better accuracy as compared to all the state-of-the-art readability models.

  • ICSE 2015
    ACM SIGSOFT Distinguished Paper Award
    When and Why Your Code Starts to Smell Bad
    M. Tufano, F. Palomba, G. Bavota, R. Oliveto, M. Di Penta, A. De Lucia, and D. Poshyvanyk

    37th International Conference on Software Engineering

    In past and recent years, the issues related to managing technical debt received significant attention by researchers from both industry and academia. There are several factors that contribute to technical debt. One of these is represented by code bad smells, i.e. symptoms of poor design and implementation choices. While the repercussions of smells on code quality have been empirically assessed, there is still only anecdotal evidence on when and why bad smells are introduced. To fill this gap, we conducted a large empirical study over the change history of 200 open source projects from different software ecosystems and investigated when bad smells are introduced by developers, and the circumstances and reasons behind their introduction. Our study required the development of a strategy to identify smell-introducing commits, the mining of over 0.5M commits, and the manual analysis of 9,164 of them (i.e. those identified as smell-introducing). Our findings mostly contradict common wisdom stating that smells are being introduced during evolutionary tasks. In the light of our results, we also call for the need to develop a new generation of recommendation systems aimed at properly planning smell refactoring activities.

  • ESEC FSE 2015
    ACM SIGSOFT Distinguished Paper Award
    Optimizing Energy Consumption of GUIs in Android Apps: A Multi-objective Approach
    M. Linares-Vasquez, G. Bavota, C. Bernal-Cardenas, R. Oliveto, M. Di Penta, D. Poshyvanyk

    10th Joint Meeting of the European Software Engineering Conference and the 23rd ACM SIGSOFT Symposium on the Foundations of Software Engineering

    The wide diffusion of mobile devices has motivated research towards optimizing energy consumption of software systems - including apps - targeting such devices. Besides efforts aimed at dealing with various kinds of energy bugs, the adoption of Organic Light-Emitting Diode (OLED) screens has motivated research towards reducing energy consumption by choosing an appropriate color palette. Whilst past research in this area aimed at optimizing energy while keeping an acceptable level of contrast, this paper proposes an approach, named GEMMA (Gui Energy Multi-objective optiMization for Android apps), for generating color palettes using a multi-objective optimization technique, which produces color solutions optimizing energy consumption and contrast while using consistent colors with respect to the original color palette. An empirical evaluation that we performed on 25 Android apps demonstrates not only significant improvements in terms of the three different objectives, but also confirmed that in most cases users still perceived the choices of colors as attractive. Finally, for several apps we interviewed the original developers, who in some cases expressed the intent to adopt the proposed choice of color palette, whereas in other cases pointed out directions for future improvements.

  • ASE 2013
    ACM SIGSOFT Distinguished Paper Award
    Detecting Bad Smells in Source Code Using Change History Information
    F. Palomba, G. Bavota, M. Di Penta, R. Oliveto, A. De Lucia, D. Poshyvanyk

    28th IEEE/ACM International Conference on Automated Software Engineering

    Code smells represent symptoms of poor implementation choices. Previous studies found that these smells make source code more difficult to maintain, possibly also increasing its fault-proneness. There are several approaches that identify smells based on code analysis techniques. However, we observe that many code smells are intrinsically characterized by how code elements change over time. Thus, relying solely on structural information may not be sufficient to detect all the smells accurately. We propose an approach to detect five different code smells, namely Divergent Change, Shotgun Surgery, Parallel Inheritance, Blob, and Feature Envy, by exploiting change history information mined from versioning systems. We applied the approach, coined as HIST (Historical Information for Smell deTection), to eight software projects written in Java, and wherever possible compared with existing state-of-the-art smell detectors based on source code analysis. The results indicate that HIST's precision ranges between 61% and 80%, and its recall ranges between 61% and 100%. More importantly, the results confirm that HIST is able to identify code smells that cannot be identified through approaches solely based on code analysis.

  • SCAM 2012
    Best Paper Award
    When does a Refactoring Induce Bugs? An Empirical Study
    G. Bavota, B. De Carluccio, A. De Lucia, M. Di Penta, R. Oliveto, O. Strollo

    12th IEEE International Working Conference on Source Code Analysis and Manipulation

    Refactorings are - as defined by Fowler - behavior preserving source code transformations. Their main purpose is to improve maintainability or comprehensibility, or also reduce the code footprint if needed. In principle, refactorings are defined as simple operations so that they are "unlikely to go wrong" and introduce faults. In practice, refactoring activities could have their risks, as other changes. This paper reports an empirical study carried out on three Java software systems, namely Apache Ant, Xerces, and ArgoUML, aimed at investigating to what extent refactoring activities induce faults. Specifically, we automatically detect (and then manually validate) 15,008 refactoring operations (of 52 different kinds) using an existing tool (Ref-Finder). Then, we use the SZZ algorithm to determine whether it is likely that refactorings induced a fault. Results indicate that, while some kinds of refactorings are unlikely to be harmful, others, such as refactorings involving hierarchies (e.g., pull up method), tend to induce faults very frequently. This suggests more accurate code inspection or testing activities when such specific refactorings are performed.

  • ICPC 2011
    Best Paper Award
    Improving IR-based Traceability Recovery Using Smoothing Filters
    A. De Lucia, M. Di Penta, R. Oliveto, A. Panichella, S. Panichella

    19th International Conference on Program Comprehension

    Information Retrieval methods have been largely adopted to identify traceability links based on the textual similarity of software artifacts. However, noise due to word usage in software artifacts might negatively affect the recovery accuracy. We propose the use of smoothing filters to reduce the effect of noise in software artifacts and improve the performances of traceability recovery methods. An empirical evaluation performed on two repositories indicates that the usage of a smoothing filter is able to significantly improve the performances of Vector Space Model and Latent Semantic Indexing. Such a result suggests that other than being used for traceability recovery the proposed filter can be used to improve performances of various other software engineering approaches based on textual analysis.

  • ICSM ERA 2010
    Best Paper Award
    Physical and Conceptual Identifier Dispersion: Measures and Relation to Fault Proneness
    V. Arnaoudova, L. Eshkevari, R. Oliveto, Y.-G. Guéhéneuc, G. Antoniol

    26th IEEE International Conference on Software Maintenance - ERA Track

    Poorly-chosen identifiers have been reported in the literature as misleading and increasing the program comprehension effort. Identifiers are composed of terms, which can be dictionary words, acronyms, contractions, or simple strings. We conjecture that the use of identical terms in different contexts may increase the risk of faults. We investigate our conjecture using a measure combining term entropy and term context coverage to study whether certain terms increase the odds ratios of methods to be fault-prone. Entropy measures the physical dispersion of terms in a program: the higher the entropy, the more scattered across the program the terms. Context coverage measures the conceptual dispersion of terms: the higher their context coverage, the more unrelated the methods using them. We compute term entropy and context coverage of terms extracted from identifiers in Rhino 1.4R3 and ArgoUML 0.16. We show statistically that methods containing terms with high entropy and context coverage are more fault-prone than others.


Contact

Contact us