Abstract
The market for mobile apps is growing every day and is expected to be worth over 100 billion dollars in 2020. To have a chance to succeed in such a competitive environment, developers need to build and maintain high-quality apps, rapidly fixing any bug and continuously astonishing their users with the coolest new features. Peculiar to the mobile apps market is the user review mechanism: users can assign a rating to the apps they download and express their feelings while using them. Such reviews are mainly designed as a feedback mechanism among users, aimed at recommending to each other which apps are worth downloading. However, they also contain precious information for software developers, reporting bugs (e.g., unexpected behaviours) and recommending new features to implement. To exploit such a source of information, developers are supposed to manually read the user reviews, something simply not doable when hundreds of reviews per day are collected (as is usual for popular apps). To help developers deal with such a task, we developed CLAP (Crowd Listener for releAse Planning), a web application able to (i) categorize user reviews based on the information they convey (e.g., reports of security issues), (ii) cluster together related reviews (e.g., all those reporting the same bug), and (iii) prioritize the clusters of reviews to be implemented when planning the subsequent app release. We evaluated all the steps behind CLAP, showing its high accuracy in categorizing and clustering reviews and the meaningfulness of the recommended prioritizations. Also, given the availability of CLAP as a working tool, we assessed its practical applicability in industrial environments.
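To give an idea of the categorize, cluster, and prioritize flow described above, here is a minimal, hypothetical sketch. The keyword rule, the one-cluster-per-category grouping, and the size-based ordering are placeholders chosen purely to illustrate the data flow; they are not CLAP's actual classification, clustering, or prioritization techniques.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the three-step flow (categorize -> cluster -> prioritize).
// The heuristics below are illustrative only, not CLAP's actual approach.
public class ClapFlowSketch {

    public static void main(String[] args) {
        List<String> reviews = List.of(
                "The app crashes when I open the camera",
                "Please add a dark mode",
                "Crash on startup after the last update");

        // Step 1: assign each review to a category (naive keyword rule).
        Map<String, List<String>> byCategory = new HashMap<>();
        for (String review : reviews) {
            String category = review.toLowerCase().contains("crash")
                    ? "functional bug report" : "suggestion for new feature";
            byCategory.computeIfAbsent(category, k -> new ArrayList<>()).add(review);
        }

        // Step 2: cluster related reviews (here, trivially one cluster per category).
        // Step 3: prioritize clusters, e.g., larger clusters first.
        byCategory.entrySet().stream()
                .sorted((a, b) -> b.getValue().size() - a.getValue().size())
                .forEach(e -> System.out.println(
                        e.getKey() + " (" + e.getValue().size() + " reviews): " + e.getValue()));
    }
}
```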
Errata
There is an error in the results concerning RQ3 (prioritization accuracy). Because of a bug in the code we used to compute the results, some clusters were ignored (11/211, i.e., 5.5%). We report here the corrected versions of Table 3 (total number of clusters and high/low-priority clusters for each category used to answer RQ3) and Table 7 (RQ3: prioritization accuracy). We would like to thank Mashael Etaiwi for helping us spot this issue.
Table 3. Total number of clusters and high/low-priority clusters for each category used to answer RQ3.

| Category | Total | Low priority | High priority |
| --- | ---: | ---: | ---: |
| Functional bug report | 114 | 13 | 101 |
| Sugg. new feature | 78 | 11 | 67 |
| Report of security issues | 4 | 0 | 4 |
| Report of performance problems | 6 | 1 | 5 |
| Report of excessive energy consumption | 2 | 0 | 2 |
| Report for usability improvements | 7 | 2 | 5 |
| Total | 211 | 27 | 184 |
Table 7. RQ3: Prioritization accuracy.

| Category | Accuracy | False positives | False negatives |
| --- | ---: | ---: | ---: |
| Functional bug report | 91% | 5% | 4% |
| Sugg. new feature | 81% | 10% | 9% |
| Non-functional | 79% | 10% | 10% |
Raw Data
- RQ1: the 3,000 user reviews (raw CSV file).
- RQ1: the 3,000 user reviews, already preprocessed into the ARFF format, which can be loaded into the WEKA tool (use RandomForest to replicate the CLAP results; see the sketch after this list).
- RQ1: accuracy achieved by using different machine learners (all implemented in WEKA).
- RQ2: the 160 user reviews classified as bug reports/suggestions for new features, used to evaluate the CLAP clustering feature.
- RQ2: the clustering oracles manually built by the three industrial developers (includes a README file explaining how to interpret them).
- RQ3: the CSV file containing all the reviews from all the apps considered in RQ3, with information about category, cluster, and priority (corrected version).
- RQ3: the prioritization recommended by CLAP. The ARFF file can be loaded into the WEKA tool (use RandomForest to replicate the CLAP results) (corrected version).
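For reference, the following is a minimal sketch of how one of the ARFF files above can be fed to WEKA's Random Forest from the Java API. The file name `reviews-rq1.arff`, the position of the class attribute, and the 10-fold cross-validation setup are assumptions made for illustration; adapt them to the actual files in the package and to the evaluation protocol described in the paper.

```java
import weka.classifiers.Evaluation;
import weka.classifiers.trees.RandomForest;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

import java.util.Random;

public class ReplicateClap {
    public static void main(String[] args) throws Exception {
        // Load the preprocessed reviews; "reviews-rq1.arff" is a placeholder
        // name for the ARFF file distributed in this package.
        Instances data = new DataSource("reviews-rq1.arff").getDataSet();
        // Assumption: the class attribute (review category) is the last attribute.
        data.setClassIndex(data.numAttributes() - 1);

        // Random Forest, the learner indicated above for replicating CLAP's results.
        RandomForest rf = new RandomForest();

        // 10-fold cross-validation; the exact validation setup may differ
        // from the one used in the paper.
        Evaluation eval = new Evaluation(data);
        eval.crossValidateModel(rf, data, 10, new Random(1));
        System.out.println(eval.toSummaryString());
        System.out.println(eval.toClassDetailsString());
    }
}
```

The same steps apply to the RQ3 ARFF file (prioritization): load it, set the class attribute, and evaluate RandomForest on it.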