Early detection of violating Mobile Apps: A data-driven predictive model approach

Mobile app stores are the key distributors of mobile applications. They regularly apply vetting processes to the deployed apps. Yet, some of these vetting processes might be inadequate or applied late. The late removal of applications might have unpleasant consequences for developers and users alike...

Bibliographic Details
Main Authors: Fadi Mohsen, Dimka Karastoyanova, George Azzopardi
Format: Article
Language: English
Published: Elsevier 2022-12-01
Series: Systems and Soft Computing
Subjects: Third-party apps; Mobile apps; App stores; Actions; Broadcast receivers; Privacy
Online Access: http://www.sciencedirect.com/science/article/pii/S2772941922000114
collection DOAJ
description Mobile app stores are the key distributors of mobile applications. They regularly vet the apps they host, yet some vetting processes may be inadequate or applied late. The late removal of an application can have unpleasant consequences for developers and users alike. In this work, we therefore propose a data-driven predictive approach that determines whether a given app will be removed or accepted. It also reports feature relevance, which helps stakeholders interpret the predictions. In turn, our approach can support developers in improving their apps and users in downloading apps that are less likely to be removed. We focus on the Google App store and compile a new data set of 870,515 applications, 56% of which have been removed from the market. Our proposed approach is a bootstrap aggregation (bagging) of multiple XGBoost machine learning classifiers. We propose two models: a user-centered model using 47 features, and a developer-centered model using 37 features that are available before an app is published. On the test set, we achieve the following areas under the ROC curve (AUCs): user-centered = 0.792, developer-centered = 0.762.
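The description names two techniques: bootstrap aggregation of several classifiers, and evaluation by area under the ROC curve. The following is a minimal sketch of both ideas, not the authors' implementation. It assumes a simple decision-stump learner in place of XGBoost so that it runs with the Python standard library alone, and it uses synthetic two-feature data in place of the app data set. All function names (`auc`, `fit_stump`, `fit_bagging`, `make_data`) are hypothetical.

```python
import random


def auc(labels, scores):
    """Area under the ROC curve: the fraction of (positive, negative)
    pairs that are ranked correctly, counting ties as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one example of each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def fit_stump(X, y):
    """Stand-in weak learner (NOT XGBoost): the single-feature threshold
    rule with the highest training accuracy."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for sign in (1, -1):
                hits = sum((sign * (row[j] - t) > 0) == (label == 1)
                           for row, label in zip(X, y))
                if best is None or hits > best[0]:
                    best = (hits, j, t, sign)
    _, j, t, sign = best
    return lambda row: 1.0 if sign * (row[j] - t) > 0 else 0.0


def fit_bagging(X, y, n_models=15, seed=0):
    """Train each model on a bootstrap resample (drawn with replacement)
    and score a row by the average vote of all models."""
    rng = random.Random(seed)
    n = len(X)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return lambda row: sum(m(row) for m in models) / len(models)


def make_data(n, seed):
    """Synthetic data (an assumption for illustration): the label depends
    on the first feature plus noise; the second feature is a distractor."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        signal, distractor = rng.random(), rng.random()
        X.append([signal, distractor])
        y.append(1 if signal + rng.gauss(0, 0.1) > 0.5 else 0)
    return X, y


if __name__ == "__main__":
    X_train, y_train = make_data(120, seed=1)
    X_test, y_test = make_data(80, seed=2)
    model = fit_bagging(X_train, y_train)
    print("test AUC:", round(auc(y_test, [model(r) for r in X_test]), 3))
```

Averaging the votes gives a graded score rather than a hard label, which is what an AUC computation needs. With a real gradient-boosting library the base learner would return probabilities and the averaging step would stay the same.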
id doaj.art-6ff32a5859814c7e857d657fb8ebb50c
institution Directory Open Access Journal
issn 2772-9419
spelling doaj.art-6ff32a5859814c7e857d657fb8ebb50c 2023-03-21T04:17:24Z eng Elsevier Systems and Soft Computing 2772-9419 2022-12-01 4 200045
Early detection of violating Mobile Apps: A data-driven predictive model approach
Fadi Mohsen (corresponding author), Dimka Karastoyanova, George Azzopardi
Information Systems Group, Bernoulli Institute for Mathematics, Computer Science and Artificial Intelligence, University of Groningen, 9712 CP Groningen, The Netherlands (all three authors)
http://www.sciencedirect.com/science/article/pii/S2772941922000114
Third-party apps; Mobile apps; App stores; Actions; Broadcast receivers; Privacy
topic Third-party apps
Mobile apps
App stores
Actions
Broadcast receivers
Privacy