Improving Tax Audit Efficiency Using Machine Learning: The Role of Taxpayer’s Network Data in Fraud Detection


Bibliographic Details
Main Authors: Vardan Baghdasaryan, Hrant Davtyan, Arsine Sarikyan, Zaruhi Navasardyan
Format: Article
Language: English
Published: Taylor & Francis Group 2022-12-01
Series: Applied Artificial Intelligence
Online Access: http://dx.doi.org/10.1080/08839514.2021.2012002
author Vardan Baghdasaryan
Hrant Davtyan
Arsine Sarikyan
Zaruhi Navasardyan
collection DOAJ
description Using the universe of Armenian business taxpayers operating under a standard tax regime, we develop a fraud prediction model based on machine learning tools, with gradient boosting as the primary choice. Having to deal with broadly defined fraud and heterogeneous taxpayers, as well as a relatively small sample, we successfully derive important features from tax returns with a minimum of additional information. Among the important fraud predictors, we identify historical fraud and audits, the share of administrative costs, and external economic activity. We see two main contributions with generalizable practical implications for auditing authorities. First, by focusing on the lift score of the top decile, we demonstrate that even moderately accurate models can improve upon the existing accuracy of rule-based approaches. Second, and more importantly, we demonstrate that the information contained in the supplier and buyer network of the taxpayer can be used whenever important predictors of fraud, such as historical audits and fraud, are not available. This is particularly important for newly established companies, which would otherwise be under-rated in terms of fraud probability.
format Article
id doaj.art-3a25941f9da444fda9c8120cd3dcfb7e
institution Directory Open Access Journal
issn 0883-9514
1087-6545
language English
publishDate 2022-12-01
publisher Taylor & Francis Group
series Applied Artificial Intelligence
volume 36
issue 1
affiliation Vardan Baghdasaryan: American University of Armenia
Hrant Davtyan: American University of Armenia, College of Business and Economics
Arsine Sarikyan: American University of Armenia, Center for Business Research and Development
Zaruhi Navasardyan: American University of Armenia, Center for Business Research and Development
title Improving Tax Audit Efficiency Using Machine Learning: The Role of Taxpayer’s Network Data in Fraud Detection
url http://dx.doi.org/10.1080/08839514.2021.2012002
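
The description above evaluates models by the lift score of the top decile. As a minimal sketch of how such a score can be computed (not the authors' code), the snippet below ranks taxpayers by predicted fraud probability and divides the fraud rate among the top 10% by the overall fraud rate. The function name `top_decile_lift`, the toy labels, and the toy scores are hypothetical and chosen only for illustration.

```python
import numpy as np


def top_decile_lift(y_true, scores, fraction=0.10):
    """Fraud rate among the highest-scored `fraction` of cases,
    divided by the fraud rate in the whole sample.

    Hypothetical helper for illustration; not taken from the article.
    """
    y_true = np.asarray(y_true, dtype=float)
    scores = np.asarray(scores, dtype=float)
    if y_true.shape != scores.shape or y_true.size == 0:
        raise ValueError("y_true and scores must be non-empty and the same shape")

    base_rate = y_true.mean()
    if base_rate == 0:
        raise ValueError("lift is undefined when the sample contains no fraud")

    # Number of cases in the audited group (at least one).
    k = max(1, int(fraction * y_true.size))
    # Indices of the k highest scores; stable sort keeps ties in input order.
    top = np.argsort(-scores, kind="stable")[:k]
    return y_true[top].mean() / base_rate


# Toy data (assumed): 20 taxpayers, 4 fraudulent, scores strictly decreasing.
labels = np.array([1, 1] + [0] * 16 + [1, 1])
model_scores = np.linspace(1.0, 0.0, 20)
lift = top_decile_lift(labels, model_scores)
print(f"top-decile lift: {lift:.2f}")
```

A lift above 1 means that auditing the top-scored decile finds fraud more often than auditing a random selection of the same size would, which is why a moderately accurate model can still be useful for audit targeting.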