Synthesizing Individual Consumers′ Credit Historical Data Using Generative Adversarial Networks

Every day, the financial sector accumulates a massive amount of consumer data containing highly sensitive information. Access to these data is strictly limited outside financial institutions, and sometimes even within the same organization, for reasons such as privacy laws or asset management policy....


Bibliographic Details
Main Authors:	Nari Park, Yeong Hyeon Gu, Seong Joon Yoo
Format:	Article
Language:	English
Published:	MDPI AG 2021-01-01
Series:	Applied Sciences
Subjects:	consumer credit historical data; synthetic data generation; generative adversarial networks; artificial intelligence data mining; financial big data
Online Access:	https://www.mdpi.com/2076-3417/11/3/1126

_version_	1797407323177615360
author	Nari Park, Yeong Hyeon Gu, Seong Joon Yoo
author_facet	Nari Park, Yeong Hyeon Gu, Seong Joon Yoo
author_sort	Nari Park
collection	DOAJ
description	Every day, the financial sector accumulates a massive amount of consumer data containing highly sensitive information. Access to these data is strictly limited outside financial institutions, and sometimes even within the same organization, for reasons such as privacy laws or asset management policy. Financial data have never been more valuable, especially when assessed jointly with data from other sectors, including healthcare, insurance, credit bureaus, and research institutions. It is therefore critical to generate synthetic datasets that retain the statistical or latent properties of the real datasets while guaranteeing privacy protection. In this paper, we apply Generative Adversarial Nets (GANs) to generate synthetic consumer credit data for various educational purposes, specifically for developing machine learning models. A GAN is preferable to other pseudonymization methods such as masking, swapping, shuffling, or perturbation because it does not suffer when more attributes or data are added. This study is significant because it is the first attempt to generate synthetic data from real-world credit data in practical use. The results show that synthetic consumer credit data generated with a GAN offer substantial utility without severely compromising privacy and would be a useful resource for big data training programs.
first_indexed	2024-03-09T03:40:40Z
format	Article
id	doaj.art-abc1cd0c45904fdca1a64a2f20657e7e
institution	Directory of Open Access Journals
issn	2076-3417
language	English
last_indexed	2024-03-09T03:40:40Z
publishDate	2021-01-01
publisher	MDPI AG
record_format	Article
series	Applied Sciences
spelling	doaj.art-abc1cd0c45904fdca1a64a2f20657e7e; 2023-12-03T14:42:19Z; eng; MDPI AG; Applied Sciences; 2076-3417; 2021-01-01; vol. 11, no. 3, art. 1126; 10.3390/app11031126; Synthesizing Individual Consumers′ Credit Historical Data Using Generative Adversarial Networks; Nari Park, Yeong Hyeon Gu, Seong Joon Yoo (Department of Computer Science, Sejong University, Seoul 05006, Korea); https://www.mdpi.com/2076-3417/11/3/1126; consumer credit historical data; synthetic data generation; generative adversarial networks; artificial intelligence data mining; financial big data
title	Synthesizing Individual Consumers′ Credit Historical Data Using Generative Adversarial Networks
title_sort	synthesizing individual consumers credit historical data using generative adversarial networks
topic	consumer credit historical data; synthetic data generation; generative adversarial networks; artificial intelligence data mining; financial big data
url	https://www.mdpi.com/2076-3417/11/3/1126
