A Deep Learning Model for Information Loss Prevention From Multi-Page Digital Documents

Bibliographic Details
Main Authors: Abhijit Guha, Debabrata Samanta, Amit Banerjee, Daksh Agarwal
Format: Article
Language: English
Published: IEEE 2021-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/9443107/
Description
Summary: The World Wide Web has redefined almost every business model over the past twenty-five to thirty years. IoT, Big Data, and AI are comparatively recent technologies that have revolutionized the digitization and management of data. With this revolution came the need for data security and consumer privacy protection, primarily in financial institutions. The Equifax data breach in 2017 and the leak of personal information from Facebook in 2021 led to general skepticism among the customers of large corporations. The GLBA of 1999, also known as the Financial Modernization Act, is a US federal law that requires financial institutions to protect their customers’ private information. Building on the GLBA, the FTC has laid down guidelines for all financial institutions in the United States of America, including TI companies. This paper proposes an ANN-based content classification technique that combines an MLP architecture with an n-gram TF-IDF feature descriptor to detect and protect customers’ sensitive information at a reputed TI company by securing one of its digital image-document stores. The proposed technique is compared with other state-of-the-art strategies. The study uses data samples from the company’s digital document store, and the prediction accuracy metrics obtained are substantially better and fall within the acceptable range defined by the organization’s information security monitoring team.
ISSN: 2169-3536
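
Illustrative sketch: the summary above names an MLP classifier fed with n-gram TF-IDF features but gives no implementation details. The following is a minimal, standard-library-only Python sketch of that general idea, not the authors' model. Everything in it is an assumption made for illustration: the helper names (`ngrams`, `fit_tfidf`, `transform`, `TinyMLP`), the use of word unigrams and bigrams, the smoothed IDF variant, the single tanh hidden layer, the hyperparameters, and the toy page texts, which stand in for text that would presumably be extracted from the document images.

```python
import math
import random
from collections import Counter


def ngrams(text, n_max=2):
    """Word n-grams of length 1..n_max from lowercased, whitespace-split text."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(tokens) - n + 1)]


def fit_tfidf(docs, n_max=2):
    """Learn a vocabulary and smoothed IDF weights from the training texts."""
    doc_freq = Counter(g for d in docs for g in set(ngrams(d, n_max)))
    vocab = {g: j for j, g in enumerate(sorted(doc_freq))}
    idf = {g: math.log((1 + len(docs)) / (1 + doc_freq[g])) + 1.0 for g in vocab}
    return vocab, idf


def transform(text, vocab, idf, n_max=2):
    """Map one text to an L2-normalized TF-IDF vector over the vocabulary."""
    vec = [0.0] * len(vocab)
    for g, count in Counter(ngrams(text, n_max)).items():
        if g in vocab:
            vec[vocab[g]] = count * idf[g]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


class TinyMLP:
    """One tanh hidden layer and a sigmoid output, trained with plain SGD."""

    def __init__(self, n_in, n_hidden=8, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def _forward(self, x):
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(self.w1, self.b1)]
        z = sum(w * hj for w, hj in zip(self.w2, h)) + self.b2
        return h, 1.0 / (1.0 + math.exp(-z))

    def predict_proba(self, x):
        """Probability that x belongs to the sensitive class (label 1)."""
        return self._forward(x)[1]

    def fit(self, X, y, epochs=500, lr=0.1):
        for _ in range(epochs):
            for x, t in zip(X, y):
                h, p = self._forward(x)
                dz = p - t  # d(cross-entropy)/d(output pre-activation)
                for j, hj in enumerate(h):
                    da = dz * self.w2[j] * (1.0 - hj * hj)  # back through tanh
                    self.w2[j] -= lr * dz * hj
                    self.b1[j] -= lr * da
                    self.w1[j] = [w - lr * da * xi
                                  for w, xi in zip(self.w1[j], x)]
                self.b2 -= lr * dz
        return self


# Hypothetical page texts (invented for this sketch); 1 = sensitive page.
train_docs = [
    "borrower social security number 123 45 6789",
    "account number and routing number for wire transfer",
    "date of birth and social security number of buyer",
    "property survey map with lot boundaries",
    "cover page for closing document package",
    "table of contents and page index",
]
labels = [1, 1, 1, 0, 0, 0]

vocab, idf = fit_tfidf(train_docs)
X = [transform(d, vocab, idf) for d in train_docs]
model = TinyMLP(n_in=len(vocab)).fit(X, labels)

# Score an unseen (also hypothetical) page; a higher value means "more likely sensitive".
new_page = "social security number of the borrower"
print(round(model.predict_proba(transform(new_page, vocab, idf)), 3))
```

In practice one would more likely rely on an established library for both the vectorizer and the network; the hand-written version here only makes the two stages explicit (text to n-gram TF-IDF vector, vector to class probability) and carries no claim about the accuracy figures reported in the article.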