An algorithm for detecting leaks of insider information of financial markets in investment consulting

Bibliographic Details
Main Authors: Alisa A. Vorobeva, Vladislav V. Gerasimov, Yulia V. Li
Format: Article
Language: English
Published: Saint Petersburg National Research University of Information Technologies, Mechanics and Optics (ITMO University) 2021-06-01
Series: Naučno-tehničeskij Vestnik Informacionnyh Tehnologij, Mehaniki i Optiki
Online Access: https://ntv.ifmo.ru/file/article/20507.pdf
collection DOAJ
description The paper addresses the detection of financial-market insider information leaks during investment consulting. The authors created an original dataset of conversations between consultants and clients, recorded as text dialogs, and studied whether machine learning methods can automate the detection of leaks arising in such conversations. The following supervised machine learning methods were examined for constructing and training a classifier: probabilistic (Naïve Bayes classifier), metric (k-nearest neighbors algorithm), logical (random forest), linear (support vector machine), and methods based on artificial neural networks. The paper considers several approaches to building a natural language text model, including tokenization (bag of words; word n-grams: bigrams and trigrams) and vectorization (one-hot encoding). The proposed algorithm for detecting leaks of financial-market insider information is based on a support vector machine (SVM) with bigram tokenization. The results show that the combination of SVM and bigram tokenization provides the highest leak detection accuracy among the configurations examined. The research results can be used in the development of cybersecurity tools and in the further elaboration of natural language processing methods for information security problems.
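The combination named in the description (word bigrams, a binary vector, a linear SVM) can be illustrated with a minimal sketch. It is not the authors' implementation: the four dialog snippets and their labels are invented for illustration, the binary presence vector is only one possible reading of "one-hot encoding", and the trainer is a plain subgradient-descent linear SVM with hypothetical hyperparameters (epochs, lr, lam). All function names are assumptions.

```python
from typing import Dict, List, Sequence, Tuple


def word_bigrams(text: str) -> List[str]:
    """Split on whitespace and return adjacent word pairs (word bigrams)."""
    tokens = text.lower().split()
    return [f"{a} {b}" for a, b in zip(tokens, tokens[1:])]


def build_vocabulary(texts: Sequence[str]) -> Dict[str, int]:
    """Assign a column index to every bigram seen in the training texts."""
    vocab: Dict[str, int] = {}
    for text in texts:
        for gram in word_bigrams(text):
            vocab.setdefault(gram, len(vocab))
    return vocab


def vectorize(text: str, vocab: Dict[str, int]) -> List[float]:
    """Binary presence vector: 1.0 where a known bigram occurs, else 0.0."""
    vec = [0.0] * len(vocab)
    for gram in word_bigrams(text):
        idx = vocab.get(gram)
        if idx is not None:
            vec[idx] = 1.0
    return vec


def train_linear_svm(
    X: Sequence[Sequence[float]],
    y: Sequence[int],
    epochs: int = 20,
    lr: float = 0.1,
    lam: float = 0.01,
) -> Tuple[List[float], float]:
    """Subgradient descent on the L2-regularized hinge loss (labels are +1/-1)."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            # Regularization step: shrink the weights a little on every sample.
            w = [wj * (1.0 - lr * lam) for wj in w]
            if margin < 1:
                # Hinge loss is active: move the hyperplane toward this sample.
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b


def predict(text: str, vocab: Dict[str, int], w: Sequence[float], b: float) -> int:
    """Return +1 (possible leak) or -1 (no leak) for a piece of dialog text."""
    x = vectorize(text, vocab)
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if score >= 0 else -1


# Invented toy data, NOT taken from the paper's dataset: +1 = leak, -1 = no leak.
TEXTS = [
    "the merger will be announced tomorrow buy now",
    "buy before the earnings are announced tomorrow",
    "diversify your portfolio across index funds",
    "index funds carry lower fees over time",
]
LABELS = [1, 1, -1, -1]

vocab = build_vocabulary(TEXTS)
w, b = train_linear_svm([vectorize(t, vocab) for t in TEXTS], LABELS)
print(predict("earnings are announced tomorrow", vocab, w, b))
```

Bigrams that were not seen during training are ignored at prediction time, so a text with no known bigrams is decided by the bias term alone. A realistic setup would need a far larger labeled corpus and a held-out evaluation set.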
id doaj.art-d1645f278e1c408fa059a56948fe044e
institution Directory Open Access Journal
issn 2226-1494
2500-0373
doi 10.17586/2226-1494-2021-21-3-394-400
citation Vol. 21, No. 3 (2021-06-01), pp. 394-400
author_details Alisa A. Vorobeva, PhD, Associate Professor, ITMO University, Saint Petersburg, 197101, Russian Federation, https://orcid.org/0000-0001-6691-6167
Vladislav V. Gerasimov, Engineer, ITMO University, Saint Petersburg, 197101, Russian Federation, https://orcid.org/0000-0001-8099-2414
Yulia V. Li, Engineer, ITMO University, Saint Petersburg, 197101, Russian Federation, https://orcid.org/0000-0003-3280-8197
topic natural language processing
machine learning
neural networks
compliance risks
insider information