Classifying Tor Traffic Encrypted Payload Using Machine Learning

Tor, a network offering Internet anonymity, presented both positive and potentially malicious applications, leading to the need for efficient Tor traffic monitoring. While most current traffic classification methods rely on flow-based features, these can be unreliable due to factors like asymmetric...

Bibliographic Details
Main Authors:	Pitpimon Choorod, George Weir, Anil Fernando
Format:	Article
Language:	English
Published:	IEEE 2024-01-01
Series:	IEEE Access
Subjects:	Network traffic classification; Tor network; machine learning; encrypted payload features; character analysis
Online Access:	https://ieeexplore.ieee.org/document/10409147/

author	Pitpimon Choorod; George Weir; Anil Fernando
collection	DOAJ
description	Tor, a network offering Internet anonymity, presented both positive and potentially malicious applications, leading to the need for efficient Tor traffic monitoring. While most current traffic classification methods rely on flow-based features, these can be unreliable due to factors like asymmetric routing, and the use of multiple packets for feature computation can lead to processing delays. Recognising the multi-layered encryption of Tor compared to nonTor encrypted payloads, our study explored distinct patterns in their encrypted data. We introduced a novel method using Deep Packet Inspection and machine learning to differentiate between Tor and nonTor traffic based solely on encrypted payload. In the first strand of our research, we investigated hex character analysis of the Tor and nonTor encrypted payloads through statistical testing across 8 groups of application types. Remarkably, our investigation revealed a significant differentiation rate of 94.53% between Tor and nonTor traffic. In the second strand of our research, we aimed to distinguish Tor and nonTor traffic using machine learning, based on encrypted payload features. This proposed feature-based approach proved effective, as evidenced by our classification performance, which attained an average accuracy rate of 95.65% across these 8 groups of applications. Thereby, this study contributes to the efficient classification of Tor and nonTor traffic through features derived solely from a single encrypted payload packet, independent of its position in the traffic flow.
first_indexed	2024-03-08T03:13:44Z
format	Article
id	doaj.art-60dbc33316f24c3ea83cfa4e25e10838
institution	Directory of Open Access Journals
issn	2169-3536
language	English
last_indexed	2024-03-08T03:13:44Z
publishDate	2024-01-01
publisher	IEEE
record_format	Article
series	IEEE Access
spelling	doaj.art-60dbc33316f24c3ea83cfa4e25e10838; 2024-02-13T00:00:54Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2024-01-01; vol. 12; pp. 19418-19431; DOI 10.1109/ACCESS.2024.3356073; 10409147; Classifying Tor Traffic Encrypted Payload Using Machine Learning; Pitpimon Choorod (https://orcid.org/0000-0002-9279-0710), George Weir (https://orcid.org/0000-0002-6264-4480), Anil Fernando; Department of Computer and Information Sciences, University of Strathclyde, Glasgow, U.K. (all three authors); https://ieeexplore.ieee.org/document/10409147/; Network traffic classification; Tor network; machine learning; encrypted payload features; character analysis
title	Classifying Tor Traffic Encrypted Payload Using Machine Learning
topic	Network traffic classification; Tor network; machine learning; encrypted payload features; character analysis
url	https://ieeexplore.ieee.org/document/10409147/

Similar Items