Classifying Tor Traffic Encrypted Payload Using Machine Learning
Tor, a network offering Internet anonymity, has both beneficial and potentially malicious applications, which creates a need for efficient Tor traffic monitoring. Most current traffic classification methods rely on flow-based features, but these can be unreliable due to factors such as asymmetric...
Main Authors: | Pitpimon Choorod, George Weir, Anil Fernando |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2024-01-01 |
Series: | IEEE Access |
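Subjects: | Network traffic classification; Tor network; machine learning; encrypted payload features; character analysis |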
Online Access: | https://ieeexplore.ieee.org/document/10409147/ |
_version_ | 1797316053054783488 |
---|---|
author | Pitpimon Choorod; George Weir; Anil Fernando |
author_facet | Pitpimon Choorod; George Weir; Anil Fernando |
author_sort | Pitpimon Choorod |
collection | DOAJ |
description | Tor, a network offering Internet anonymity, has both beneficial and potentially malicious applications, which creates a need for efficient Tor traffic monitoring. Most current traffic classification methods rely on flow-based features, but these can be unreliable due to factors such as asymmetric routing, and computing features over multiple packets can introduce processing delays. Recognising the multi-layered encryption of Tor compared to nonTor encrypted payloads, our study explored distinct patterns in their encrypted data. We introduced a novel method using Deep Packet Inspection and machine learning to differentiate between Tor and nonTor traffic based solely on encrypted payload. In the first strand of our research, we investigated hex character analysis of the Tor and nonTor encrypted payloads through statistical testing across 8 groups of application types. This investigation revealed a significant differentiation rate of 94.53% between Tor and nonTor traffic. In the second strand, we aimed to distinguish Tor and nonTor traffic using machine learning, based on encrypted payload features. This feature-based approach proved effective, attaining an average accuracy of 95.65% across these 8 groups of applications. This study thus contributes to the efficient classification of Tor and nonTor traffic through features derived solely from a single encrypted payload packet, independent of its position in the traffic flow. |
first_indexed | 2024-03-08T03:13:44Z |
format | Article |
id | doaj.art-60dbc33316f24c3ea83cfa4e25e10838 |
institution | Directory Open Access Journal |
issn | 2169-3536 |
language | English |
last_indexed | 2024-03-08T03:13:44Z |
publishDate | 2024-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-60dbc33316f24c3ea83cfa4e25e10838; 2024-02-13T00:00:54Z; eng; IEEE; IEEE Access; 2169-3536; 2024-01-01; 12; 19418; 19431; 10.1109/ACCESS.2024.3356073; 10409147; Classifying Tor Traffic Encrypted Payload Using Machine Learning; Pitpimon Choorod (https://orcid.org/0000-0002-9279-0710); George Weir (https://orcid.org/0000-0002-6264-4480); Anil Fernando; Department of Computer and Information Sciences, University of Strathclyde, Glasgow, U.K. (all three authors); https://ieeexplore.ieee.org/document/10409147/; Network traffic classification; Tor network; machine learning; encrypted payload features; character analysis |
title | Classifying Tor Traffic Encrypted Payload Using Machine Learning |
topic | Network traffic classification; Tor network; machine learning; encrypted payload features; character analysis |
url | https://ieeexplore.ieee.org/document/10409147/ |
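
The description above mentions hex character analysis of a single encrypted payload, followed by statistical testing and feature-based classification. This record does not specify the exact features or tests the authors used, so the following is only a minimal sketch under stated assumptions: it treats the relative frequency of each of the 16 hex characters as a feature vector and computes a Pearson chi-square statistic against a uniform distribution. The function names `hex_char_frequencies` and `chi_square_uniform` are hypothetical and do not come from the paper.

```python
from collections import Counter
import os

HEX_CHARS = "0123456789abcdef"


def hex_char_frequencies(payload: bytes) -> list[float]:
    """Relative frequency of each of the 16 hex characters in one payload.

    Assumption: per-character frequency is used as a feature; the paper's
    actual feature set is not given in this record.
    """
    hex_text = payload.hex()  # two lowercase hex characters per byte
    if not hex_text:
        return [0.0] * len(HEX_CHARS)
    counts = Counter(hex_text)
    total = len(hex_text)
    return [counts.get(ch, 0) / total for ch in HEX_CHARS]


def chi_square_uniform(payload: bytes) -> float:
    """Pearson chi-square statistic of hex character counts vs. a uniform distribution.

    Assumption: a uniformity test stands in for the unspecified
    statistical testing mentioned in the description.
    """
    hex_text = payload.hex()
    if not hex_text:
        return 0.0
    counts = Counter(hex_text)
    expected = len(hex_text) / len(HEX_CHARS)
    return sum((counts.get(ch, 0) - expected) ** 2 / expected for ch in HEX_CHARS)


if __name__ == "__main__":
    # Random bytes stand in for an encrypted payload; this is not real traffic.
    sample = os.urandom(512)
    features = hex_char_frequencies(sample) + [chi_square_uniform(sample)]
    print(len(features))  # 16 frequencies plus one chi-square value
```

A vector like `features` could be passed to any standard classifier. Because it is computed from one packet's payload alone, it does not depend on flow-level statistics or on the packet's position in the flow, which is the property the description emphasises.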