FedFingerprinting: A Federated Learning Approach to Website Fingerprinting Attacks in Tor Networks

Various website fingerprinting (WF) attacks have been developed to detect anonymous users accessing illegal websites in Tor networks by analyzing Tor traffic. These attacks use traffic features such as packet length, packet count, and timing to identify users who attempt to access...


Bibliographic Details
Main Authors:	Juneseok Bang, Jaewon Jeong, Joohyung Lee
Format:	Article
Language:	English
Published:	IEEE 2023-01-01
Series:	IEEE Access
Subjects:	Tor networks; website fingerprinting attacks; federated learning; feature analysis; deep learning; machine learning
Online Access:	https://ieeexplore.ieee.org/document/10194906/

collection	DOAJ
description	Various website fingerprinting (WF) attacks have been developed to detect anonymous users accessing illegal websites in Tor networks by analyzing Tor traffic. These attacks use traffic features such as packet length, packet count, and timing to identify users who attempt to access prohibited content. With advances in artificial intelligence (AI), machine learning and deep learning techniques have been widely adopted for WF to build accurate models that break the privacy of such users. However, state-of-the-art WF approaches assume that all data from the various Tor nodes are collected and trained centrally, even though the training data sets may contain sensitive information that the Tor nodes do not want to share. In addition, collecting and training on all data centrally inevitably creates significant computing and network bottlenecks at the central server. This paper therefore proposes FedFingerprinting, a framework that uses federated learning (FL) for WF in the Tor network and enables Tor nodes to build a global model collaboratively without exposing their local data sets. Specifically, to reduce the local training burden on the selected Tor nodes in the FL process, the importance of the handcrafted features used for WF is first evaluated by analyzing feature accuracy under an ensemble of tree-based machine learning methods. Then, to balance accuracy and training time, a combination of the top-ranked features, rather than raw data, is used for training with FL. Moreover, a Tor node selection scheme for the FL process is designed that takes the local model accuracy of each Tor node into account. Finally, in closed-world settings with real-world Tor data sets, FedFingerprinting with raw data and with feature selection is compared empirically against various benchmarks in terms of training time and accuracy, and the superior performance of FedFingerprinting with Tor node selection is evaluated in terms of convergence speed.
first_indexed	2024-03-12T15:32:22Z
id	doaj.art-50b2a28ccbd64e8687f2029cf88f9a1e
institution	Directory Open Access Journal
issn	2169-3536
last_indexed	2024-03-12T15:32:22Z
spelling	doaj.art-50b2a28ccbd64e8687f2029cf88f9a1e; 2023-08-09T23:00:44Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2023-01-01; vol. 11, pp. 78431-78444; DOI 10.1109/ACCESS.2023.3299174; 10194906; FedFingerprinting: A Federated Learning Approach to Website Fingerprinting Attacks in Tor Networks; Juneseok Bang (https://orcid.org/0009-0005-0301-1308), Jaewon Jeong (https://orcid.org/0009-0002-2990-185X), Joohyung Lee (https://orcid.org/0000-0003-1102-3905); Department of Computing, Gachon University, Seongnam, South Korea; https://ieeexplore.ieee.org/document/10194906/

Similar Items