FedFingerprinting: A Federated Learning Approach to Website Fingerprinting Attacks in Tor Networks
Various website fingerprinting (WF) attacks have been developed to detect anonymous users accessing illegal websites in Tor networks by analyzing Tor traffic. These attacks consider several traffic features, such as packet length, number of packets, and time, to identify users who attempt to access...
Main Authors: | Juneseok Bang, Jaewon Jeong, Joohyung Lee |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2023-01-01 |
Series: | IEEE Access |
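Subjects: | Tor networks; website fingerprinting attacks; federated learning; feature analysis; deep learning; machine learning |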
Online Access: | https://ieeexplore.ieee.org/document/10194906/ |
_version_ | 1797746108538028032 |
---|---|
author | Juneseok Bang; Jaewon Jeong; Joohyung Lee |
author_facet | Juneseok Bang; Jaewon Jeong; Joohyung Lee |
author_sort | Juneseok Bang |
collection | DOAJ |
description | Various website fingerprinting (WF) attacks have been developed to detect anonymous users accessing illegal websites in Tor networks by analyzing Tor traffic. These attacks consider several traffic features, such as packet length, number of packets, and time, to identify users who attempt to access prohibited content. With advances in artificial intelligence (AI), machine learning and deep learning techniques have been widely adopted for WF to build accurate models that break the privacy of illegal users. However, such state-of-the-art WF approaches assume that all data from the various Tor nodes are collected and trained on centrally to generate the model, even though training data sets from Tor nodes may contain sensitive information that the nodes may not want to share. In addition, collecting and training on this data centrally inevitably creates significant computing and network bottlenecks at the central server. This paper therefore proposes a novel framework that uses federated learning (FL) for WF in the Tor network (denoted FedFingerprinting), enabling Tor nodes to generate the global model collaboratively without exposing their local data sets. Specifically, to reduce the local training burden on the selected Tor nodes in the FL process, the importance of the various handcrafted features used for WF is first evaluated by analyzing feature accuracy under tree-based ensemble machine learning methods. Then, to balance accuracy and training time, a combination of the top-ranked features, rather than raw data, is used to train the model with FL. Moreover, an effective Tor node selection scheme for the FL process is designed that takes the local model accuracy of each Tor node into account. Finally, under closed-world settings with real-world Tor data sets, we empirically compare the proposed FedFingerprinting, with raw data and with feature selection, against various benchmarks in terms of training time and accuracy. We then evaluate FedFingerprinting with Tor node selection in terms of convergence speed and show its superior performance. |
first_indexed | 2024-03-12T15:32:22Z |
format | Article |
id | doaj.art-50b2a28ccbd64e8687f2029cf88f9a1e |
institution | Directory Open Access Journal |
issn | 2169-3536 |
language | English |
last_indexed | 2024-03-12T15:32:22Z |
publishDate | 2023-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-50b2a28ccbd64e8687f2029cf88f9a1e; 2023-08-09T23:00:44Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2023-01-01; vol. 11, pp. 78431-78444; DOI 10.1109/ACCESS.2023.3299174; IEEE Xplore document 10194906; FedFingerprinting: A Federated Learning Approach to Website Fingerprinting Attacks in Tor Networks; Juneseok Bang (https://orcid.org/0009-0005-0301-1308), Jaewon Jeong (https://orcid.org/0009-0002-2990-185X), Joohyung Lee (https://orcid.org/0000-0003-1102-3905); Department of Computing, Gachon University, Seongnam, South Korea (all authors); https://ieeexplore.ieee.org/document/10194906/; Tor networks; website fingerprinting attacks; federated learning; feature analysis; deep learning; machine learning |
title | FedFingerprinting: A Federated Learning Approach to Website Fingerprinting Attacks in Tor Networks |
topic | Tor networks; website fingerprinting attacks; federated learning; feature analysis; deep learning; machine learning |
url | https://ieeexplore.ieee.org/document/10194906/ |
work_keys_str_mv | AT juneseokbang fedfingerprintingafederatedlearningapproachtowebsitefingerprintingattacksintornetworks AT jaewonjeong fedfingerprintingafederatedlearningapproachtowebsitefingerprintingattacksintornetworks AT joohyunglee fedfingerprintingafederatedlearningapproachtowebsitefingerprintingattacksintornetworks |
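
The abstract describes two steps of the FL process: choosing Tor nodes based on their local model accuracy, and combining the chosen nodes' local models into a global model. The following is a minimal sketch of that idea only, not the authors' implementation. It assumes a simple top-k selection rule and sample-weighted averaging in the style of FedAvg, neither of which is specified in this record. The function names `select_nodes` and `fed_avg`, the node ids, and all numbers are hypothetical.

```python
from typing import Dict, List


def select_nodes(local_accuracy: Dict[str, float], k: int) -> List[str]:
    """Return the k node ids with the highest local accuracy.

    Assumption: a plain top-k rule; ties are broken by node id.
    """
    ranked = sorted(local_accuracy, key=lambda n: (-local_accuracy[n], n))
    return ranked[:k]


def fed_avg(updates: Dict[str, List[float]],
            num_samples: Dict[str, int]) -> List[float]:
    """Sample-weighted average of the given nodes' weight vectors.

    Assumption: FedAvg-style aggregation over flat weight vectors.
    """
    total = sum(num_samples[n] for n in updates)
    dim = len(next(iter(updates.values())))
    global_weights = [0.0] * dim
    for node, weights in updates.items():
        share = num_samples[node] / total  # this node's share of the data
        for i, w in enumerate(weights):
            global_weights[i] += share * w
    return global_weights


# Toy data (hypothetical): three nodes with two-parameter local models.
local_accuracy = {"node_a": 0.91, "node_b": 0.62, "node_c": 0.85}
local_weights = {
    "node_a": [1.0, 2.0],
    "node_b": [9.0, 9.0],
    "node_c": [3.0, 4.0],
}
num_samples = {"node_a": 100, "node_b": 50, "node_c": 300}

selected = select_nodes(local_accuracy, k=2)
global_weights = fed_avg({n: local_weights[n] for n in selected}, num_samples)
print(selected, global_weights)
```

In this toy run the low-accuracy `node_b` is excluded, so its outlying weights do not pull the global model. Only weight vectors and sample counts are passed to the aggregator, never the local traffic data.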