A user DNS fingerprint dataset

Using a user DNS fingerprint allows one to identify a specific network user without knowing their IP address. This method is suitable, for example, when examining the behavior of a monitored network user in more depth. In contrast to other studies, this work introduces a dataset for possi...

Bibliographic Details
Main Authors: Josef Zápotocký, Jan Fiala, Jan Fesl
Format: Article
Language: English
Published: Elsevier 2024-06-01
Series: Data in Brief
Subjects: DNS; User; Machine learning; Identification; Fingerprint
Online Access: http://www.sciencedirect.com/science/article/pii/S2352340924003585
author Josef Zápotocký
Jan Fiala
Jan Fesl
collection DOAJ
description Using a user DNS fingerprint allows one to identify a specific network user without knowing their IP address. This method is suitable, for example, when examining the behavior of a monitored network user in more depth. In contrast to other studies, this work introduces a dataset for possible user identification based only on knowledge of the user's DNS fingerprint, created from previously sent DNS queries. We created a large dataset from the real network traffic of a metropolitan Internet service provider. The dataset was created from 2.3 billion DNS queries representing 6.2 million different domain names. The data collection took place over three months, from December 2023 to February 2024. The dataset describes user activity in detail, in the form of overall daily activity statistics and detailed 24 h activity statistics. Each dataset record contains a list of 1137 classification attributes. The distinctive feature of this dataset is the classification of user activity based on the categories of content accessed by a user. The new dataset can be used to create machine learning models that identify a specific user without direct knowledge of their IP address or additional network location information. The dataset can also serve as a reference dataset for the creation of DNS fingerprints of users.
first_indexed 2024-04-24T08:13:12Z
format Article
id doaj.art-c12cda577485463c85c901b61dab1f23
institution Directory Open Access Journal
issn 2352-3409
language English
last_indexed 2024-04-24T08:13:12Z
publishDate 2024-06-01
publisher Elsevier
record_format Article
series Data in Brief
spelling doaj.art-c12cda577485463c85c901b61dab1f23; 2024-04-17T04:49:18Z; eng; Elsevier; Data in Brief; 2352-3409; 2024-06-01; 54; 110389; A user DNS fingerprint dataset
Josef Zápotocký (0): Department of Computer Systems, Faculty of Information Technology, Czech Technical University in Prague, Czech Republic
Jan Fiala (1): Department of Applied Mathematics and Informatics, Faculty of Economics, University of South Bohemia in České Budějovice, Czech Republic
Jan Fesl (2): Department of Computer Systems, Faculty of Information Technology, Czech Technical University in Prague, Czech Republic; corresponding author
http://www.sciencedirect.com/science/article/pii/S2352340924003585
DNS; User; Machine learning; Identification; Fingerprint
title A user DNS fingerprint dataset
topic DNS
User
Machine learning
Identification
Fingerprint
url http://www.sciencedirect.com/science/article/pii/S2352340924003585