Towards robust steganalysis: binary classifiers and large, heterogeneous data

The security of a steganography system is defined by our ability to detect it. It is no surprise, then, that steganography and steganalysis both depend heavily on the accuracy and robustness of our detectors. This is especially true when real-world data is considered, due to its heterogene...


Bibliographic Details
Main Author: Lubenko, I
Other Authors: Ker, A
Format: Thesis
Language: English
Published: 2013
Subjects: Computing; Computer security; Pattern recognition (statistics); Steganalysis; Information Hiding
_version_ 1797092877617070080
author Lubenko, I
author2 Ker, A
author_facet Ker, A
Lubenko, I
author_sort Lubenko, I
collection OXFORD
description <p>The security of a steganography system is defined by our ability to detect it. It is no surprise, then, that steganography and steganalysis both depend heavily on the accuracy and robustness of our detectors. This is especially true when real-world data is considered, due to its heterogeneity. The difficulty of such data manifests itself in a penalty that has periodically been reported to affect the performance of detectors built on binary classifiers; this is known as cover source mismatch.</p> <p>It remains unclear how the performance drop associated with cover source mismatch can be mitigated or even measured. In this thesis we aim to show a robust methodology to empirically measure its effects on the detection accuracy of steganalysis classifiers. Some basic machine-learning-based methods, which originate in domain adaptation, are proposed to counter it.</p> <p>Specifically, we test two hypotheses through an empirical investigation: first, that linear classifiers are more robust than non-linear classifiers to cover source mismatch in real-world data; and second, that linear classifiers are so robust that, given sufficiently large mismatched training data, they can equal the performance of any classifier trained on small matched data.</p> <p>With the help of theory we draw several nontrivial conclusions based on our results. The penalty from cover source mismatch may, in fact, be a combination of two types of error: estimation error and adaptation error. We show that relatedness between training and test data, as well as the choice of classifier, both have an impact on adaptation error, which, as we argue, ultimately defines a detector's robustness. This provides a novel framework for reasoning about what is required to improve the robustness of steganalysis detectors. Whilst our empirical results may be viewed as the first step towards this goal, we show that our approach provides clear advantages over earlier methods.</p> <p>To our knowledge this is the first study of this scale and structure.</p>
first_indexed 2024-03-07T03:52:18Z
format Thesis
id oxford-uuid:c1ae44b8-94da-438d-b318-f038ad6aac57
institution University of Oxford
language English
last_indexed 2024-03-07T03:52:18Z
publishDate 2013
record_format dspace
spelling oxford-uuid:c1ae44b8-94da-438d-b318-f038ad6aac57 | 2022-03-27T06:03:28Z | Towards robust steganalysis: binary classifiers and large, heterogeneous data | Thesis | http://purl.org/coar/resource_type/c_db06 | uuid:c1ae44b8-94da-438d-b318-f038ad6aac57 | Computing | Computer security | Pattern recognition (statistics) | Steganalysis | Information Hiding | English | Oxford University Research Archive - Valet | 2013 | Lubenko, I | Ker, A
spellingShingle Computing
Computer security
Pattern recognition (statistics)
Steganalysis
Information Hiding
Lubenko, I
Towards robust steganalysis: binary classifiers and large, heterogeneous data
title Towards robust steganalysis: binary classifiers and large, heterogeneous data
title_full Towards robust steganalysis: binary classifiers and large, heterogeneous data
title_fullStr Towards robust steganalysis: binary classifiers and large, heterogeneous data
title_full_unstemmed Towards robust steganalysis: binary classifiers and large, heterogeneous data
title_short Towards robust steganalysis: binary classifiers and large, heterogeneous data
title_sort towards robust steganalysis binary classifiers and large heterogeneous data
topic Computing
Computer security
Pattern recognition (statistics)
Steganalysis
Information Hiding
work_keys_str_mv AT lubenkoi towardsrobuststeganalysisbinaryclassifiersandlargeheterogeneousdata