Machine-Generated Text: A Comprehensive Survey of Threat Models and Detection Methods

Machine-generated text is increasingly difficult to distinguish from text authored by humans. Powerful open-source models are freely available, and user-friendly tools that democratize access to generative models are proliferating. ChatGPT, which was released shortly after the first edition of this survey, epitomizes these trends. The great potential of state-of-the-art natural language generation (NLG) systems is tempered by the multitude of avenues for abuse. Detection of machine-generated text is a key countermeasure for reducing the abuse of NLG models, and presents significant technical challenges and numerous open problems. We provide a survey that includes 1) an extensive analysis of threat models posed by contemporary NLG systems and 2) the most complete review of machine-generated text detection methods to date. This survey places machine-generated text within its cybersecurity and social context, and provides strong guidance for future work addressing the most critical threat models. While doing so, we highlight the importance of detection systems themselves demonstrating trustworthiness through fairness, robustness, and accountability.

Bibliographic Details
Main Authors: Evan N. Crothers (School of Electrical Engineering and Computer Science, University of Ottawa, Ottawa, Canada; ORCID: 0000-0001-6177-0525), Nathalie Japkowicz (Department of Computer Science, American University, Washington, DC, USA; ORCID: 0000-0003-1176-1617), Herna L. Viktor (School of Electrical Engineering and Computer Science, University of Ottawa, Ottawa, Canada; ORCID: 0000-0003-1914-5077)
Format: Article
Language: English
Published: IEEE, 2023-01-01
Series: IEEE Access, vol. 11, pp. 70977-71002
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2023.3294090
Subjects: Artificial intelligence; cybersecurity; disinformation; generative AI; large language models; machine learning
Online Access: https://ieeexplore.ieee.org/document/10177704/