A generalized solution to verify authorship and detect style change in multi-authored documents
ASONAM '23, November 6–9, 2023, Kusadasi, Turkiye
Main Authors: | Leekha, Rohan; Vandam, Courtland |
---|---|
Other Authors: | Lincoln Laboratory |
Format: | Article |
Language: | English |
Published: | ACM 2024 |
Online Access: | https://hdl.handle.net/1721.1/154062 |
_version_ | 1824458341400707072 |
---|---|
author | Leekha, Rohan; Vandam, Courtland |
author2 | Lincoln Laboratory |
author_facet | Lincoln Laboratory; Leekha, Rohan; Vandam, Courtland |
author_sort | Leekha, Rohan |
collection | MIT |
description | ASONAM '23, November 6–9, 2023, Kusadasi, Turkiye |
first_indexed | 2024-09-23T15:07:18Z |
format | Article |
id | mit-1721.1/154062 |
institution | Massachusetts Institute of Technology |
language | English |
last_indexed | 2025-02-19T04:24:21Z |
publishDate | 2024 |
publisher | ACM |
record_format | dspace |
spelling | mit-1721.1/154062 2025-01-02T04:51:30Z A generalized solution to verify authorship and detect style change in multi-authored documents. Leekha, Rohan; Vandam, Courtland. Lincoln Laboratory. ASONAM '23, November 6–9, 2023, Kusadasi, Turkiye. Identifying changes in style can be used to detect multi-authored social media accounts, plagiarism, compromised accounts, and author contributions in long documents. We propose an approach to recognize changes in authorship using large language models. Our approach leverages sentence-level contextual embeddings and semantic relationships. First, we expand the training set by adding adversarial examples to the minority class [5], [13], [17]. Then we fine-tune a sequence classification transformer model to detect style change. Our approach outperforms all baselines of PAN21 with macro F1-scores of 0.80, 0.74, and 0.70 for detecting style changepoint between paragraphs, closed-set author ID per paragraph, and style changepoint between sentences, respectively. Our approach also performs better than the leading competitors in PAN22. We also achieved a five percent improvement in macro F1-score (0.78) on the newly introduced DarkReddit+ dataset for authorship verification. 2024-04-04T15:20:50Z 2024-04-04T15:20:50Z 2023-11-06 2024-04-01T07:47:47Z Article http://purl.org/eprint/type/ConferencePaper 979-8-4007-0409-3 https://hdl.handle.net/1721.1/154062 Leekha, Rohan and Vandam, Courtland. 2023. "A generalized solution to verify authorship and detect style change in multi-authored documents." PUBLISHER_CC en 10.1145/3625007.3627589 Creative Commons Attribution https://creativecommons.org/licenses/by/4.0/ The author(s) application/pdf ACM Association for Computing Machinery |
title | A generalized solution to verify authorship and detect style change in multi-authored documents |
url | https://hdl.handle.net/1721.1/154062 |
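
The abstract in this record reports results as macro F1-scores. As a reading aid, here is a minimal sketch of how a macro F1-score is computed for a classification task such as style-change detection. It is not the authors' evaluation code; the function name `macro_f1` and the label encoding (0 for "no change", 1 for "style change") are assumptions made for illustration only.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores.

    Hypothetical helper for illustration; labels may be any hashable values.
    """
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Assumed encoding: 0 = no style change, 1 = style change (minority class).
y_true = [0, 0, 0, 1]
always_majority = [0, 0, 0, 0]
print(macro_f1(y_true, always_majority))
```

Because every class contributes equally to the average, a classifier that always predicts the majority class scores poorly on this metric even when its accuracy is high, which is why macro F1 is a common choice for imbalanced tasks.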