Towards Debiasing Fact Verification Models

© 2019 Association for Computational Linguistics. Fact verification requires validating a claim in the context of evidence. We show, however, that in the popular FEVER dataset this might not necessarily be the case. Claim-only classifiers perform competitively with top evidence-aware models. In this paper, we investigate the cause of this phenomenon, identifying strong cues for predicting labels solely based on the claim, without considering any evidence. We create an evaluation set that avoids those idiosyncrasies. The performance of FEVER-trained models significantly drops when evaluated on this test set. Therefore, we introduce a regularization method which alleviates the effect of bias in the training data, obtaining improvements on the newly created test set. This work is a step towards a more sound evaluation of reasoning capabilities in fact verification models.


Bibliographic Details
Main Authors: Schuster, Tal, Shah, Darsh J, Yeo, Yun Jie Serene, Filizzola, Daniel, Santus, Enrico, Barzilay, Regina
Other Authors: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
Format: Article
Language: English
Published: Association for Computational Linguistics, 2021
Online Access: https://hdl.handle.net/1721.1/137401.2
description © 2019 Association for Computational Linguistics Fact verification requires validating a claim in the context of evidence. We show, however, that in the popular FEVER dataset this might not necessarily be the case. Claim-only classifiers perform competitively with top evidence-aware models. In this paper, we investigate the cause of this phenomenon, identifying strong cues for predicting labels solely based on the claim, without considering any evidence. We create an evaluation set that avoids those idiosyncrasies. The performance of FEVER-trained models significantly drops when evaluated on this test set. Therefore, we introduce a regularization method which alleviates the effect of bias in the training data, obtaining improvements on the newly created test set. This work is a step towards a more sound evaluation of reasoning capabilities in fact verification models.
id mit-1721.1/137401.2
institution Massachusetts Institute of Technology
department Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
funding DSO (Grant DSOCL18002)
publishDate 2021
publisher Association for Computational Linguistics
conference EMNLP-IJCNLP 2019 - 2019 Conference on Empirical Methods in Natural Language Processing and 9th International Joint Conference on Natural Language Processing
doi 10.18653/V1/D19-1341
license Creative Commons Attribution 4.0 International (https://creativecommons.org/licenses/by/4.0/)
citation Schuster, Tal, Shah, Darsh J, Yeo, Yun Jie Serene, Filizzola Ortiz, Daniel Roberto, Santus, Enrico et al. 2019. "Towards Debiasing Fact Verification Models." EMNLP-IJCNLP 2019 - 2019 Conference on Empirical Methods in Natural Language Processing and 9th International Joint Conference on Natural Language Processing, Proceedings of the Conference.