Multi-Dimensional Evaluation Metrics for Chest X-Ray Reports

In the past few years, there has been abundant research in using machine learning to generate high quality radiology reports using the large MIMIC-CXR chest x-ray dataset. However, there has been little work focused on evaluating the quality of generated reports from a clinical perspective, where ac...


Bibliographic Details
Main Author: Rawat, Saumya
Other Authors: Szolovits, Peter
Format: Thesis
Published: Massachusetts Institute of Technology 2022
Online Access: https://hdl.handle.net/1721.1/144486
Description: In the past few years, there has been abundant research in using machine learning to generate high quality radiology reports using the large MIMIC-CXR chest x-ray dataset. However, there has been little work focused on evaluating the quality of generated reports from a clinical perspective, where accuracy is the most important factor. Current evaluation metrics evaluate reports in one dimension. This work proposes the use of multiple dimensions (factual correctness, comprehensiveness, style, and overall quality) to better capture evaluation preferences of a clinical text generating model where preferences can differ based on the use case. This work also presents a dataset of radiologist rating annotations for generated and reference chest x-ray radiology reports. Lastly, it also creates an improved metric for the readability dimension by adding context awareness of frequent and acceptable medical terminology.
Department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Degree: M.Eng.
Date: 2022-05
Record Timestamps: 2022-08-30T04:08:54Z; 2022-08-29T15:50:46Z; 2022-05-27T16:19:36.124Z
Rights: In Copyright - Educational Use Permitted; Copyright MIT; http://rightsstatements.org/page/InC-EDU/1.0/
File Format: application/pdf