Multimodal modeling with low-dose CT and clinical information for diagnostic artificial intelligence on mediastinal tumors: a preliminary study

Background Diagnosing mediastinal tumours, including incidental lesions, using low-dose CT (LDCT) performed for lung cancer screening, is challenging. It often requires additional invasive and costly tests for proper characterisation and surgical planning. This indicates the need for a more efficien...

Full description

Bibliographic Details
Main Authors: Daisuke Yamada, Fumitsugu Kojima, Yujiro Otsuka, Kouhei Kawakami, Naoki Koishi, Ken Oba, Toru Bando, Masaki Matsusako, Yasuyuki Kurihara
Format: Article
Language:English
Published: BMJ Publishing Group 2024-04-01
Series:BMJ Open Respiratory Research
Online Access:https://bmjopenrespres.bmj.com/content/11/1/e002249.full
Description
Summary:Background Diagnosing mediastinal tumours, including incidental lesions, using low-dose CT (LDCT) performed for lung cancer screening, is challenging. It often requires additional invasive and costly tests for proper characterisation and surgical planning. This indicates the need for a more efficient and patient-centred approach, suggesting a gap in the existing diagnostic methods and the potential for artificial intelligence technologies to address this gap. This study aimed to create a multimodal hybrid transformer model using the Vision Transformer that leverages LDCT features and clinical data to improve surgical decision-making for patients with incidentally detected mediastinal tumours.Methods This retrospective study analysed patients with mediastinal tumours between 2010 and 2021. Patients eligible for surgery (n=30) were considered ‘positive,’ whereas those without tumour enlargement (n=32) were considered ‘negative.’ We developed a hybrid model combining a convolutional neural network with a transformer to integrate imaging and clinical data. The dataset was split in a 5:3:2 ratio for training, validation and testing. The model’s efficacy was evaluated using a receiver operating characteristic (ROC) analysis across 25 iterations of random assignments and compared against conventional radiomics models and models excluding clinical data.Results The multimodal hybrid model demonstrated a mean area under the curve (AUC) of 0.90, significantly outperforming the non-clinical data model (AUC=0.86, p=0.04) and radiomics models (random forest AUC=0.81, p=0.008; logistic regression AUC=0.77, p=0.004).Conclusion Integrating clinical and LDCT data using a hybrid transformer model can improve surgical decision-making for mediastinal tumours, showing superiority over models lacking clinical data integration.
ISSN:2052-4439