Extracting causal relations on HIV drug resistance from literature
Abstract. Background: In HIV treatment it is critical to have up-to-date resistance data for applicable drugs, since HIV has a very high rate of mutation. These data are made available through scientific publications and must be extracted manually by expert...
Main Authors: | Boucher Charles A, Nualláin Breanndán Ó, Bui Quoc-Chinh, Sloot Peter MA |
---|---|
Format: | Article |
Language: | English |
Published: | BMC 2010-02-01 |
Series: | BMC Bioinformatics |
Online Access: | http://www.biomedcentral.com/1471-2105/11/101 |
_version_ | 1818114484241170432 |
---|---|
author | Boucher Charles A Nualláin Breanndán Ó Bui Quoc-Chinh Sloot Peter MA |
author_sort | Boucher Charles A |
collection | DOAJ |
description | Abstract. Background: In HIV treatment it is critical to have up-to-date resistance data for applicable drugs, since HIV has a very high rate of mutation. These data are made available through scientific publications and must be extracted manually by experts before virologists and medical doctors can use them. There is therefore an urgent need for a tool that partially automates this process and can retrieve relations between drugs and virus mutations from the literature. Results: In this work we present a novel method to extract and combine relationships between HIV drugs and mutations in viral genomes. Our extraction method is based on natural language processing (NLP): it produces grammatical relations and applies a set of rules to these relations. We applied our method to a relevant set of PubMed abstracts and obtained 2,434 extracted relations with an estimated F-score of 84%. We then combined the extracted relations using logistic regression to generate resistance values for each <drug, mutation> pair. The results of this relation combination show more than 85% agreement with the Stanford HIVDB for the ten most frequently occurring mutations. The system is used in 5 hospitals from the Virolab project (http://www.virolab.org) to preselect the most relevant novel resistance data from the literature and present those to virologists and medical doctors for further evaluation. Conclusions: The proposed relation extraction and combination method performs well on extracting HIV drug resistance data. It can be used in large-scale relation extraction experiments. The developed methods can also be applied to extract other types of relations, such as gene-protein, gene-disease, and disease-mutation. |
first_indexed | 2024-12-11T03:51:27Z |
format | Article |
id | doaj.art-eb384805ae1c491a963ec94c673aa100 |
institution | Directory Open Access Journal |
issn | 1471-2105 |
language | English |
last_indexed | 2024-12-11T03:51:27Z |
publishDate | 2010-02-01 |
publisher | BMC |
record_format | Article |
series | BMC Bioinformatics |
spelling | doaj.art-eb384805ae1c491a963ec94c673aa100; 2022-12-22T01:21:53Z; eng; BMC; BMC Bioinformatics; 1471-2105; 2010-02-01; 11; 1; 101; 10.1186/1471-2105-11-101; Extracting causal relations on HIV drug resistance from literature; Boucher Charles A; Nualláin Breanndán Ó; Bui Quoc-Chinh; Sloot Peter MA; http://www.biomedcentral.com/1471-2105/11/101 |
title | Extracting causal relations on HIV drug resistance from literature |
title_sort | extracting causal relations on hiv drug resistance from literature |
url | http://www.biomedcentral.com/1471-2105/11/101 |