Recognition of protein/gene names from text using an ensemble of classifiers

Abstract: This paper proposes an ensemble of classifiers for biomedical name recognition in which three classifiers, one Support Vector Machine and two discriminative Hidden Markov Models, are combined effectively using a simple majority voting strategy. In addition, we in...

Bibliographic Details
Main Authors: Zhou GuoDong, Shen Dan, Zhang Jie, Su Jian, Tan SoonHeng
Format: Article
Language: English
Published: BMC 2005-05-01
Series: BMC Bioinformatics
collection DOAJ
description This paper proposes an ensemble of classifiers for biomedical name recognition in which three classifiers, one Support Vector Machine and two discriminative Hidden Markov Models, are combined effectively using a simple majority voting strategy. In addition, we incorporate three post-processing modules (an abbreviation resolution module, a protein/gene name refinement module and a simple dictionary matching module) into the system to further improve performance. Evaluation shows that our system achieves the best performance among 10 systems, with a balanced F-measure of 82.58 on the closed evaluation of the BioCreative protein/gene name recognition task (Task 1A).
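The majority voting step named in the abstract can be illustrated with a minimal sketch. The abstract does not say at what granularity the votes are taken or how ties are broken, so everything here is an assumption made for illustration: the function name `majority_vote`, voting per token over B/I/O tags, and falling back to one designated tagger when no tag wins a strict majority. It is not the authors' implementation.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(predictions: Sequence[Sequence[str]], fallback: int = 0) -> List[str]:
    """Combine per-token tag sequences from several taggers by simple majority.

    predictions: one tag sequence per tagger, all over the same tokens.
    fallback:    index of the tagger to trust when no tag has a strict
                 majority (an assumption; the abstract does not specify this).
    """
    if not predictions:
        raise ValueError("need at least one prediction sequence")
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all taggers must label the same tokens")

    combined = []
    for votes in zip(*predictions):  # votes = tags proposed for one token
        tag, count = Counter(votes).most_common(1)[0]
        if count * 2 > len(votes):
            combined.append(tag)  # strict majority wins
        else:
            combined.append(votes[fallback])  # no majority: defer to one tagger
    return combined


# Hypothetical outputs of one SVM tagger and two HMM taggers on four tokens.
svm_tags = ["B", "I", "O", "O"]
hmm_a_tags = ["B", "O", "O", "B"]
hmm_b_tags = ["B", "I", "O", "I"]

print(majority_vote([svm_tags, hmm_a_tags, hmm_b_tags]))
```

Voting token by token can yield an ill-formed tag sequence (for example an I tag with no preceding B), so a real system would likely need a repair pass or would vote over whole name spans instead.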
first_indexed 2024-12-22T05:35:34Z
id doaj.art-981a8ae0b46241c496a5ec6bf413bb66
institution Directory Open Access Journal
issn 1471-2105
last_indexed 2024-12-22T05:35:34Z
spelling Zhou GuoDong, Shen Dan, Zhang Jie, Su Jian, Tan SoonHeng. Recognition of protein/gene names from text using an ensemble of classifiers. BMC Bioinformatics 6(Suppl 1):S7, 2005-05-01. Publisher: BMC. ISSN 1471-2105. DOI 10.1186/1471-2105-6-S1-S7. Record doaj.art-981a8ae0b46241c496a5ec6bf413bb66 (2022-12-21T18:37:20Z, eng).