Text classification of railway safety fault based on TF-IDF evolutionary integrated classifier

Railway safety is the core guarantee of railway transportation. Railway safety problems generate large volumes of unstructured text with no fixed format, which makes it very difficult to analyze and resolve the safety problems comprehensively. Aiming at the intelligent classif...

Bibliographic Details
Main Authors: Gao Fan, Wang Fuzhang, Zhang Ming, Zhao Junhua, Li Gaoke
Format: Article
Language: zho
Published: National Computer System Engineering Research Institute of China 2021-04-01
Series: Dianzi Jishu Yingyong
Subjects:
Online Access: http://www.chinaaet.com/article/3000130584
collection DOAJ
description Railway safety is the core guarantee of railway transportation. Railway safety problems generate large volumes of unstructured text with no fixed format, which makes it very difficult to analyze and resolve the safety problems comprehensively. To classify railway safety data intelligently, an evolutionary ensemble classifier model is proposed. After analyzing the characteristics of catenary safety problem data, a TF-IDF model is adopted for feature extraction. A Bagging ensemble classifier with Decision Trees as base classifiers then classifies the text data. During Bagging classification, a Genetic Algorithm optimizes the combined solution set of base classifiers generated by the Bagging algorithm, producing a combination of base classifiers with better classification results. Experiments on the catenary (power supply contact network) safety problems of a railway bureau show that the Evolutionary Ensemble Classifier model, i.e. TF-IDF + Bagging + Genetic Algorithm, scores high on the classification metrics for text classification of railway safety problems.
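The pipeline named in the description (TF-IDF features, Bagging over decision-tree base classifiers, and a Genetic Algorithm that picks a better combination of base classifiers) can be sketched in plain Python as below. This is only an illustrative sketch, not the authors' implementation: the record does not state the chromosome encoding, fitness function, genetic operators, or tree depth, so the bitmask encoding, accuracy fitness, one-point crossover, depth-1 trees (stumps), the TF-IDF variant, and the toy fault reports are all assumptions made for the example.

```python
import math
import random
from collections import Counter

def tfidf(docs):
    """TF-IDF matrix for tokenised docs (one common variant: tf * (ln(n/df) + 1))."""
    vocab = sorted({w for d in docs for w in d})
    df = Counter(w for d in docs for w in set(d))
    idf = {w: math.log(len(docs) / df[w]) + 1.0 for w in vocab}
    rows = []
    for d in docs:
        tf = Counter(d)
        rows.append([tf[w] / len(d) * idf[w] for w in vocab])
    return vocab, rows

def majority(labels, default):
    return Counter(labels).most_common(1)[0][0] if labels else default

def fit_stump(X, y):
    """Assumed base learner: depth-1 decision tree splitting on 'term weight > 0'."""
    default = majority(y, None)
    best = None
    for j in range(len(X[0])):
        hi = majority([t for x, t in zip(X, y) if x[j] > 0], default)
        lo = majority([t for x, t in zip(X, y) if x[j] <= 0], default)
        errors = sum((hi if x[j] > 0 else lo) != t for x, t in zip(X, y))
        if best is None or errors < best[0]:
            best = (errors, j, hi, lo)
    return best[1:]

def predict_stump(stump, x):
    j, hi, lo = stump
    return hi if x[j] > 0 else lo

def bagging(X, y, n_estimators, rng):
    """Train each base classifier on a bootstrap resample of the data."""
    models = []
    for _ in range(n_estimators):
        idx = [rng.randrange(len(X)) for _ in X]
        models.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return models

def accuracy(models, mask, X, y):
    """Majority-vote accuracy of the base classifiers selected by the bitmask."""
    chosen = [m for m, keep in zip(models, mask) if keep]
    if not chosen:
        return 0.0
    votes = [majority([predict_stump(m, x) for m in chosen], None) for x in X]
    return sum(v == t for v, t in zip(votes, y)) / len(y)

def evolve(models, X, y, rng, pop_size=12, generations=15, p_mut=0.1):
    """Assumed GA: bitmask chromosomes, elitist selection, one-point crossover."""
    n = len(models)
    fitness = lambda mask: accuracy(models, mask, X, y)
    # Seeding with the all-ones mask means the result is never worse than plain Bagging.
    pop = [[1] * n] + [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size - 1)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)
            child = a[:cut] + b[cut:]
            children.append([bit ^ (rng.random() < p_mut) for bit in child])
        pop = parents + children
    return max(pop, key=fitness)

# Made-up toy reports, for illustration only (not data from the article).
docs = [s.split() for s in [
    "contact wire worn near mast", "insulator cracked on mast",
    "contact wire sagging after storm", "insulator dirty flashover risk",
    "wire strand broken at span", "cracked insulator replaced",
]]
labels = ["wire", "insulator", "wire", "insulator", "wire", "insulator"]

rng = random.Random(0)
vocab, X = tfidf(docs)
models = bagging(X, labels, n_estimators=9, rng=rng)
full = [1] * len(models)
best = evolve(models, X, labels, rng)
print("bagging accuracy:", accuracy(models, full, X, labels))
print("evolved accuracy:", accuracy(models, best, X, labels))
```

For brevity the sketch scores fitness on the training reports themselves; a realistic setup would evaluate each candidate combination on a held-out validation split so that the Genetic Algorithm does not simply overfit the ensemble selection.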
id doaj.art-958f76a9747649aea0fc78d43b91648d
institution Directory Open Access Journal
issn 0258-7998
doi 10.16157/j.issn.0258-7998.200284
volume 47
issue 4
pages 71-76
affiliation Gao Fan: China Academy of Railway Science, Beijing 100081, China
affiliation Wang Fuzhang: China Academy of Railway Science, Beijing 100081, China
affiliation Zhang Ming: China Academy of Railway Science, Beijing 100081, China
affiliation Zhao Junhua: Beijing Jingwei Information Technologies Co., Ltd., Beijing 100081, China
affiliation Li Gaoke: China Academy of Railway Science, Beijing 100081, China
topic software railway safety problems
tf-idf
base classifier
integrated classifier
evolutionary integration classifier