A subject identification method based on term frequency technique
Analyzing and extracting important information from a text document is crucial and has generated interest in the areas of text mining and information retrieval. This process is used to pick out particular details in the text. Furthermore, readers tend to read almost eve...
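Main Authors: | Jamil, Nurul Syafidah; Ku-Mahamud, Ku Ruhana; Mohamed Din, Aniza; Ahmad, Faudziah; Che Pa, Noraziah; Wan Ishak, Wan Hussain; Din, Roshidi; Ahmad, Farzana Kabir |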
---|---|
Format: | Article |
Language: | English |
Published: | ACCENTS 2017 |
Subjects: | QA75 Electronic computers. Computer science |
Online Access: | https://repo.uum.edu.my/id/eprint/25538/1/IJACR%207%2030%202017%20%20103%20110.pdf |
_version_ | 1825805307321778176 |
---|---|
author | Jamil, Nurul Syafidah; Ku-Mahamud, Ku Ruhana; Mohamed Din, Aniza; Ahmad, Faudziah; Che Pa, Noraziah; Wan Ishak, Wan Hussain; Din, Roshidi; Ahmad, Farzana Kabir |
collection | UUM |
description | Analyzing and extracting important information from a text document is crucial and has generated interest in the areas of text mining and information retrieval. This process is used to pick out particular details in the text. Furthermore, readers tend to read almost everything in a text document to find specific information. However, reading a text document takes time to complete, and extracting information from it takes additional time. Thus, classifying text by subject can guide a person to relevant information. In this paper, a subject identification method based on term frequency is proposed to categorize groups of text into a particular subject. Since term frequency tends to ignore the semantics of a document, a term extraction algorithm is introduced to improve the relevance of the terms extracted from the text. The evaluation of the extracted terms shows that the proposed method outperforms other extraction techniques. |
first_indexed | 2024-07-04T06:30:08Z |
format | Article |
id | uum-25538 |
institution | Universiti Utara Malaysia |
language | English |
last_indexed | 2024-07-04T06:30:08Z |
publishDate | 2017 |
publisher | ACCENTS |
record_format | eprints |
spelling | uum-25538 2019-01-31T06:43:01Z https://repo.uum.edu.my/id/eprint/25538/ PeerReviewed application/pdf en Jamil, Nurul Syafidah and Ku-Mahamud, Ku Ruhana and Mohamed Din, Aniza and Ahmad, Faudziah and Che Pa, Noraziah and Wan Ishak, Wan Hussain and Din, Roshidi and Ahmad, Farzana Kabir (2017) A subject identification method based on term frequency technique. International Journal of Advanced Computer Research, 7 (30). pp. 103-110. ISSN 2249-7277 http://doi.org/10.19101/IJACR.2017.730020 doi:10.19101/IJACR.2017.730020 |
title | A subject identification method based on term frequency technique |
topic | QA75 Electronic computers. Computer science |
url | https://repo.uum.edu.my/id/eprint/25538/1/IJACR%207%2030%202017%20%20103%20110.pdf |
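
The abstract describes assigning text to a subject on the basis of term frequency. The record does not give the authors' algorithm, so the following is only a minimal sketch of the general idea, not the method from the paper. The stopword list, the subject vocabularies, and the function names `term_frequencies` and `identify_subject` are all hypothetical and chosen for illustration.

```python
import re
from collections import Counter

# Hypothetical stopword list and subject vocabularies (illustration only;
# none of these come from the paper).
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "for", "on"}
SUBJECT_TERMS = {
    "computing": {"algorithm", "data", "software", "network"},
    "medicine": {"patient", "disease", "treatment", "clinical"},
}


def term_frequencies(text):
    """Return the relative frequency of each non-stopword term in the text."""
    tokens = [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]
    counts = Counter(tokens)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {term: n / total for term, n in counts.items()}


def identify_subject(text, subject_terms=SUBJECT_TERMS):
    """Score each subject by the summed frequency of its terms in the text.

    Returns (best_subject, scores); best_subject is None when no subject
    term occurs in the text at all.
    """
    tf = term_frequencies(text)
    scores = {
        subject: sum(tf.get(term, 0.0) for term in terms)
        for subject, terms in subject_terms.items()
    }
    best = max(scores, key=scores.get)
    return (best if scores[best] > 0 else None), scores


if __name__ == "__main__":
    sample = "The algorithm processes data and the software sends data to the network."
    subject, scores = identify_subject(sample)
    print(subject, scores)
```

A purely frequency-based score like this ignores word meaning and context, which is the limitation the abstract says the proposed term extraction algorithm is meant to address.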