Systematic Design and Evaluation of a Citation Function Classification Scheme in Indonesian Journals
Classifying citations according to function has many benefits when it comes to information retrieval tasks, scholarly communication studies, and ranking metric developments. Many citation function classification schemes have been proposed, but most of them have not been systematically designed for a...
Main Authors: | Yaniasih Yaniasih, Indra Budi |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2021-06-01 |
Series: | Publications |
Subjects: | citation function; classification scheme; annotator agreement; machine learning; deep learning |
Online Access: | https://www.mdpi.com/2304-6775/9/3/27 |
_version_ | 1797528460103516160 |
---|---|
author | Yaniasih Yaniasih; Indra Budi |
author_facet | Yaniasih Yaniasih; Indra Budi |
author_sort | Yaniasih Yaniasih |
collection | DOAJ |
description | Classifying citations according to function has many benefits when it comes to information retrieval tasks, scholarly communication studies, and ranking metric developments. Many citation function classification schemes have been proposed, but most of them have not been systematically designed for an extensive literature-based compilation process. Many schemes were also not evaluated properly before being used for classification experiments utilizing large datasets. This paper aimed to build and evaluate new citation function categories based upon sufficient scientific evidence. A total of 2153 citation sentences were collected from Indonesian journal articles for our dataset. To identify the new categories, a literature survey was conducted, analyses and groupings of category meanings were carried out, and then categories were selected based on the dataset’s characteristics and the purpose of the classification. The evaluation used five criteria: coherence, ease, utility, balance, and coverage. Fleiss’ kappa and automatic classification metrics using machine learning and deep learning algorithms were used to assess the criteria. These methods resulted in five citation function categories. The scheme’s coherence and ease of use were quite good, as indicated by an inter-annotator agreement value of 0.659 and a Long Short-Term Memory (LSTM) F1-score of 0.93. According to the balance and coverage criteria, the scheme still needs to be improved. The research data were limited to journals in food science published in Indonesia. Future research will involve classifying the citation function using a massive dataset collected from various scientific fields and published in several representative countries, as well as applying improved annotation schemes and deep learning methods. |
first_indexed | 2024-03-10T09:59:37Z |
format | Article |
id | doaj.art-57fdec10b6904e759b1c3d06809d27b4 |
institution | Directory Open Access Journal |
issn | 2304-6775 |
language | English |
last_indexed | 2024-03-10T09:59:37Z |
publishDate | 2021-06-01 |
publisher | MDPI AG |
record_format | Article |
series | Publications |
spelling | doaj.art-57fdec10b6904e759b1c3d06809d27b4; 2023-11-22T02:03:20Z; eng; MDPI AG; Publications; 2304-6775; 2021-06-01; 9; 3; 27; 10.3390/publications9030027; Systematic Design and Evaluation of a Citation Function Classification Scheme in Indonesian Journals; Yaniasih Yaniasih (0); Indra Budi (1); Faculty of Computer Science, Universitas Indonesia, Depok 16424, Indonesia (0, 1); https://www.mdpi.com/2304-6775/9/3/27; citation function; classification scheme; annotator agreement; machine learning; deep learning |
title | Systematic Design and Evaluation of a Citation Function Classification Scheme in Indonesian Journals |
topic | citation function; classification scheme; annotator agreement; machine learning; deep learning |
url | https://www.mdpi.com/2304-6775/9/3/27 |
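
The description above reports inter-annotator agreement as a Fleiss' kappa value. As a rough illustration of how such a value is computed, the following is a minimal sketch of the standard Fleiss' kappa formula, not the authors' code. The function name `fleiss_kappa` and the toy count table are hypothetical; the table is not data from the article.

```python
from typing import Sequence


def fleiss_kappa(counts: Sequence[Sequence[int]]) -> float:
    """Fleiss' kappa for a table where counts[i][j] is the number of
    annotators who assigned item i to category j.

    Every item must be rated by the same number of annotators (at least 2).
    """
    n_items = len(counts)
    if n_items == 0:
        raise ValueError("at least one item is required")
    n_raters = sum(counts[0])
    if n_raters < 2:
        raise ValueError("at least two annotators per item are required")
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every item needs the same number of ratings")
    n_categories = len(counts[0])

    # Share of all assignments that went to each category.
    p_j = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    # Observed agreement per item: fraction of annotator pairs that agree.
    p_i = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_i) / n_items          # mean observed agreement
    p_e = sum(p * p for p in p_j)       # agreement expected by chance

    if p_e == 1.0:
        # Assumption made for this sketch: if every rating falls in a single
        # category, kappa is formally undefined; 1.0 is returned here.
        return 1.0
    return (p_bar - p_e) / (1.0 - p_e)


if __name__ == "__main__":
    # Hypothetical toy table: 4 citation sentences, 3 annotators,
    # 5 categories (the number of categories matches the scheme's size only).
    toy_counts = [
        [3, 0, 0, 0, 0],
        [0, 2, 1, 0, 0],
        [0, 0, 3, 0, 0],
        [1, 0, 0, 2, 0],
    ]
    print(f"Fleiss' kappa: {fleiss_kappa(toy_counts):.3f}")
```

The count-table input is used here because it needs only per-category tallies for each sentence, so individual annotator identities do not have to be tracked.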