SPRATAMA MODEL FOR INDONESIAN PARAPHRASE DETECTION USING BIDIRECTIONAL LONG SHORT-TERM MEMORY AND BIDIRECTIONAL GATED RECURRENT UNIT
Paraphrasing is rewriting a sentence in other words while keeping the same intent or purpose. Automatic paraphrase detection can be done using Natural Language Sentence Matching (NLSM), which is part of Natural Language Processing (NLP). NLP is a computational technique for processing text in general, w...
Main Authors: | Titin Siswantining, Stanley Pratama, Devvi Sarwinda |
---|---|
Format: | Article |
Language: | English |
Published: | Universitas Diponegoro, 2023-04-01 |
Series: | Media Statistika |
Subjects: | natural language processing, natural language sentence matching, recurrent neural network |
Online Access: | https://ejournal.undip.ac.id/index.php/media_statistika/article/view/46412 |
_version_ | 1797394884865294336 |
---|---|
author | Titin Siswantining, Stanley Pratama, Devvi Sarwinda |
author_facet | Titin Siswantining, Stanley Pratama, Devvi Sarwinda |
author_sort | Titin Siswantining |
collection | DOAJ |
description | Paraphrasing is rewriting a sentence in other words while keeping the same intent or purpose. Automatic paraphrase detection can be done using Natural Language Sentence Matching (NLSM), which is part of Natural Language Processing (NLP). NLP is a computational technique for processing text in general, while NLSM is used specifically to find the relationship between two sentences. With the development of neural networks (NN), NLP tasks can now be handled more easily by computers. More models for detecting paraphrases and for paraphrasing have been developed for English than for Indonesian, which has less training data. This study proposes the SPratama Model, which performs paraphrase detection for Indonesian using Recurrent Neural Networks (RNN), namely Bidirectional Long Short-Term Memory (BiLSTM) and Bidirectional Gated Recurrent Unit (BiGRU). The data used is the "Quora Question Pairs" dataset, taken from Kaggle and translated into Indonesian using Google Translate. The results indicate that the proposed model achieves an accuracy of around 80% in detecting paraphrased sentences. |
first_indexed | 2024-03-09T00:26:14Z |
format | Article |
id | doaj.art-566fef42131641e4bc4ec37ca69dfdc2 |
institution | Directory of Open Access Journals |
issn | 1979-3693 2477-0647 |
language | English |
last_indexed | 2024-03-09T00:26:14Z |
publishDate | 2023-04-01 |
publisher | Universitas Diponegoro |
record_format | Article |
series | Media Statistika |
spelling | doaj.art-566fef42131641e4bc4ec37ca69dfdc2; 2023-12-12T02:27:52Z; eng; Universitas Diponegoro; Media Statistika; 1979-3693; 2477-0647; 2023-04-01; 15; 2; 129; 138; 10.14710/medstat.15.2.129-138; 21857; SPRATAMA MODEL FOR INDONESIAN PARAPHRASE DETECTION USING BIDIRECTIONAL LONG SHORT-TERM MEMORY AND BIDIRECTIONAL GATED RECURRENT UNIT; Titin Siswantining [0] https://orcid.org/0000-0001-5160-0020; Stanley Pratama [1]; Devvi Sarwinda [2] https://orcid.org/0000-0001-8644-2560; [0] Department of Mathematics, Universitas Indonesia, Indonesia; [1] Department of Mathematics, Universitas Indonesia, Indonesia; [2] Department of Mathematics, Universitas Indonesia, Indonesia; Paraphrasing is rewriting a sentence in other words while keeping the same intent or purpose. Automatic paraphrase detection can be done using Natural Language Sentence Matching (NLSM), which is part of Natural Language Processing (NLP). NLP is a computational technique for processing text in general, while NLSM is used specifically to find the relationship between two sentences. With the development of neural networks (NN), NLP tasks can now be handled more easily by computers. More models for detecting paraphrases and for paraphrasing have been developed for English than for Indonesian, which has less training data. This study proposes the SPratama Model, which performs paraphrase detection for Indonesian using Recurrent Neural Networks (RNN), namely Bidirectional Long Short-Term Memory (BiLSTM) and Bidirectional Gated Recurrent Unit (BiGRU). The data used is the "Quora Question Pairs" dataset, taken from Kaggle and translated into Indonesian using Google Translate. The results indicate that the proposed model achieves an accuracy of around 80% in detecting paraphrased sentences.; https://ejournal.undip.ac.id/index.php/media_statistika/article/view/46412; natural language processing; natural language sentence matching; recurrent neural network |
spellingShingle | Titin Siswantining; Stanley Pratama; Devvi Sarwinda; SPRATAMA MODEL FOR INDONESIAN PARAPHRASE DETECTION USING BIDIRECTIONAL LONG SHORT-TERM MEMORY AND BIDIRECTIONAL GATED RECURRENT UNIT; Media Statistika; natural language processing; natural language sentence matching; recurrent neural network |
title | SPRATAMA MODEL FOR INDONESIAN PARAPHRASE DETECTION USING BIDIRECTIONAL LONG SHORT-TERM MEMORY AND BIDIRECTIONAL GATED RECURRENT UNIT |
title_sort | spratama model for indonesian paraphrase detection using bidirectional long short term memory and bidirectional gated recurrent unit |
topic | natural language processing, natural language sentence matching, recurrent neural network |
url | https://ejournal.undip.ac.id/index.php/media_statistika/article/view/46412 |