SPRATAMA MODEL FOR INDONESIAN PARAPHRASE DETECTION USING BIDIRECTIONAL LONG SHORT-TERM MEMORY AND BIDIRECTIONAL GATED RECURRENT UNIT

Paraphrasing is a way to write sentences with other words with the same intent or purpose. Automatic paraphrase detection can be done using Natural Language Sentence Matching (NLSM) which is part of Natural Language Processing (NLP). NLP is a computational technique for processing text in general, w...


Bibliographic Details
Main Authors: Titin Siswantining, Stanley Pratama, Devvi Sarwinda
Format: Article
Language: English
Published: Universitas Diponegoro 2023-04-01
Series: Media Statistika
Subjects: natural language processing, natural language sentence matching, recurrent neural network
Online Access: https://ejournal.undip.ac.id/index.php/media_statistika/article/view/46412
author Titin Siswantining
Stanley Pratama
Devvi Sarwinda
collection DOAJ
description Paraphrasing is the restatement of a sentence in different words while preserving its intent or meaning. Automatic paraphrase detection can be done using Natural Language Sentence Matching (NLSM), which is part of Natural Language Processing (NLP). NLP is a computational technique for processing text in general, while NLSM is used specifically to find the relationship between two sentences. With the development of neural networks (NN), NLP tasks can now be carried out more easily by computers. Many more paraphrase detection models have been developed for English than for Indonesian, which has less training data. This study proposes the SPratama Model, a paraphrase detection model for Indonesian that uses recurrent neural networks (RNN), namely Bidirectional Long Short-Term Memory (BiLSTM) and Bidirectional Gated Recurrent Unit (BiGRU). The data used are the "Quora Question Pairs" dataset taken from Kaggle and translated into Indonesian using Google Translate. The results indicate that the proposed model achieves an accuracy of around 80% in detecting paraphrased sentences.
first_indexed 2024-03-09T00:26:14Z
format Article
id doaj.art-566fef42131641e4bc4ec37ca69dfdc2
institution Directory of Open Access Journals
issn 1979-3693
2477-0647
language English
last_indexed 2024-03-09T00:26:14Z
publishDate 2023-04-01
publisher Universitas Diponegoro
record_format Article
series Media Statistika
spelling doaj.art-566fef42131641e4bc4ec37ca69dfdc2
2023-12-12T02:27:52Z
eng
Universitas Diponegoro
Media Statistika, ISSN 1979-3693, 2477-0647
2023-04-01, vol. 15, no. 2, pp. 129-138
doi:10.14710/medstat.15.2.129-138
21857
Titin Siswantining, https://orcid.org/0000-0001-5160-0020, Department of Mathematics, Universitas Indonesia, Indonesia
Stanley Pratama, Department of Mathematics, Universitas Indonesia, Indonesia
Devvi Sarwinda, https://orcid.org/0000-0001-8644-2560, Department of Mathematics, Universitas Indonesia, Indonesia
title SPRATAMA MODEL FOR INDONESIAN PARAPHRASE DETECTION USING BIDIRECTIONAL LONG SHORT-TERM MEMORY AND BIDIRECTIONAL GATED RECURRENT UNIT
topic natural language processing
natural language sentence matching
recurrent neural network
url https://ejournal.undip.ac.id/index.php/media_statistika/article/view/46412