ASPECT EXTRACTION IN E-COMMERCE USING LATENT DIRICHLET ALLOCATION (LDA) WITH TERM FREQUENCY-INVERSE DOCUMENT FREQUENCY (TF-IDF)


Bibliographic Details
Main Authors: Satyawan Agung Nugroho, Fitra A Bachtiar, Randy Cahya Wihandika
Format: Article
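Language: English
Published: Informatics Department, Engineering Faculty, 2022-01-01
Series: Jurnal Ilmiah Kursor: Menuju Solusi Teknologi Informasi
Subjects: Aspect Extraction; Latent Dirichlet Allocation; Perplexity; Term Frequency - Inverse Document Frequency; Topic Modelling
Online Access: https://kursorjournal.org/index.php/kursor/article/view/247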
collection DOAJ
description Social media is widely used, and the posts and comments found on it express people's feelings and opinions, so important topics can be extracted from them. In e-commerce, topics are of interest because they can describe people's opinions of a product. However, the large number of social media users makes finding topics from social media difficult, so computer assistance is needed. One method that can be used is Latent Dirichlet Allocation (LDA). LDA is a good method for extracting topics, but its drawback is that the topics are sometimes incomprehensible. To address this drawback, TF-IDF feature selection is used to skip less important words so that LDA can generate better topics. The best hyperparameter values obtained were 10 iterations, 10 topics, and α and β values of 0.1 and 0.01, respectively. The best feature selection percentile is 90. This percentile is used to find the threshold that serves as the lower limit on each word's TF-IDF value, so that only words with a greater TF-IDF value are used as features.
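The percentile-based feature selection described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`tfidf_scores`, `percentile`, `select_features`), the use of the maximum TF-IDF across documents as each word's score, the unsmoothed logarithmic IDF, and the linear-interpolation percentile are all assumptions made for illustration. The filtered documents would then be passed to an LDA implementation configured with the reported settings (10 topics, 10 iterations, α = 0.1, β = 0.01), which is not shown here.

```python
import math
from collections import Counter


def tfidf_scores(docs):
    """Return one TF-IDF score per word (assumed: max over documents)."""
    n_docs = len(docs)
    # Document frequency: number of documents containing each word.
    df = Counter(word for doc in docs for word in set(doc))
    scores = {}
    for doc in docs:
        counts = Counter(doc)
        for word, count in counts.items():
            tf = count / len(doc)
            idf = math.log(n_docs / df[word])  # assumed: no smoothing
            scores[word] = max(scores.get(word, 0.0), tf * idf)
    return scores


def percentile(values, q):
    """Percentile with linear interpolation, q in [0, 100]."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100
    lo, hi = math.floor(pos), math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def select_features(docs, q=90):
    """Keep only words whose TF-IDF score is at or above the q-th percentile."""
    scores = tfidf_scores(docs)
    threshold = percentile(scores.values(), q)
    kept = {word for word, score in scores.items() if score >= threshold}
    filtered = [[word for word in doc if word in kept] for doc in docs]
    return filtered, threshold


# Hypothetical tokenized reviews, used only to exercise the functions.
docs = [
    ["good", "price", "fast"],
    ["good", "delivery", "slow"],
    ["good", "price", "cheap"],
]
filtered, threshold = select_features(docs, q=50)
```

In this toy input the word "good" appears in every document, so its IDF is zero and it falls below the threshold; a higher percentile such as the reported 90 simply keeps a smaller share of the vocabulary.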
first_indexed 2024-03-12T15:02:47Z
format Article
id doaj.art-c1793b167d8b4994aa08144e74464861
institution Directory Open Access Journal
issn 0216-0544
2301-6914
language English
last_indexed 2024-03-12T15:02:47Z
spelling doaj.art-c1793b167d8b4994aa08144e74464861
2023-08-13T20:42:19Z
eng
Informatics Department, Engineering Faculty
Jurnal Ilmiah Kursor: Menuju Solusi Teknologi Informasi
0216-0544
2301-6914
2022-01-01
Vol. 11, No. 2
10.21107/kursor.v11i2.247
Universitas Brawijaya
topic Aspect Extraction
Latent Dirichlet Allocation
Perplexity
Term Frequency - Inverse Document Frequency
Topic Modelling