ASPECT EXTRACTION IN E-COMMERCE USING LATENT DIRICHLET ALLOCATION (LDA) WITH TERM FREQUENCY-INVERSE DOCUMENT FREQUENCY (TF-IDF)
Main Authors: | Satyawan Agung Nugroho, Fitra A Bachtiar, Randy Cahya Wihandika |
---|---|
Format: | Article |
Language: | English |
Published: | Informatics Department, Engineering Faculty, 2022-01-01 |
Series: | Jurnal Ilmiah Kursor: Menuju Solusi Teknologi Informasi |
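Subjects: | Aspect Extraction; Latent Dirichlet Allocation; Perplexity; Term Frequency - Inverse Document Frequency; Topic Modelling |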
Online Access: | https://kursorjournal.org/index.php/kursor/article/view/247 |
_version_ | 1827866361095782400 |
---|---|
author | Satyawan Agung Nugroho; Fitra A Bachtiar; Randy Cahya Wihandika |
author_sort | Satyawan Agung Nugroho |
collection | DOAJ |
description |
Social media is widely used, and the posts and comments published there express people's feelings and opinions, so they contain important topics that can be extracted. In e-commerce, these topics are valuable because they describe people's opinions of a product. However, the large number of social media users makes finding topics manually difficult, so computer assistance is needed. One method that can be used is Latent Dirichlet Allocation (LDA). LDA is a good method for extracting topics, but its drawback is that the resulting topics are sometimes incomprehensible. To address this drawback, TF-IDF feature selection is applied so that less important words are skipped and LDA can generate better topics. The best hyperparameter values obtained were 10 iterations, 10 topics, and α and β values of 0.1 and 0.01, respectively. The best feature selection percentile is 90. This percentile is used to find the threshold that serves as the lower limit on each word's TF-IDF value, so that only words with a greater TF-IDF value are used as features.
|
first_indexed | 2024-03-12T15:02:47Z |
format | Article |
id | doaj.art-c1793b167d8b4994aa08144e74464861 |
institution | Directory Open Access Journal |
issn | 0216-0544 2301-6914 |
language | English |
last_indexed | 2024-03-12T15:02:47Z |
publishDate | 2022-01-01 |
publisher | Informatics Department, Engineering Faculty |
record_format | Article |
series | Jurnal Ilmiah Kursor: Menuju Solusi Teknologi Informasi |
spelling | doaj.art-c1793b167d8b4994aa08144e74464861; 2023-08-13T20:42:19Z; eng; Informatics Department, Engineering Faculty; Jurnal Ilmiah Kursor: Menuju Solusi Teknologi Informasi; ISSN 0216-0544, 2301-6914; 2022-01-01; vol. 11, no. 2; DOI 10.21107/kursor.v11i2.247; Satyawan Agung Nugroho, Fitra A Bachtiar, Randy Cahya Wihandika; Universitas Brawijaya; https://kursorjournal.org/index.php/kursor/article/view/247 |
title | ASPECT EXTRACTION IN E-COMMERCE USING LATENT DIRICHLET ALLOCATION (LDA) WITH TERM FREQUENCY-INVERSE DOCUMENT FREQUENCY (TF-IDF) |
topic | Aspect Extraction; Latent Dirichlet Allocation; Perplexity; Term Frequency - Inverse Document Frequency; Topic Modelling |
url | https://kursorjournal.org/index.php/kursor/article/view/247 |
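
The percentile-based TF-IDF feature selection mentioned in the description can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the toy documents, the use of each word's maximum TF-IDF score across documents, the unsmoothed IDF, and the linear-interpolation percentile are all assumptions made for illustration. The record states only that a percentile of 90 gave the best result and that the resulting threshold acts as a lower limit on a word's TF-IDF value.

```python
import math
from collections import Counter


def tfidf_scores(docs):
    """Return, for each word, its highest TF-IDF score over all documents.

    Assumption: the per-word score is the maximum over documents; the
    source record does not say how scores are aggregated.
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each word appears.
    df = Counter(word for doc in docs for word in set(doc))
    best = {}
    for doc in docs:
        counts = Counter(doc)
        for word, count in counts.items():
            tf = count / len(doc)
            idf = math.log(n_docs / df[word])  # plain, unsmoothed IDF
            best[word] = max(best.get(word, 0.0), tf * idf)
    return best


def percentile(values, q):
    """q-th percentile (0-100) using linear interpolation between ranks."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("percentile of an empty sequence")
    rank = (len(ordered) - 1) * q / 100
    lo, hi = math.floor(rank), math.ceil(rank)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (rank - lo)


def select_features(docs, q=90):
    """Keep words whose TF-IDF score is at or above the q-th percentile."""
    scores = tfidf_scores(docs)
    threshold = percentile(list(scores.values()), q)
    # The threshold is treated as a lower limit, so ties are kept.
    keep = {word for word, score in scores.items() if score >= threshold}
    return keep, threshold


def filter_docs(docs, vocab):
    """Drop every token that is not in the selected vocabulary."""
    return [[word for word in doc if word in vocab] for doc in docs]


if __name__ == "__main__":
    # Hypothetical tokenized product comments.
    docs = [
        ["fast", "delivery", "good"],
        ["good", "price", "good"],
        ["slow", "delivery", "good"],
    ]
    vocab, threshold = select_features(docs, q=90)
    print(sorted(vocab), round(threshold, 3))
    print(filter_docs(docs, vocab))
```

The filtered documents would then be passed to an LDA implementation configured with the reported settings (10 topics, 10 iterations, α = 0.1, β = 0.01). A word such as "good" that occurs in every document receives an IDF of zero here and is therefore dropped, which matches the stated goal of skipping less important words.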