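INCREMENTAL LEARNING UNTUK OPINION MINING PADA TWEET BERBAHASA INDONESIA MENGGUNAKAN DATA STREAM TWITTER [Incremental Learning for Opinion Mining on Indonesian-Language Tweets Using the Twitter Data Stream]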

Bibliographic Details
Main Authors: FAJRI WIRYAWAN; Drs. Edi Winarko, M.Sc., PhD.
Format: Thesis
Published: [Yogyakarta] : Universitas Gadjah Mada 2014
Subjects:
ETD
Description
Summary: Twitter is a fast-growing micro-blogging service whose users often post their opinions. Opinion mining is a computational method for extracting opinions from a tweet, and accuracy is one measure used to evaluate it. Accuracy in opinion mining can be improved by increasing the corpus size; for tweets, the corpus can be enlarged using the Twitter data stream. Incremental learning can be used to process a data stream, and doing so is expected to improve accuracy over time, resulting in self-improvement. This research compares two incremental learning methods: full concept memory, and full concept memory with accuracy comparison. Each method is used with Multinomial Naive Bayes, Binarized Multinomial Naive Bayes, and Multi-variate Bernoulli Naive Bayes, and each updates the concept description after n data items are received, with n = 1, 25, 50, 75, 100, 250, 500, and 750. The methods are compared on accuracy over 250 test data items and on processing time for a stream of 15,000 data items. The comparison shows that both incremental learning methods produce generally increasing accuracy. Binarized Multinomial Naive Bayes produces the best accuracy in both methods and for all values of n. The best accuracy is 80.4%, obtained using Binarized Multinomial Naive Bayes, full concept memory with accuracy comparison, and n = 25. The processing times for Multinomial Naive Bayes and Binarized Multinomial Naive Bayes do not differ much in either method or for any value of n, except for n = 1. Multi-variate Bernoulli Naive Bayes produces the longest processing time in both methods and for all values of n. The longest processing time is about 36 hours, using Multi-variate Bernoulli Naive Bayes, full concept memory with accuracy comparison, and n = 1.
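
The update scheme described in the summary can be illustrated with a minimal Python sketch. It is not the thesis's implementation, and several points are assumptions because the abstract does not specify them: stream items are assumed to arrive already tokenized and labeled; "full concept memory" is read as keeping only the accumulated Naive Bayes counts rather than the raw tweets; and "accuracy comparison" is read as accepting an update only if accuracy on a held-out test set does not drop. The class name `MultinomialNB`, the functions `accuracy` and `process_stream`, and the `binarize` and `compare_accuracy` flags are hypothetical names introduced for illustration.

```python
import copy
import math
from collections import Counter, defaultdict


class MultinomialNB:
    """Multinomial Naive Bayes that learns one document at a time."""

    def __init__(self, binarize=False):
        self.binarize = binarize            # True: count each word once per document
        self.class_docs = Counter()         # documents seen per class
        self.word_counts = defaultdict(Counter)  # word counts per class
        self.vocab = set()

    def update(self, tokens, label):
        if self.binarize:
            tokens = set(tokens)
        self.class_docs[label] += 1
        self.word_counts[label].update(tokens)
        self.vocab.update(tokens)

    def predict(self, tokens):
        if self.binarize:
            tokens = set(tokens)
        total_docs = sum(self.class_docs.values())
        best, best_score = None, -math.inf
        for label, n_docs in self.class_docs.items():
            total_words = sum(self.word_counts[label].values())
            score = math.log(n_docs / total_docs)
            for t in tokens:
                if t in self.vocab:  # words never seen in training are ignored
                    # Laplace (add-one) smoothing
                    score += math.log((self.word_counts[label][t] + 1)
                                      / (total_words + len(self.vocab)))
            if score > best_score:
                best, best_score = label, score
        return best  # None if the model has not been trained yet


def accuracy(model, test_set):
    hits = sum(model.predict(tokens) == label for tokens, label in test_set)
    return hits / len(test_set)


def process_stream(stream, test_set, n, compare_accuracy=False, binarize=False):
    """Update the model after every n stream items.

    With compare_accuracy=True (assumed rule), an update is kept only if
    test accuracy does not decrease; otherwise the batch is discarded.
    """
    model = MultinomialNB(binarize)
    buffer = []
    for item in stream:
        buffer.append(item)
        if len(buffer) < n:
            continue
        candidate = copy.deepcopy(model) if compare_accuracy else model
        for tokens, label in buffer:
            candidate.update(tokens, label)
        buffer.clear()
        if not compare_accuracy or accuracy(candidate, test_set) >= accuracy(model, test_set):
            model = candidate
    return model  # items left in an incomplete final batch are not learned


# Toy data, invented for illustration only.
stream = [(["bagus", "senang"], "pos"), (["jelek", "kecewa"], "neg"),
          (["bagus", "mantap"], "pos"), (["buruk", "kecewa"], "neg")]
test_set = [(["bagus"], "pos"), (["kecewa"], "neg")]
model = process_stream(stream, test_set, n=2, compare_accuracy=True, binarize=True)
```

In this sketch a smaller n means more frequent updates and, with accuracy comparison enabled, more frequent evaluations on the test set, which fits the summary's remark that processing time behaves differently for n = 1.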