Pornography web pages classification with textual content analysis using entropy term weighting scheme for small class dataset

The fast growth of the internet makes objectionable web content such as pornography and violence easily accessible to web users, especially children and teenagers. Because some popular web filtering techniques, such as Uniform Resource Locator blocking and Platform for Internet Content Selection checking, are limi...

Bibliographic Details
Main Authors: Sam, Lee Zhi, Maarof, Mohd. Aizaini, Selamat, Ali, Shamsuddin, Siti Mariyam
Format: Conference or Workshop Item
Published: 2007
Subjects: QA75 Electronic computers. Computer science
_version_ 1796855376754245632
author Sam, Lee Zhi
Maarof, Mohd. Aizaini
Selamat, Ali
Shamsuddin, Siti Mariyam
collection ePrints
description The fast growth of the internet makes objectionable web content such as pornography and violence easily accessible to web users, especially children and teenagers. Because some popular web filtering techniques, such as Uniform Resource Locator blocking and Platform for Internet Content Selection checking, are limited against today's dynamic web content, content-based analysis techniques with an effective model are highly desired. In this paper we propose a textual content analysis model that uses an entropy term weighting scheme to classify pornography and sex education web pages. We compare the entropy scheme with two other common term weighting schemes, TFIDF and Glasgow. The schemes are examined extensively with an artificial neural network on a small class dataset. We found that our proposed model achieves better performance in terms of accuracy, convergence speed and stability.
first_indexed 2024-03-05T18:27:43Z
format Conference or Workshop Item
id utm.eprints-14359
institution Universiti Teknologi Malaysia - ePrints
last_indexed 2024-03-05T18:27:43Z
publishDate 2007
record_format dspace
spelling utm.eprints-14359 2017-09-18T07:44:39Z http://eprints.utm.my/14359/
QA75 Electronic computers. Computer science
2007 Conference or Workshop Item PeerReviewed
Sam, Lee Zhi and Maarof, Mohd. Aizaini and Selamat, Ali and Shamsuddin, Siti Mariyam (2007) Pornography web pages classification with textual content analysis using entropy term weighting scheme for small class dataset. In: Postgraduate Annual Research Seminar (PARS ’07), 2007, UTM.
title Pornography web pages classification with textual content analysis using entropy term weighting scheme for small class dataset
topic QA75 Electronic computers. Computer science
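
The description above compares an entropy term weighting scheme with TFIDF, but this record does not give the formulas the authors used. As a rough illustration only, the sketch below computes plain TF-IDF and one common log-entropy formulation (local weight log(1 + tf), global weight 1 + sum(p log p) / log n). The function names, the toy documents and the choice of log-entropy variant are assumptions, not taken from the paper, and the Glasgow scheme is left out because the record does not define it.

```python
import math
from collections import Counter


def term_counts(docs):
    """Lowercase, split on whitespace, and count terms per document."""
    return [Counter(doc.lower().split()) for doc in docs]


def tfidf_weights(counts):
    """Plain TF-IDF: tf * log(n / df). Assumed variant, no smoothing."""
    n = len(counts)
    df = Counter(term for c in counts for term in c)  # document frequency
    return [{t: tf * math.log(n / df[t]) for t, tf in c.items()} for c in counts]


def log_entropy_weights(counts):
    """Log-entropy weighting (assumed variant, see lead-in)."""
    n = len(counts)
    gf = Counter()  # global frequency of each term across all documents
    for c in counts:
        gf.update(c)
    global_w = {}
    for t, total in gf.items():
        s = 0.0
        for c in counts:
            tf = c.get(t, 0)
            if tf:
                p = tf / total
                s += p * math.log(p)
        # Terms spread evenly over all documents get a global weight near 0;
        # terms concentrated in one document get a global weight of 1.
        global_w[t] = 1.0 + s / math.log(n) if n > 1 else 1.0
    return [
        {t: math.log(1 + tf) * global_w[t] for t, tf in c.items()}
        for c in counts
    ]


# Hypothetical toy corpus, for illustration only.
docs = [
    "health clinic advice health",
    "clinic advice forum",
    "advice forum chat",
]
counts = term_counts(docs)
tfidf = tfidf_weights(counts)
entropy = log_entropy_weights(counts)
```

In this toy corpus, "advice" occurs once in every document, so both schemes give it a weight of about zero, while "health" occurs in only one document and keeps a positive weight under both. The two schemes differ in how they treat terms whose occurrences are unevenly spread across documents, which is the kind of behaviour such a comparison would probe.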