Pornography web pages classification with textual content analysis using entropy term weighting scheme for small class dataset
The rapid growth of the internet makes objectionable web content such as pornography and violence easily accessible to web users, especially children and teenagers. Because some popular web filtering techniques like Uniform Resource Locator blocking and Platform for Internet Content Selection checking are limi...
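Main Authors: | Sam, Lee Zhi; Maarof, Mohd. Aizaini; Selamat, Ali; Shamsuddin, Siti Mariyam |
---|---|
Format: | Conference or Workshop Item |
Published: | 2007 |
Subjects: | QA75 Electronic computers. Computer science |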
_version_ | 1796855376754245632 |
---|---|
author | Sam, Lee Zhi; Maarof, Mohd. Aizaini; Selamat, Ali; Shamsuddin, Siti Mariyam |
author_sort | Sam, Lee Zhi |
collection | ePrints |
description | The rapid growth of the internet makes objectionable web content such as pornography and violence easily accessible to web users, especially children and teenagers. Because some popular web filtering techniques, such as Uniform Resource Locator (URL) blocking and Platform for Internet Content Selection (PICS) checking, are of limited use against today's dynamic web content, content-based analysis techniques with an effective model are highly desirable. In this paper we propose a textual content analysis model that uses an entropy term weighting scheme to classify pornography and sex education web pages. We compare the entropy scheme with two other common term weighting schemes, TFIDF and Glasgow. The schemes are evaluated extensively with an artificial neural network on a small class dataset. We found that our proposed model achieves better performance in terms of accuracy, convergence speed and stability. |
first_indexed | 2024-03-05T18:27:43Z |
format | Conference or Workshop Item |
id | utm.eprints-14359 |
institution | Universiti Teknologi Malaysia - ePrints |
last_indexed | 2024-03-05T18:27:43Z |
publishDate | 2007 |
record_format | dspace |
spelling | utm.eprints-14359 2017-09-18T07:44:39Z http://eprints.utm.my/14359/ 2007 Conference or Workshop Item PeerReviewed Sam, Lee Zhi and Maarof, Mohd. Aizaini and Selamat, Ali and Shamsuddin, Siti Mariyam (2007) Pornography web pages classification with textual content analysis using entropy term weighting scheme for small class dataset. In: Postgraduate Annual Research Seminar (PARS’ 07), 2007, UTM. |
title | Pornography web pages classification with textual content analysis using entropy term weighting scheme for small class dataset |
topic | QA75 Electronic computers. Computer science |