Network bandwidth utilization based on collaborative web caching using machine learning algorithms in peer-to-peer systems for media web objects

Bibliographic Details
Main Author: Mohammed, Waheed Yasin
Format: Thesis
Language: English
Published: 2018
Subjects:
Online Access: http://psasir.upm.edu.my/id/eprint/76955/1/FSKTM%202018%2065%20-%20IR.pdf
Description
Summary: Web caching plays a key role in delivering web items to end users on the World Wide Web (WWW). Caching offers many benefits, such as improving hit rates, alleviating load on origin servers, and reducing network traffic. Cache size, however, is a limitation of web caching. Furthermore, retrieving the same media object from the origin server many times consumes network bandwidth. On the other hand, fully caching media objects is not a practical solution: because cache capacity is limited, the storage is consumed by only a few media objects. Moreover, traditional web caching policies such as Least Recently Used (LRU) and Least Frequently Used (LFU) suffer from cache pollution (i.e. media objects that are stored in the cache are not frequently visited, which negatively affects the performance of web proxy caching). This problem has been addressed in the works of Ali et al. (2012a), Ali et al. (2012b), Julian et al. (2014), and Julian and Sagayaraj (2015). For example, in Ali et al. (2012a) and Ali et al. (2012b), the NB-LRU approach improved the Hit Ratio (HR) over LRU by 7.68% on average. In terms of Byte Hit Ratio (BHR), the average improvements achieved by the NB-LRU and NB-LFU approaches over LRU and LFU are 11.65% and 2.88%, respectively. However, these works do not consider the advantages of applying these approaches in peer-to-peer systems. In this work, intelligent collaborative web caching approaches based on the C4.5 decision tree and Naïve Bayes (NB) supervised machine learning algorithms are presented. The proposed approaches take advantage of structured peer-to-peer systems, where the contents of peers' caches are shared in order to enhance the performance of the web caching policy.
The performance of the proposed approaches is evaluated by running simulations on two datasets: one collected from YemenNet, which is the Internet Service Provider (ISP) in Yemen, and one from the IRCache network, which is used as a dataset source in many studies. The results demonstrate that the proposed approaches improve the performance of the traditional LFU and LRU web caching policies in terms of HR, BHR, and Cost Throughput (CT). The results are compared with the most relevant and state-of-the-art web proxy caching policies.
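
The summary reports results as HR and BHR for an LRU baseline. As a minimal sketch of how these two metrics are commonly computed, the Python below replays a request trace through a byte-bounded LRU cache. It is not the thesis's simulator: the function name `simulate_lru`, the `(url, size)` trace format, and the toy trace are assumptions made for illustration, and it uses the standard definitions (HR = hits / requests, BHR = bytes served from cache / bytes requested).

```python
from collections import OrderedDict


def simulate_lru(requests, capacity_bytes):
    """Replay (url, size) requests through a byte-bounded LRU cache.

    Hypothetical helper for illustration only.
    Returns (hit_ratio, byte_hit_ratio).
    """
    cache = OrderedDict()  # url -> size, least recently used first
    used = 0
    hits = hit_bytes = total_bytes = 0
    for url, size in requests:
        total_bytes += size
        if url in cache:
            hits += 1
            hit_bytes += size
            cache.move_to_end(url)  # mark as most recently used
            continue
        if size > capacity_bytes:
            continue  # object can never fit, so it is not cached
        while used + size > capacity_bytes:
            _, evicted_size = cache.popitem(last=False)  # evict LRU entry
            used -= evicted_size
        cache[url] = size
        used += size
    n = len(requests)
    hit_ratio = hits / n if n else 0.0
    byte_hit_ratio = hit_bytes / total_bytes if total_bytes else 0.0
    return hit_ratio, byte_hit_ratio


# Toy trace (made-up data): object name and size in bytes.
trace = [("a", 40), ("b", 50), ("a", 40), ("c", 30), ("b", 50), ("a", 40)]
hr, bhr = simulate_lru(trace, capacity_bytes=100)
print(f"HR={hr:.3f} BHR={bhr:.3f}")
```

On this toy trace only the second request for "a" is a hit, so HR and BHR differ: HR counts every request equally, while BHR weights each request by object size, which is why BHR matters more for large media objects.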