Effective Web Page Crawler
Main Author: | , |
---|---|
Format: | Article |
Language: | English |
Published: | University of Technology, Iraq, 2011-02-01 |
Series: | Engineering and Technology Journal |
Subjects: | |
Online Access: | https://etj.uotechnology.edu.iq/article_26186_36b7272baba534e2fd03611087c6e7c5.pdf |
Summary: | The World Wide Web (WWW) has grown from a few thousand pages in 1993 to more than eight billion pages at present. Because of this explosion in size, web search engines are becoming increasingly important as the primary means of locating relevant information. This research aims to build a crawler that crawls the most important web pages. To that end, a crawling system has been built that consists of three main techniques. The first is the Best-First Technique, which is used to select the most important page. The second is the Distributed Crawling Technique, which is based on UbiCrawler and is used to distribute the URLs of the selected web pages to several machines. The third is the Duplicated Pages Detecting Technique, which uses a proposed document fingerprint algorithm. |
ISSN: | 1681-6900, 2412-0758 |
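
The three techniques named in the summary can be illustrated with a minimal Python sketch. This is not the paper's implementation: the record gives no algorithmic details, so every class and function name here (`BestFirstFrontier`, `HostAssigner`, `DuplicateDetector`, `fingerprint`) is hypothetical, the importance score is assumed to be supplied by the caller, the host-to-machine mapping uses a generic consistent-hash ring in the spirit of UbiCrawler, and a SHA-256 hash of normalized text stands in for the proposed fingerprint algorithm, which is not described here.

```python
import hashlib
import heapq
from bisect import bisect_right
from urllib.parse import urlsplit


class BestFirstFrontier:
    """Best-first frontier: the highest-scoring URL is popped first."""

    def __init__(self):
        self._heap = []
        self._seen = set()
        self._counter = 0  # tie-breaker: equal scores pop in insertion order

    def push(self, url, score):
        # The score is an assumed, caller-supplied importance estimate.
        if url in self._seen:
            return
        self._seen.add(url)
        # heapq is a min-heap, so the score is negated to pop the largest first.
        heapq.heappush(self._heap, (-score, self._counter, url))
        self._counter += 1

    def pop(self):
        neg_score, _, url = heapq.heappop(self._heap)
        return url, -neg_score

    def __len__(self):
        return len(self._heap)


def _hash_int(text):
    """Map a string to a 64-bit integer position on the hash ring."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HostAssigner:
    """Consistent-hash ring that maps each host to one crawler machine."""

    def __init__(self, machines, replicas=64):
        # Each machine gets several ring positions to even out the load.
        self._ring = sorted(
            (_hash_int(f"{m}#{i}"), m) for m in machines for i in range(replicas)
        )
        self._keys = [key for key, _ in self._ring]

    def machine_for(self, url):
        host = urlsplit(url).hostname or ""
        # The first ring position after the host's hash owns the host.
        idx = bisect_right(self._keys, _hash_int(host)) % len(self._ring)
        return self._ring[idx][1]


def fingerprint(text):
    """Stand-in fingerprint (assumption): SHA-256 of lowercased,
    whitespace-normalized text. It only catches exact duplicates."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class DuplicateDetector:
    """Remembers fingerprints of pages already stored."""

    def __init__(self):
        self._fingerprints = set()

    def is_duplicate(self, text):
        fp = fingerprint(text)
        if fp in self._fingerprints:
            return True
        self._fingerprints.add(fp)
        return False
```

In a crawl loop under these assumptions, a coordinator would pop the best URL from the frontier, send it to the machine returned by `machine_for`, and store the fetched page only if `is_duplicate` returns `False`. Assigning by host rather than by full URL keeps all pages of one site on the same machine, which simplifies per-site politeness limits.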