Web Crawling by Means of Multithreading and Google Numerical Weighting Technique |
Author(s): |
| Amrita Banjare, Dr. C.V. Raman University, Bilaspur; Rohit Miri, Dr. C.V. Raman University, Bilaspur; Khushboo Sharma, Dr. C.V. Raman University, Bilaspur |
Keywords: |
| WWW, Bot, Spider, PR, BFS |
Abstract |
|
A web crawler (also known as a spider or a robot) is a system for the bulk downloading of web pages. Web crawling may appear to be simply an application of the BFS (Breadth First Search) technique, but in reality it poses numerous challenges, including systems concerns such as managing very large data structures. Data structures of this size are a substantial challenge for single-process crawlers. Web crawlers serve various purposes; most importantly, they are one of the main components of search engines and of SEO (Search Engine Optimization). Efficient algorithms are therefore in demand for web crawling, and it has become very important to design an effective crawling procedure that finishes the crawling process in a reasonable amount of time. Many web crawling programs exist, but a web crawler that allows trouble-free customization was required. In this paper we propose an effective crawling mechanism that integrates a multithreaded crawler with the Google numerical weighting technique. The numerical weight of a web page is a “vote” by all other pages on the web. Applying PR (PageRank) should surface high-quality documents, so that the user gets the required pertinent information within a satisfactory time. |
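The combination described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the link graph `LINKS`, the helper `fetch_links`, the worker count, the damping factor of 0.85, and the iteration count are all assumptions made for illustration, and an in-memory dictionary stands in for real page downloads. The crawl proceeds breadth first, fetching each level in parallel with a thread pool, and PageRank is then computed by simple iteration over the crawled graph.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory link graph standing in for real HTTP fetches.
LINKS = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],  # never reached from seed "a"
}


def fetch_links(url):
    """Stand-in for downloading a page and extracting its outgoing links."""
    return LINKS.get(url, [])


def crawl(seed, workers=4):
    """Breadth-first crawl; each BFS level is fetched by a pool of threads."""
    graph = {}
    seen = {seed}
    frontier = [seed]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while frontier:
            # Fetch every page of the current level concurrently.
            results = list(pool.map(fetch_links, frontier))
            next_frontier = []
            for url, out_links in zip(frontier, results):
                graph[url] = out_links
                for link in out_links:
                    if link not in seen:
                        seen.add(link)
                        next_frontier.append(link)
            frontier = next_frontier
    return graph


def pagerank(graph, damping=0.85, iterations=50):
    """Iterative PageRank: each page splits its weight among its out-links."""
    nodes = list(graph)
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(iterations):
        new_rank = {u: (1.0 - damping) / n for u in nodes}
        for u in nodes:
            out_links = [v for v in graph[u] if v in graph]
            if out_links:
                share = damping * rank[u] / len(out_links)
                for v in out_links:
                    new_rank[v] += share
            else:
                # A page without out-links spreads its weight evenly.
                for v in nodes:
                    new_rank[v] += damping * rank[u] / n
        rank = new_rank
    return rank


graph = crawl("a")
rank = pagerank(graph)
best_page = max(rank, key=rank.get)
```

In this toy graph, page `c` receives links from both `a` and `b`, so it collects the most "votes" and ends up with the highest weight. A real crawler would replace `fetch_links` with an HTTP fetch and link extraction, and would add politeness rules and error handling.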
Other Details |
|
Paper ID: IJSRDV3I40467; Published in: Volume 3, Issue 4; Publication Date: 01/07/2015; Page(s): 746-751 |
