
Survey On Discovering Deep Web Interfaces Using Data Mining

Author(s):

Ms. Roshana R. Bangar, SPCOE, Dumbarwadi; Mr. Kahate S. A., SPCOE, Dumbarwadi; Mr. Deokate G. D., SPCOE, Dumbarwadi

Keywords:

Deep Web, Two-Stage Crawler, Feature Selection, Ranking, Adaptive Learning

Abstract

The Deep Web, i.e., content hidden behind HTML forms, has long been acknowledged as a significant gap in search engine coverage. It represents a large portion of the structured data on the Web, and accessing Deep Web content has been a long-standing challenge for the database community. Deep web crawling is a fundamental problem for web crawlers and has a profound effect on search engine efficiency. A recent study shows that nearly 96% of the data on the Internet is hidden, i.e., not indexed by search engines. The challenge for search engines is to retrieve this hidden web data at low cost. This system uses a machine learning approach that is fully automatic, highly scalable, and efficient, which helps to improve data retrieval at reduced cost. It uses a focused crawling strategy to retrieve accurate results for a query, selecting only relevant links according to their similarity to the query. The algorithm used in this system efficiently selects only promising candidates for inclusion into our web search index, rather than searching the whole search space.
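
The link-selection idea in the abstract, keeping only links whose similarity to the query is high enough, can be illustrated with a minimal sketch. The abstract does not specify the similarity measure or features, so everything here is an assumption made for illustration: cosine similarity over word counts taken from the URL and anchor text, the function names `tokenize`, `cosine_similarity`, and `select_relevant_links`, and the `threshold` and `top_k` parameters. It is not the authors' algorithm.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_relevant_links(query, links, threshold=0.2, top_k=10):
    """Return (score, url) pairs for links similar to the query, best first.

    `links` is a list of (url, anchor_text) pairs. The threshold and top_k
    values are hypothetical defaults, not taken from the paper.
    """
    query_vec = Counter(tokenize(query))
    scored = []
    for url, anchor_text in links:
        # Assumed feature set: words from the URL plus the anchor text.
        link_vec = Counter(tokenize(url + " " + anchor_text))
        score = cosine_similarity(query_vec, link_vec)
        if score >= threshold:
            scored.append((score, url))
    # Visit the most query-like links first instead of the whole search space.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    candidates = [
        ("http://example.com/books/search", "Search used books"),
        ("http://example.com/about", "About us"),
        ("http://example.com/flights", "Cheap flight search"),
    ]
    for score, url in select_relevant_links("book search", candidates):
        print(f"{score:.3f}  {url}")
```

With these hypothetical inputs, the "about" page shares no terms with the query and is dropped, while the two search pages are kept and ordered by score. A learned ranker, as the abstract suggests, would replace the fixed cosine score with a model trained on previously crawled links.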

Other Details

Paper ID: IJSRDV3I110162
Published in: Volume : 3, Issue : 11
Publication Date: 01/02/2016
Page(s): 446-447
