
Novel Database-Centric Framework for Incremental Information Extraction


Mr. P. Sasikumar, Selvam College of Technology, Namakkal; K. Keerthana, Selvam College of Technology, Namakkal


Information extraction, natural language processing, framework.


Information extraction (IE) is an active research area that seeks techniques to uncover information from large collections of text. IE is the task of automatically extracting structured information from unstructured and/or semi-structured machine-readable documents. In most cases this involves processing human-language text by means of natural language processing (NLP). Recent work in document processing, such as automatic annotation and content extraction, can also be seen as information extraction. Many applications call for methods that automatically extract structured information from unstructured natural language text. Because of the inherent challenges of natural language processing, most existing methods for information extraction from text tend to be domain specific. This project proposes a new paradigm for information extraction. In the proposed framework, the intermediate output of each text processing component is stored, so that when a component is improved, only that component has to be redeployed over the entire corpus. Extraction is then performed on both the previously processed data from the unchanged components and the updated data generated by the improved component. Such incremental extraction can greatly reduce processing time. The framework also provides a mechanism to generate extraction queries from both labeled and unlabeled data. Query generation is critical because it lets casual users specify their information needs without learning the query language.
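The incremental idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the table layout, the `process` and `extract` functions, the version numbers, and the toy dictionary-based taggers are all hypothetical names chosen for illustration, and SQLite stands in for whatever database the framework actually uses. The sketch stores each component's intermediate output, re-runs a component only where its stored output is missing or out of date, and expresses extraction as a query over the stored data.

```python
import sqlite3


def open_store():
    """Create an in-memory store for intermediate outputs (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE processed (doc_id TEXT, component TEXT, version INTEGER,"
        " PRIMARY KEY (doc_id, component))"
    )
    conn.execute(
        "CREATE TABLE annotation (doc_id TEXT, component TEXT, token TEXT, label TEXT)"
    )
    return conn


def process(conn, corpus, components):
    """Run each component only on documents whose stored output is missing or stale.

    Returns the number of (component, document) runs actually performed.
    """
    runs = 0
    for name, (version, annotate) in components.items():
        for doc_id, text in corpus.items():
            row = conn.execute(
                "SELECT version FROM processed WHERE doc_id = ? AND component = ?",
                (doc_id, name),
            ).fetchone()
            if row is not None and row[0] == version:
                continue  # stored output is current, so it is reused as-is
            # Replace this component's old output for this document.
            conn.execute(
                "DELETE FROM annotation WHERE doc_id = ? AND component = ?",
                (doc_id, name),
            )
            conn.executemany(
                "INSERT INTO annotation VALUES (?, ?, ?, ?)",
                [(doc_id, name, token, label) for token, label in annotate(text)],
            )
            conn.execute(
                "INSERT OR REPLACE INTO processed VALUES (?, ?, ?)",
                (doc_id, name, version),
            )
            runs += 1
    return runs


def extract(conn):
    """Extraction as a query: PERSON and ORG tokens that share a document."""
    return conn.execute(
        "SELECT p.doc_id, p.token, o.token FROM annotation p"
        " JOIN annotation o ON p.doc_id = o.doc_id"
        " WHERE p.label = 'PERSON' AND o.label = 'ORG'"
        " ORDER BY p.doc_id, p.token, o.token"
    ).fetchall()


# Toy components (assumptions for illustration): simple dictionary lookups.
def person_tagger_v1(text):
    return [(w, "PERSON") for w in text.split() if w in {"Alice"}]


def person_tagger_v2(text):
    return [(w, "PERSON") for w in text.split() if w in {"Alice", "Bob"}]


def org_tagger(text):
    return [(w, "ORG") for w in text.split() if w in {"Acme"}]


corpus = {"d1": "Alice joined Acme", "d2": "Bob joined Acme"}

conn = open_store()
first_runs = process(
    conn, corpus, {"person": (1, person_tagger_v1), "org": (1, org_tagger)}
)
first_result = extract(conn)

# Only the person tagger is improved; the stored org output is reused.
second_runs = process(
    conn, corpus, {"person": (2, person_tagger_v2), "org": (1, org_tagger)}
)
second_result = extract(conn)

print(first_runs, first_result)
print(second_runs, second_result)
```

In this sketch, the second call to `process` re-runs only the improved tagger over the corpus, while the extraction query reads both the unchanged and the updated intermediate data. The query itself is hand-written here; generating such queries from labeled and unlabeled data, as the abstract describes, is outside the scope of the sketch.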

Other Details

Paper ID: IJSRDV1I4005
Published in: Volume : 1, Issue : 4
Publication Date: 01/07/2013
Page(s): 827-830
