Mining Text Data Using Side Information |
Author(s): |
| Gajalakshmi. S, PPG Institute Of Technology; Timotta. C, PPG Institute Of Technology; Gayathiri. D, PPG Institute Of Technology; Vinisha. K, PPG Institute Of Technology |
Keywords: |
| Mining Text Data, Clustering |
Abstract |
|
Many text mining applications contain side information, also called attributes, along with the text itself. This side information takes different forms, such as document links, the history of the document, user web logs, and the access behavior of various users, and it may also consist of non-textual attributes that accompany the text documents. Such side information can carry a tremendous amount of information that is useful for clustering. However, the relative importance of the side information can be difficult to estimate, especially when some of it is noisy. In that case it is risky to incorporate side information into the mining process, because it can either improve the quality of the text representation or add noise to the process. A principled method is therefore needed to maximize the advantages of using side information. The COATES algorithm combines classical partitioning algorithms with probabilistic models to create an effective clustering approach. The approach is then extended to classification with the COLT algorithm, and experimental results on a number of real data sets are presented to show the advantages of using this approach. |
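The general idea in the abstract, a partitioning step on text combined with a probabilistic use of side attributes, can be illustrated with a small Python sketch. This is not the COATES algorithm itself, whose details are not given here. The function names (`cosine`, `centroid`, `attribute_weight`, `cluster_with_side_info`), the purity-based noise discount, the smoothing constant, and the toy data are all assumptions made for illustration.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def centroid(docs):
    """Average term-weight vector of a non-empty list of sparse documents."""
    total = Counter()
    for d in docs:
        total.update(d)
    return {t: w / len(docs) for t, w in total.items()}

def attribute_weight(attr, side, labels, k):
    """Hypothetical noise discount: 0 if the attribute is spread evenly
    over the clusters, 1 if all its documents fall in a single cluster."""
    counts = [0] * k
    for attrs, c in zip(side, labels):
        if attr in attrs:
            counts[c] += 1
    n = sum(counts)
    if n == 0 or k < 2:
        return 0.0
    purity = sum((c / n) ** 2 for c in counts)  # ranges from 1/k to 1
    return (purity - 1 / k) / (1 - 1 / k)

def cluster_with_side_info(texts, side, k, iters=10, smoothing=1.0):
    """Assumed scheme: score = text similarity * side-attribute probability."""
    n = len(texts)
    # Text-only initialization from k evenly spaced seed documents.
    seeds = [texts[i * n // k] for i in range(k)]
    labels = [max(range(k), key=lambda c: cosine(d, seeds[c])) for d in texts]
    for _ in range(iters):
        cents = []
        for c in range(k):
            members = [texts[i] for i in range(n) if labels[i] == c]
            cents.append(centroid(members) if members else {})
        weights = {a: attribute_weight(a, side, labels, k)
                   for a in set().union(*side)}
        new_labels = []
        for doc, attrs in zip(texts, side):
            scores = []
            for c in range(k):
                score = cosine(doc, cents[c])
                for a in attrs:
                    holders = [labels[j] for j in range(n) if a in side[j]]
                    p = (holders.count(c) + smoothing) / (len(holders) + smoothing * k)
                    # Noisy attributes are pulled toward the uniform prior 1/k.
                    score *= weights[a] * p + (1 - weights[a]) / k
                scores.append(score)
            new_labels.append(max(range(k), key=scores.__getitem__))
        if new_labels == labels:  # converged
            break
        labels = new_labels
    return labels

# Toy data (made up): term counts plus one side attribute per document.
texts = [{"cat": 2, "pet": 1}, {"cat": 1, "pet": 2},
         {"stock": 2, "market": 1}, {"stock": 1, "market": 2}]
side = [{"u1"}, {"u1"}, {"u2"}, {"u2"}]
labels = cluster_with_side_info(texts, side, k=2)
print(labels)
```

In this sketch an attribute that is spread evenly across clusters receives a weight near zero, so it contributes almost nothing to the assignment. That is one simple way to limit the risk from noisy side information that the abstract describes.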
Other Details |
|
Paper ID: IJSRDV3I1060; Published in: Volume 3, Issue 1; Publication Date: 01/04/2015; Page(s): 653-655
