Fuzzy k-mode clustering a Document Summarization System for Document Retrieval
Abstract
Document retrieval has received much attention from the data mining community since the growth of the World Wide Web and the resulting increase in stored information. Various algorithms have been developed for document retrieval using different document features and similarity measures. However, the computational time of an algorithm increases whenever new features and similarity measures are added to the retrieval process. The computational effort should be minimized without significantly affecting the effectiveness of the results, which has recently become a new direction of research in the data mining community. Accordingly, a summarization system is used to shorten each text document, and retrieval is performed on the contents of the summary. In our work, clustering is used along with the summarization system to further reduce the complexity of the retrieval process. Specifically, fuzzy k-mode clustering and keyword-based features are used in the document retrieval algorithm. The advantage of this method is that matching needs to be done only against the representative keywords identified by the clustering algorithm. Finally, experiments are carried out with 100 documents, and the results are compared with an existing algorithm using precision, recall and F-measure.
Key Words: Document retrieval, World Wide Web, fuzzy k-mode clustering, Summarization, Similarity measure.
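The retrieval idea described in the abstract, matching a query only against each cluster's representative keywords and then scoring the result with precision, recall and F-measure, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the cluster layout, the function names, the toy data, and the use of Jaccard overlap as the similarity measure are hypothetical, and the fuzzy k-mode clustering step itself is taken as already done. It is not the authors' implementation.

```python
def jaccard(a, b):
    """Jaccard overlap between two keyword sets (assumed similarity measure)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def retrieve(query_keywords, clusters):
    """Return the documents of the cluster whose representative keywords
    best match the query. The query is compared with one keyword set per
    cluster rather than with every document."""
    best = max(clusters, key=lambda c: jaccard(query_keywords, c["keywords"]))
    return best["docs"]


def precision_recall_f(retrieved, relevant):
    """Standard precision, recall and F-measure over document identifiers."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# Hypothetical clusters: representative keywords plus member documents.
clusters = [
    {"keywords": {"fuzzy", "clustering", "mode"}, "docs": ["d1", "d2", "d3"]},
    {"keywords": {"web", "search", "ranking"}, "docs": ["d4", "d5"]},
]

retrieved = retrieve({"fuzzy", "clustering"}, clusters)
p, r, f = precision_recall_f(retrieved, relevant={"d1", "d2", "d6"})
```

With this toy data the query is compared against two keyword sets instead of five documents, which is the source of the reduced matching cost the abstract refers to.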
International Journal of Engineering Technology and Computer Research (IJETCR) by Articles is licensed under a Creative Commons Attribution 4.0 International License.