Volume 5 - Issue 3
Research on text clustering algorithm based on agglomerative hierarchical clustering
Abstract
Text clustering is a difficult and actively studied problem in Internet search engine research. This paper presents a new text clustering algorithm based on agglomerative hierarchical clustering. First, the texts are preprocessed to meet the requirements of subsequent processing. The paper then analyzes the common K-means clustering algorithm and improves it by modifying the selection of initial cluster centers, the K-means algorithm principle, and the K-means algorithm flow, in order to address the sensitivity of K-means to the initial cluster centers and to isolated-point texts. Experimental results indicate that the improved algorithm achieves higher accuracy and better stability than the original algorithm.
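The abstract does not give the algorithm's details, so the following is only a minimal sketch of the general idea of seeding K-means with centers obtained from agglomerative hierarchical clustering. It is not the paper's method. The function names (`agglomerative_centers`, `kmeans`), the centroid-distance merge rule, the Euclidean distance, and the toy two-dimensional document vectors are all assumptions made for illustration. Text preprocessing and the handling of isolated-point texts are not covered.

```python
import math

def dist(a, b):
    # Euclidean distance between two equal-length vectors (assumed metric).
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def centroid(points):
    # Component-wise mean of a non-empty list of vectors.
    n = len(points)
    return [sum(col) / n for col in zip(*points)]

def agglomerative_centers(points, k):
    # Start with one cluster per point and repeatedly merge the two
    # clusters whose centroids are closest, until k clusters remain.
    clusters = [[p] for p in points]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = dist(centroid(clusters[i]), centroid(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [centroid(c) for c in clusters]

def kmeans(points, centers, max_iter=100):
    # Standard K-means iterations starting from the supplied centers.
    for _ in range(max_iter):
        groups = [[] for _ in centers]
        for p in points:
            idx = min(range(len(centers)), key=lambda c: dist(p, centers[c]))
            groups[idx].append(p)
        # Keep the old center if a cluster ends up empty.
        new_centers = [centroid(g) if g else centers[i]
                       for i, g in enumerate(groups)]
        if new_centers == centers:
            break
        centers = new_centers
    labels = [min(range(len(centers)), key=lambda c: dist(p, centers[c]))
              for p in points]
    return centers, labels

# Hypothetical document vectors (e.g. term weights over two terms).
docs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
initial = agglomerative_centers(docs, k=2)
centers, labels = kmeans(docs, initial)
```

Because the initial centers are derived deterministically from the data rather than drawn at random, repeated runs on the same input give the same result, which is one plausible reading of the stability claim above. The pairwise merge loop is cubic in the number of points, so in practice it would likely be run on a sample of the texts.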
Paper Details
PaperID: 70350241344
Author's Name: Li, X.
Volume: Volume 5
Issues: Issue 3
Keywords: Agglomerative Hierarchical Clustering, Initial Cluster Centers, K-means Clustering, Text Clustering
Year: 2009
Month: June
Pages: 1081-1087