A text classification method based on the improved CR-tree and the weighted association rule
Classification based on association rules is a common and easily understood approach to text classification. The key to improving its accuracy is to generate more effective rules. Existing methods, however, sometimes overstate the influence of some training texts. To avoid this and generate more effective rules, this paper defines the Classification Path and proposes a new text classifier, the CP-tree, based on the CR-tree. When a new text arrives, association rules are generated by scanning the Classification Path, which avoids overstating the influence of individual training texts. In addition, to make the contribution of each association rule to class prediction more reasonable, the rules and their weights for the new text are derived not only from the training texts but also from the new text itself. Experimental results show that the CP-tree-based algorithm improves the accuracy of text classification.
Authors: Wang, Y., Zhao, Y., Yuan, F.
Volume: 5
Issue: 6
Keywords: Association Rules, Classification Path, CP-tree, Text Classification, Weighted Rules
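
To make the general idea in the abstract concrete, the following is a minimal Python sketch of classification with weighted association rules. It is not the paper's CP-tree or Classification Path algorithm, whose details are not given here: rules are mined by brute-force enumeration rather than from a tree, and the function names (`mine_rules`, `classify`), the thresholds, the sample data, and the weighting formula (rule confidence multiplied by the fraction of the new text's terms that the rule covers) are all assumptions chosen for illustration. The weighting is only meant to show one way a rule's weight can depend on the new text as well as on the training texts.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical toy training set: (set of terms, class label).
TRAIN = [
    ({"goal", "match", "team"}, "sports"),
    ({"goal", "team", "coach"}, "sports"),
    ({"stock", "market", "bank"}, "finance"),
    ({"stock", "bank", "rate"}, "finance"),
]

def mine_rules(docs, min_support=2, min_confidence=0.6, max_len=2):
    """Enumerate rules of the form itemset -> label.

    Returns a list of (frozenset_of_terms, label, confidence).
    Brute-force enumeration stands in for a tree-based miner.
    """
    itemset_count = Counter()  # how many texts contain the itemset
    rule_count = Counter()     # how many texts contain it AND have the label
    for terms, label in docs:
        for k in range(1, max_len + 1):
            for items in combinations(sorted(terms), k):
                itemset_count[items] += 1
                rule_count[(items, label)] += 1
    rules = []
    for (items, label), n in rule_count.items():
        confidence = n / itemset_count[items]
        if n >= min_support and confidence >= min_confidence:
            rules.append((frozenset(items), label, confidence))
    return rules

def classify(rules, terms):
    """Score each class by a weighted vote of the rules matching `terms`.

    Assumed weight: confidence * (rule length / number of terms in the
    new text), so the weight depends on the new text, not only on training.
    Returns None when no rule matches.
    """
    scores = defaultdict(float)
    for items, label, confidence in rules:
        if items <= terms:  # the rule's antecedent occurs in the new text
            scores[label] += confidence * len(items) / len(terms)
    return max(scores, key=scores.get) if scores else None

rules = mine_rules(TRAIN)
print(classify(rules, {"goal", "team", "rate"}))
print(classify(rules, {"stock", "bank"}))
```

In this sketch a rule that covers more of the new text counts for more, which is one simple way to keep a single frequent training term from dominating the prediction; the paper's actual weighting scheme may differ.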