Volume 4 - Issue 3
Research of frequent pattern mining from XML data based on heterogeneous XML schema
Abstract
This paper studies a frequent pattern mining algorithm for XML data based on heterogeneous XML schemas. The schemas are first clustered by similarity, and the corresponding XML data are then modeled as labeled ordered trees. The algorithm uses the rightmost path expansion method: it starts from pattern trees with only one node and generates new pattern trees by gradually adding nodes only to the rightmost path. The number of candidate patterns stays small because each iteration uses the frequent patterns discovered in the previous iteration. To improve mining efficiency, the paper applies a projected branch technique, which at the same time solves the problem of distinguishing isomorphism. Finally, the algorithm is tested on a group of XML data and the results are compared with other algorithms. The experimental results show that the algorithm is efficient and feasible.
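The rightmost path expansion step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the preorder `(depth, label)` encoding of an ordered tree and the function name `rightmost_extensions` are assumptions made for illustration, and schema clustering, support counting, and the projected branch technique are left out.

```python
from typing import List, Tuple

# Assumed encoding: an ordered tree is its preorder list of (depth, label) pairs.
Node = Tuple[int, str]
Tree = List[Node]


def rightmost_extensions(pattern: Tree, labels: List[str]) -> List[Tree]:
    """Return every pattern obtained by adding one node to the rightmost path.

    In a preorder (depth, label) encoding the last entry is the rightmost
    leaf, so the rightmost path holds exactly one node at each depth from 0
    up to the depth of that leaf. Appending (d + 1, label) attaches a new
    last child to the rightmost-path node at depth d.
    """
    if not pattern:
        # Start of the search: pattern trees with only one node.
        return [[(0, label)] for label in labels]

    rightmost_leaf_depth = pattern[-1][0]
    candidates: List[Tree] = []
    for parent_depth in range(rightmost_leaf_depth + 1):
        for label in labels:
            candidates.append(pattern + [(parent_depth + 1, label)])
    return candidates


if __name__ == "__main__":
    # Pattern a(b(c), d): the rightmost path is a -> d.
    pattern = [(0, "a"), (1, "b"), (2, "c"), (1, "d")]
    for candidate in rightmost_extensions(pattern, ["x"]):
        print(candidate)
```

In a full miner, only the candidates whose support in the data trees reaches the threshold would be kept and extended again in the next iteration, which is how the frequent patterns of one iteration limit the candidates of the next.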
Paper Details
PaperID: 48549107240
Authors: Yang, H., He, Z., Liang, W.
Volume: Volume 4
Issues: Issue 3
Keywords: Data mining, Frequent tree pattern, Ordered tree, XML
Year: 2008
Month: June
Pages: 787 - 794