Volume 8 - Issue 7
B-bit normalized compression distance
Abstract
We propose a novel approach, the b-bit normalized compression distance (b-bit NCD), for detecting similarity between two objects, based on the normalized compression distance and Kolmogorov complexity. By storing only b bits of each byte value of an object (e.g., b = 1 or 2), b-bit NCD gains advantages in computational efficiency and storage space. To demonstrate the generality and effectiveness of b-bit NCD, we analyze its feasibility, including the variance of the estimator, and give examples that compare whole mitochondrial genomes and famous speeches and that build clustering trees from b-bit NCD using standard compression programs such as zlib. Experiments show that the b-bit NCD method obtains similar results as b varies, at the expense of a small loss of accuracy.
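The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and rests on several assumptions. The abstract does not say which b bits of each byte are kept, so the sketch keeps the lowest b bits. It uses the standard NCD formula, NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), with C taken as the zlib-compressed length. The function names keep_low_bits, compressed_size, ncd and b_bit_ncd are hypothetical. The reduced bytes are not bit-packed, so the sketch shows the distance computation only, not the storage saving.

```python
import zlib


def keep_low_bits(data: bytes, b: int) -> bytes:
    """Keep only the lowest b bits of every byte (assumed reduction rule)."""
    if not 1 <= b <= 8:
        raise ValueError("b must be between 1 and 8")
    mask = (1 << b) - 1
    return bytes(byte & mask for byte in data)


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input, used as C(.)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Standard normalized compression distance with zlib as compressor."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def b_bit_ncd(x: bytes, y: bytes, b: int) -> float:
    """NCD computed on the b-bit reductions of both inputs."""
    return ncd(keep_low_bits(x, b), keep_low_bits(y, b))
```

With b = 8 the mask leaves every byte unchanged, so b_bit_ncd(x, y, 8) equals ncd(x, y); smaller values of b discard more information from each byte before compression.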
Paper Details
PaperID: 84861446126
Author's Name: Xiao, Z., Yuan, X.
Volume: Volume 8
Issues: Issue 7
Keywords: b-bit normalized compression distance, Dissimilarity distance, Kolmogorov complexity
Year: 2012
Month: April
Pages: 2701 - 2707