Co-Clustering Based Cross-Domain Text Classification Algorithm with Semantic Analysis for Wikipedia
Traditional document classification techniques require labeled data to build reliable and accurate classifiers. However, labeled data are rarely available and are generally expensive to acquire. For a learning task with no training data of its own, abundant labeled data may exist in a different but related domain. One would like to use this related labeled data as auxiliary information to accomplish the classification task in the target domain. Transfer learning has recently been proposed as a paradigm for learning effectively when the auxiliary data follow a different probability distribution. A co-clustering based classification technique was proposed for cross-domain text classification. This work extends the idea behind that technique by making the latent semantic relationship between the two domains explicit, using Wikipedia. As a result, the pathway along which labels propagate between the two domains captures not only common words but also semantic concepts that correspond to the content of the documents. Experiments on various real data demonstrate the effectiveness of the semantic-based technique for cross-domain classification.
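
To make the role of semantic concepts concrete, the following minimal sketch shows how adding concept features can widen the set of features shared by a source-domain and a target-domain document. It is an illustration only, not the co-clustering algorithm itself: the concept map `CONCEPTS`, the helper names `enrich` and `shared_features`, and the two toy documents are all hypothetical, and a real system would derive the word-to-concept mapping from Wikipedia.

```python
from collections import Counter

# Hypothetical toy mapping from words to Wikipedia-style concepts.
# These entries are invented for illustration; a real mapping would
# be derived from Wikipedia itself.
CONCEPTS = {
    "laptop": "Computer",
    "desktop": "Computer",
    "striker": "Football",
    "goalkeeper": "Football",
}


def enrich(tokens):
    """Return word counts plus counts of the concepts those words map to."""
    features = Counter(tokens)
    for tok in tokens:
        concept = CONCEPTS.get(tok)
        if concept is not None:
            features["concept:" + concept] += 1
    return features


def shared_features(a, b):
    """Features present in both documents: the bridge between the domains."""
    return set(a) & set(b)


# One document from the labeled (source) domain, one from the target domain.
source_doc = ["laptop", "review", "battery"]
target_doc = ["desktop", "benchmark", "battery"]

# Bridge built from common words only.
plain = shared_features(Counter(source_doc), Counter(target_doc))

# Bridge built from common words and shared concepts.
enriched = shared_features(enrich(source_doc), enrich(target_doc))

print(sorted(plain))
print(sorted(enriched))
```

With word features alone, the two documents overlap only on `battery`. After enrichment they also share `concept:Computer`, even though `laptop` and `desktop` never co-occur. This is the kind of additional connection a label-propagation step could exploit.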