Volume 3 - Issue 3
Efficient crawling strategy for topical web information
Abstract
An efficient topical crawling strategy is essential for topic-specific search engines. Most existing crawling strategies focus on either the precision of the collected pages or the crawling speed, but not both. In this paper, an efficient crawling strategy for topical web information is introduced, which uses the link structure of pages as well as their semantic similarity. The novelty of our method is that it effectively extracts pages with a high degree of relevance to a specific topic by incorporating word similarity and ontology, and furthermore achieves respectable coverage at a rapid rate. The evaluation shows promising results for our approach.
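As a rough illustration of how link structure and semantic similarity might be combined when ordering a crawl, the following is a minimal sketch and not the method of the paper. It assumes a best-first frontier, a plain bag-of-words cosine similarity standing in for the word-similarity and ontology measures, and a hypothetical weight alpha that mixes anchor-text similarity with the relevance of the linking page. The names PAGES, LINKS, TOPIC, cosine_similarity and crawl, as well as the toy data, are invented for this example.

```python
import heapq
import math
from collections import Counter

# Hypothetical in-memory "web": url -> page text, url -> [(target url, anchor text)].
PAGES = {
    "a": "web crawler search engine",
    "b": "topical crawler ontology search",
    "c": "cooking recipes pasta",
    "d": "ontology word similarity crawler",
}
LINKS = {
    "a": [("b", "topical crawler"), ("c", "pasta recipes")],
    "b": [("d", "ontology similarity")],
    "c": [("d", "more")],
    "d": [],
}
TOPIC = "topical crawler ontology"


def cosine_similarity(text_a, text_b):
    """Bag-of-words cosine similarity (a stand-in for an ontology-based measure)."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def crawl(seed, topic, pages, links, alpha=0.5, limit=10):
    """Best-first crawl; returns (url, relevance) pairs in visit order.

    Assumed priority of an unvisited link:
        alpha * sim(anchor text, topic) + (1 - alpha) * relevance(linking page)
    """
    frontier = [(-1.0, seed)]  # max-heap emulated with negated scores
    seen = set()
    visited = []
    while frontier and len(visited) < limit:
        _, url = heapq.heappop(frontier)
        if url in seen or url not in pages:
            continue  # already fetched, or not available in the toy data
        seen.add(url)
        relevance = cosine_similarity(pages[url], topic)
        visited.append((url, relevance))
        for target, anchor in links.get(url, []):
            if target not in seen:
                score = alpha * cosine_similarity(anchor, topic) + (1 - alpha) * relevance
                heapq.heappush(frontier, (-score, target))
    return visited


order = [url for url, _ in crawl("a", TOPIC, PAGES, LINKS)]
```

On this toy graph the off-topic page "c" is fetched last, after the on-topic pages reachable through "b". A smaller limit would stop the crawl before reaching it, which is the kind of precision and speed trade-off the abstract refers to.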
Paper Details
PaperID: 48549103181
Author's Name: Lin, K.
Volume: Volume 3
Issues: Issue 3
Keywords: Crawling strategy, Ontology, Word similarity matrix, WordNet
Year: 2008
Month: June
Pages: 843-850