A Chinese Website Analysis Approach Using Ontology Segmentation and Topic Model

Da-wei LIU, Xue-mei LI, Hai-yang WANG, Wei LIU

Abstract


A website analysis approach based on ontology segmentation and a topic model is proposed to represent website sections and retrieve similar websites. Definitions of concepts and a taxonomy are introduced that consider both structural and content characteristics. Features are extracted from websites using linkage information and section tags. A topic model is used to perform information retrieval and compute relevance scores between websites. The experimental results show that, compared with existing methods, the proposed approach gives more accurate results in website section analysis and in searching for similar websites.
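As a rough illustration of the retrieval step, the sketch below ranks websites by the similarity of their topic mixtures. It assumes each website has already been reduced to a topic distribution (for example by LDA); the abstract does not state which relevance measure the authors use, so cosine similarity is a stand-in, and the site names, topic values, and function names are hypothetical.

```python
import math


def cosine_similarity(p, q):
    """Cosine similarity between two equal-length topic vectors."""
    if len(p) != len(q):
        raise ValueError("topic vectors must have the same length")
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    if norm_p == 0 or norm_q == 0:
        return 0.0  # an all-zero vector carries no topic information
    return dot / (norm_p * norm_q)


def rank_similar(query_site, site_topics):
    """Rank every other site by similarity to query_site, most similar first."""
    query = site_topics[query_site]
    scores = [
        (name, cosine_similarity(query, topics))
        for name, topics in site_topics.items()
        if name != query_site
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical topic mixtures (made-up values, three topics per site).
site_topics = {
    "news_a": [0.7, 0.2, 0.1],
    "news_b": [0.6, 0.3, 0.1],
    "shop_c": [0.1, 0.1, 0.8],
}
ranking = rank_similar("news_a", site_topics)
```

Other measures over probability distributions, such as Hellinger distance or Jensen-Shannon divergence, could be swapped in for cosine similarity without changing the ranking logic.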

Keywords


Website classification, Topic model, LDA, Section analysis, Ontology construction


DOI
10.12783/dtcse/aics2016/8181
