Automatic Acquisition and Semantic Annotation of Web Tourism Information

Hui PENG, Wen-qi QU


Data collection and semantic annotation are often the basis of information processing tasks such as semantic relation analysis, big data mining, and semantic information search. This paper proposes a method that automatically collects data from tourism web sites and annotates the data with semantic tags. First, the crawler that automatically collects data from the web sites is introduced. Then the Chinese word segmentation tool ICTCLAS and the classic keyword extraction algorithm TF/IDF are introduced. With the help of the crawler, we collect tourism information about 247 sight spots in Beijing and 4198 sight spots in other areas of China from the web sites of elong and ctrip. Then, using ICTCLAS and TF/IDF, we extract keywords from this information and use them as semantic tags to annotate the sight spots.
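The keyword extraction step can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes the text of each sight spot has already been segmented into tokens (the paper uses ICTCLAS for this; here the token lists are written by hand), and it uses one common TF/IDF variant: term frequency normalized by document length, multiplied by the natural logarithm of the inverse document frequency. The function name `tfidf_keywords`, the parameter `top_k`, and the sample data are hypothetical.

```python
import math
from collections import Counter


def tfidf_keywords(docs, top_k=3):
    """Return the top_k tokens per document ranked by TF/IDF.

    docs: dict mapping a sight-spot name to its list of tokens
          (assumed to be pre-segmented; no segmentation is done here).
    """
    n_docs = len(docs)

    # Document frequency: number of documents containing each token.
    df = Counter()
    for tokens in docs.values():
        df.update(set(tokens))

    result = {}
    for name, tokens in docs.items():
        counts = Counter(tokens)
        total = len(tokens)
        # TF = count / document length; IDF = ln(N / df).
        scores = {
            tok: (cnt / total) * math.log(n_docs / df[tok])
            for tok, cnt in counts.items()
        }
        # Sort by descending score; break ties alphabetically.
        ranked = sorted(scores, key=lambda tok: (-scores[tok], tok))
        result[name] = ranked[:top_k]
    return result


# Hypothetical, hand-segmented descriptions (illustrative only).
docs = {
    "Forbidden City": ["palace", "museum", "emperor", "palace", "history"],
    "Summer Palace": ["garden", "lake", "palace", "emperor"],
    "Great Wall": ["wall", "mountain", "history", "wall"],
}

for spot, tags in tfidf_keywords(docs, top_k=2).items():
    print(spot, tags)
```

Tokens that appear in many descriptions receive a low IDF and are ranked down, so the tags that remain tend to be the ones that distinguish one sight spot from the others.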


Tourism information, Word segmentation, Keywords, Semantics, Semantic annotation

