Semantic Analysis of Tourism Vocabulary Based on Similar Words Calculation

Hui PENG, Hong-yan PAN


Tourism data mining is the process of extracting relations from large volumes of tourism data in order to discover the implicit knowledge and rules hidden in that data. Discovering the semantic relations between tourism words is an important part of tourism data mining. This paper introduces skip-gram, the classical model for similar-word calculation in natural language processing. Skip-gram does not take part of speech into account, so when similar words occur close together in a sentence, the model cannot identify them accurately. We therefore propose POS-skip-gram, a skip-gram model that incorporates Chinese part-of-speech information. Using this model and tourism data from the elong and ctrip websites, we build a semantic relation map of tourism words, which can serve as a basis for tourism data mining.


Tourism data mining, Similar words, Semantics, Semantic diagram


