Research Repository

An efficient approach to suggesting topically related web queries using hidden topic model

Li, Lin, Xu, Guandong, Yang, Zhenglu, Dolog, Peter, Zhang, Yanchun and Kitsuregawa, Masaru (2012) An efficient approach to suggesting topically related web queries using hidden topic model. World Wide Web. pp. 1-25. ISSN 1386-145X (print) 1573-1413 (online)

Full text for this resource is not available from the Research Repository.

Abstract

Keyword-based Web search is a widely used approach for locating information on the Web. However, Web users often have difficulty organizing and formulating appropriate input queries because they lack sufficient domain knowledge, which greatly affects search performance. An effective tool for meeting the information needs of a search engine user is to suggest Web queries that are topically related to their initial inquiry. Accurately computing query-to-query similarity scores is key to improving the quality of these suggestions. Because queries are short, traditional pseudo-relevance or implicit-relevance based approaches expand the expression of the queries for the similarity computation. They explicitly use a search engine as a complementary source and directly extract additional features (such as terms or URLs) from the top-listed or clicked search results. In this paper, we propose a novel approach that utilizes the hidden topic as an expandable feature. The approach has two steps. In the offline model-learning step, a hidden topic model is trained, and for each candidate query, its posterior distribution over the hidden topic space is determined to re-express the query in place of its lexical expression. In the online query suggestion step, after inferring the topic distribution for an input query in a similar way, we calculate the similarity between the candidate queries and the input query in terms of their corresponding topic distributions, and produce a suggestion list of candidate queries based on the similarity scores. Our experimental results on two real data sets show that the hidden topic based suggestion is much more efficient than the traditional term or URL based approach, and is effective in finding topically related queries for suggestion.
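
The online suggestion step described in the abstract can be illustrated with a minimal Python sketch. It assumes that the topic distributions have already been inferred offline by some hidden topic model; the abstract does not state which similarity measure is used, so cosine similarity is an assumption here, and the function names, example queries, and three-topic vectors are all hypothetical.

```python
import math


def cosine_similarity(p, q):
    """Cosine similarity between two topic distributions of equal length.

    Assumption: the abstract does not name the similarity measure;
    cosine similarity is used here purely for illustration.
    """
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    if norm_p == 0 or norm_q == 0:
        return 0.0
    return dot / (norm_p * norm_q)


def suggest(input_topics, candidate_topics, k=3):
    """Rank candidate queries by topic similarity to the input query.

    candidate_topics maps each candidate query string to its topic
    distribution, assumed to be precomputed in the offline step.
    """
    scored = [
        (query, cosine_similarity(input_topics, topics))
        for query, topics in candidate_topics.items()
    ]
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical topic distributions over three hidden topics.
candidates = {
    "cheap flights to paris": [0.80, 0.15, 0.05],
    "python list comprehension": [0.05, 0.05, 0.90],
    "hotel deals france": [0.50, 0.45, 0.05],
}
# Hypothetical inferred distribution for the user's input query.
input_query_topics = [0.75, 0.20, 0.05]

suggestions = suggest(input_query_topics, candidates, k=2)
for query, score in suggestions:
    print(f"{score:.3f}  {query}")
```

Because each query is reduced to a short fixed-length vector, the online step in this sketch needs only one vector comparison per candidate and no search engine calls, which is consistent with the efficiency argument made in the abstract.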

Item Type: Article
Uncontrolled Keywords: ResPubID24861, query suggestion, hidden topic model, latent Dirichlet allocation, Web search engine, vector representation, training dataset, keyword search, keyword-based queries, Web archive, URL model, human experts
Subjects: FOR Classification > 0807 Library and Information Studies
Faculty/School/Research Centre/Department > Centre for Applied Informatics
Faculty/School/Research Centre/Department > School of Engineering and Science
Depositing User: VUIR
Date Deposited: 15 Mar 2013 02:45
Last Modified: 21 Aug 2014 02:30
URI: http://vuir.vu.edu.au/id/eprint/10361
DOI: https://doi.org/10.1007/s11280-011-0151-3
Citations in Scopus: 20
