Continuos Search Queries

1 minute read

Answering bounded continuous search queries in the world wide web by Kukulenz and Ntoulas.

This article present the application of optimal stopping theory to the monitoring of web sites using user-defined keywords - also known as continuous querying.

Problem The goal is to return the estimated k best elements, considering the trade-off between freshness and relevance. The main challenges are: (i) the potentially relevant results within a time period are not known in advantage, and (ii) it is not known when the relevant results will appear. This problem can be treated as finding the top-k-results in financial mathematics, or the 'Secretary Selection Problem'.

Solution A well known strategy is to observe a number of candidates without choosing them, and selecting - after this observation period - the first result, leading a higher ranking/utility as the maximal ranking observed so far.

Selecting k candidates in the stream of documents ,therefore ,means considering k observation periods. The main problem is to find the optimal observation times $$t_1, ... t_k$$.

[kukulenz2007] Kukulenz, Dirk and Ntoulas, Alexandros (2007). ''Answering bounded continuous search queries in the world wide web'', WWW '07: Proceedings of the 16th international conference on World Wide Web, ISBN: 978-1-59593-654-7, ACM, pages 551--560