Continuous k nearest neighbor queries over spatial–textual data streams (abbreviated as CkQST) are the core operations of numerous location-based publish/subscribe systems. Such a system is usually subscribed with millions of CkQST and evaluated simultaneously whenever new objects arrive and old objects expire. To efficiently evaluate CkQST, we extend a quadtree with an ordered, inverted index as the spatial–textual index for subscribed queries to match the incoming objects, and exploit it with three key techniques. (1) A memory-based cost model is proposed to find the optimal quadtree nodes covering the spatial search range of CkQST, which minimize the cost for searching and updating the index. (2) An adaptive block-based ordered, inverted index is proposed to organize the keywords of CkQST, which adaptively arranges queries in spatial nodes and allows the objects containing common keywords to be processed in a batch with a shared scan, and hence a significant performance gain. (3) A cost-based k-skyband technique is proposed to judiciously determine an optimal search range for CkQST according to the workload of objects, to reduce the re-evaluation cost due to the expiration of objects. The experiments on real-world and synthetic datasets demonstrate that our proposed techniques can efficiently evaluate CkQST.
This is an open access article distributed under the Creative Commons Attribution License
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited