Scaling Geospatial Searches in Large Spatial Databases Dissertation

(2011). Scaling Geospatial Searches in Large Spatial Databases. 10.25148/etd.FI11120810

thesis or dissertation chair

authors

  • Cary, Ariel

abstract

  • Modern geographical databases store a rich set of aspatial attributes in addition to geographic data. Retrieving spatial records constrained on both spatial and aspatial attributes lets users perform richer spatial analyses through composite spatial searches; e.g., in a real estate database, "Find the nearest homes for sale to my current location that have a backyard and whose prices are between $50,000 and $80,000". Efficient processing of such composite searches requires indexing strategies that combine multiple types of data. Existing spatial query engines commonly apply a two-filter approach (a spatial filter followed by a non-spatial filter, or vice versa), which can incur large performance overheads. Meanwhile, the amount of geolocation data in databases is rapidly increasing, due in part to advances in geolocation technologies (e.g., GPS-enabled mobile devices) that allow location data to be associated with nearly every object or event. Hence, practical spatial databases may face data ingestion challenges at large data volumes. In this dissertation, we first show how indexing spatial data with R-trees (a typical data preprocessing task) can be scaled in MapReduce, a widely adopted parallel programming model developed by Google for data-intensive problems. Close to linear scalability was observed in index construction tasks over large spatial datasets. Subsequently, we develop novel techniques for indexing spatial data together with textual and numeric data in order to process k-nearest neighbor searches with aspatial Boolean selection constraints. In particular, numeric ranges are compactly encoded and explicitly indexed. Experimental evaluations with real spatial databases showed query response times within acceptable ranges for interactive search systems.
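
The two-filter approach mentioned in the abstract can be made concrete with a small sketch. This is an illustration of that baseline only, not of the combined index the dissertation develops. The `Home` record, the `two_filter_knn` function, the planar coordinates, and the sample data are all hypothetical and chosen for this example.

```python
import math
from dataclasses import dataclass


@dataclass
class Home:
    # Hypothetical record: planar coordinates plus two aspatial attributes.
    name: str
    x: float
    y: float
    price: float
    has_backyard: bool


def two_filter_knn(homes, qx, qy, k, min_price, max_price):
    """Baseline two-filter search: spatial ordering first, aspatial filter second."""
    # Filter 1 (spatial): order every record by Euclidean distance to the query point.
    by_distance = sorted(homes, key=lambda h: math.hypot(h.x - qx, h.y - qy))
    # Filter 2 (aspatial): keep records that satisfy the Boolean and price-range constraints.
    matches = [
        h for h in by_distance
        if h.has_backyard and min_price <= h.price <= max_price
    ]
    # Matches are still in distance order, so the first k are the k nearest qualifying homes.
    return matches[:k]


# Made-up sample data for illustration only.
HOMES = [
    Home("A", 1.0, 1.0, 95_000, True),
    Home("B", 2.0, 2.0, 60_000, True),
    Home("C", 0.5, 0.5, 70_000, False),
    Home("D", 5.0, 5.0, 75_000, True),
    Home("E", 3.0, 0.0, 52_000, True),
]

nearest = two_filter_knn(HOMES, 0.0, 0.0, k=2, min_price=50_000, max_price=80_000)
print([h.name for h in nearest])
```

In this sketch the spatial step ranks every record, including the two closest homes (C and A) that the aspatial step then discards. That wasted work is the kind of overhead the abstract attributes to two-filter processing.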

publication date

  • November 8, 2011

keywords

  • MapReduce
  • Spatial databases
  • indexing
  • scalability
  • searches

Digital Object Identifier (DOI)