Challenges and Resources for Evaluating Geographical IR

Martins, Bruno and Silva, Mário J. and Chaves, Marcirio Silveira (2005). "Challenges and resources for evaluating geographical IR", GIR '05: Proceedings of the 2005 workshop on Geographic information retrieval, ISBN: 1-59593-165-1, ACM, pages 65-69

The paper elaborates on the challenges that must be addressed to develop accurate methods for evaluating geographic tags, such as:

building geographic ontologies (the authors use this term for hierarchical gazetteers)
handling geographic references in text
assigning scopes (country, city, ...)
ranking documents according to geographic relevance (the article provides a number of suggestions and references for possible ranking criteria such as (i) overlap, (ii) Euclidean distance, (iii) topological distance (connectivity), (iv) semantic structures, and (v) hybrid methods)
building user interfaces for geo-IR
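
To make the ranking criteria in the list above more concrete, here is a minimal sketch of a hybrid score that mixes overlap with Euclidean distance. It is not the paper's method: the bounding-box representation, planar coordinates, the names `BBox`, `overlap_ratio`, `centroid_distance`, `hybrid_score` and `rank`, and the `alpha` and `scale` parameters are all assumptions made for illustration.

```python
import math
from typing import NamedTuple


class BBox(NamedTuple):
    """Axis-aligned bounding box in planar coordinates (an assumption)."""
    min_x: float
    min_y: float
    max_x: float
    max_y: float

    def area(self) -> float:
        # Negative extents (disjoint boxes) are clamped to zero.
        return max(0.0, self.max_x - self.min_x) * max(0.0, self.max_y - self.min_y)

    def centroid(self):
        return ((self.min_x + self.max_x) / 2, (self.min_y + self.max_y) / 2)


def overlap_ratio(query: BBox, doc: BBox) -> float:
    """Share of the query scope that the document scope covers (0..1)."""
    inter = BBox(max(query.min_x, doc.min_x), max(query.min_y, doc.min_y),
                 min(query.max_x, doc.max_x), min(query.max_y, doc.max_y))
    return inter.area() / query.area() if query.area() > 0 else 0.0


def centroid_distance(query: BBox, doc: BBox) -> float:
    """Euclidean distance between the two scope centroids."""
    (qx, qy), (dx, dy) = query.centroid(), doc.centroid()
    return math.hypot(qx - dx, qy - dy)


def hybrid_score(query: BBox, doc: BBox, alpha: float = 0.5, scale: float = 1.0) -> float:
    """Weighted mix of overlap and proximity; alpha and scale are illustrative."""
    proximity = 1.0 / (1.0 + centroid_distance(query, doc) / scale)
    return alpha * overlap_ratio(query, doc) + (1 - alpha) * proximity


def rank(query: BBox, docs: dict) -> list:
    """Return document names ordered by decreasing hybrid score."""
    return sorted(docs, key=lambda name: hybrid_score(query, docs[name]), reverse=True)
```

With a query scope of `BBox(0, 0, 2, 2)`, a document with the identical scope ranks above one that overlaps only partially, which in turn ranks above a disjoint, distant one. Topological and semantic criteria would need a gazetteer and are left out of this sketch.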

In the next section the authors review common IR measures such as precision, recall, accuracy (a = (tp + tn) / (tp + fp + fn + tn)), and error, and suggest pairwise testing and the Wilcoxon signed-rank test to rank geo-tags.
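
The measures above can be sketched in a few lines. This is an illustration rather than code from the paper: the class `Confusion` and the function `wilcoxon_statistic` are hypothetical names, error is taken to be 1 - accuracy (an assumption), and the Wilcoxon part only computes the test statistic W for paired scores, leaving the significance lookup to a table or a statistics library.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """True/false positive and negative counts for one geo-tagger."""
    tp: int
    fp: int
    fn: int
    tn: int

    def precision(self) -> float:
        return self.tp / (self.tp + self.fp)

    def recall(self) -> float:
        return self.tp / (self.tp + self.fn)

    def accuracy(self) -> float:
        return (self.tp + self.tn) / (self.tp + self.fp + self.fn + self.tn)

    def error(self) -> float:
        # Assumption: error is the complement of accuracy.
        return 1.0 - self.accuracy()


def wilcoxon_statistic(scores_a, scores_b) -> float:
    """Return W = min(W+, W-) for paired scores; zero differences are dropped."""
    diffs = [a - b for a, b in zip(scores_a, scores_b) if a != b]
    ordered = sorted(diffs, key=abs)
    w_plus = w_minus = 0.0
    i = 0
    while i < len(ordered):
        # Find the run of tied absolute differences starting at position i.
        j = i
        while j < len(ordered) and abs(ordered[j]) == abs(ordered[i]):
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of the 1-based ranks i+1 .. j
        for d in ordered[i:j]:
            if d > 0:
                w_plus += avg_rank
            else:
                w_minus += avg_rank
        i = j
    return min(w_plus, w_minus)
```

For example, `Confusion(tp=8, fp=2, fn=4, tn=6)` gives a precision of 0.8 and an accuracy of 0.7. A small W for two systems scored on the same topics suggests that one of them is consistently better.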

They suggest precision and recall at fixed rank cutoffs (e.g. the top five documents) or precision at fixed recall points (e.g. precision at 20% recall) for ranked retrieval.
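
Both variants are easy to compute from a ranked result list. The sketch below assumes the ranking is given as a list of booleans (relevant or not, best rank first) and that the total number of relevant documents is known; the function names `precision_at_k` and `precision_at_recall` are illustrative.

```python
def precision_at_k(ranked_relevance, k: int) -> float:
    """Fraction of the top-k results that are relevant (k must be positive)."""
    return sum(ranked_relevance[:k]) / k


def precision_at_recall(ranked_relevance, total_relevant: int, recall_level: float) -> float:
    """Precision at the first rank where recall reaches recall_level.

    Returns 0.0 if the ranking never reaches that recall level.
    """
    hits = 0
    for rank, relevant in enumerate(ranked_relevance, start=1):
        hits += bool(relevant)
        if hits / total_relevant >= recall_level:
            return hits / rank
    return 0.0


# Hypothetical ranking of ten documents, five of which are relevant.
ranking = [True, False, True, True, False, False, True, False, False, True]
p_at_5 = precision_at_k(ranking, 5)                  # 3 relevant in the top 5
p_at_20 = precision_at_recall(ranking, 5, 0.2)       # reached at rank 1
```

Here `p_at_5` is 0.6 and `p_at_20` is 1.0, because the first relevant document already sits at rank 1.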

The paper also cites experiments showing that laboratory experiments and IR metrics do not measure user satisfaction accurately. Previous work has shown that as few as five users detect approximately 80% of usability problems (with each user detecting about 31% on their own).
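
The two figures are consistent with the usual usability-testing model, in which each user independently finds a given problem with the same probability. The sketch below assumes that model; the function name `problems_found` is illustrative.

```python
def problems_found(per_user_rate: float, users: int) -> float:
    """Expected share of problems found by `users` independent testers.

    Assumes every tester detects any given problem with probability
    `per_user_rate`, independently of the other testers.
    """
    return 1.0 - (1.0 - per_user_rate) ** users


share = problems_found(0.31, 5)  # roughly 0.84
```

With a 31% detection rate per user, five users are expected to find about 84% of the problems, which matches the "approximately 80%" quoted above.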

Reasoning on Gazetteers and Geo-Ontologies

The article provides some ideas on how to reason with gazetteers: exploiting hierarchical and semantic relations, using the Euclidean distance, and extending spatial reasoning, e.g. through Voronoi polygons or through spatial indexes based on uniform grids.
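
As an illustration of the last idea, a uniform-grid index can answer "which places lie within a given radius" without scanning the whole gazetteer. The sketch below is one simple way to do it, not the design described in the paper: the class `GridIndex`, its methods, the planar coordinates and the place names are all assumptions.

```python
import math
from collections import defaultdict


class GridIndex:
    """Uniform-grid index over points in planar coordinates (an assumption)."""

    def __init__(self, cell_size: float):
        self.cell_size = cell_size
        self.cells = defaultdict(list)  # (col, row) -> [(name, x, y), ...]

    def _cell(self, x: float, y: float):
        return (math.floor(x / self.cell_size), math.floor(y / self.cell_size))

    def insert(self, name: str, x: float, y: float) -> None:
        self.cells[self._cell(x, y)].append((name, x, y))

    def within(self, x: float, y: float, radius: float) -> list:
        """Names of points within `radius` of (x, y), nearest first."""
        col_min, row_min = self._cell(x - radius, y - radius)
        col_max, row_max = self._cell(x + radius, y + radius)
        hits = []
        # Only cells touched by the query's bounding square are inspected.
        for col in range(col_min, col_max + 1):
            for row in range(row_min, row_max + 1):
                for name, px, py in self.cells.get((col, row), []):
                    dist = math.hypot(px - x, py - y)
                    if dist <= radius:
                        hits.append((dist, name))
        return [name for _, name in sorted(hits)]


index = GridIndex(cell_size=2.0)
index.insert("A", 1.0, 1.0)   # hypothetical gazetteer entries
index.insert("B", 2.5, 1.0)
index.insert("C", 9.0, 9.0)
nearby = index.within(1.0, 1.0, radius=2.0)
```

Here `nearby` contains "A" and "B" but not "C". The cell size trades memory against the number of candidates checked per query.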

Albert Weichselbraun
