Geographic and Spatial Data Mining (slides).
Michael May, Department Knowledge Discovery,
Fraunhofer Institute of Autonomous Intelligent Systems, Sankt Augustin, Germany.
Abstract. The widespread use of ubiquitous and mobile technologies such as sensor networks, GPS, mobile phones and RFID, together with the recent success of Google Earth, has led to a situation in which more and more data mining applications will have to deal with non-trivial problems of spatio-temporal data analysis. Applications range from telecommunications, retail and market research to scientific work in ecology and epidemiology.
Despite the importance of spatial information, standard data mining tools and methods cannot deal with it adequately. Consequently, important information is thrown away, leading to suboptimal results. Recent years have seen several lines of research that try to change this situation. Various classes of data mining algorithms (e.g. clustering, association rules, decision trees, subgroup discovery) have been upgraded to handle geographic objects such as points, lines and polygons, together with their spatial relationships. These approaches complement the classical methods pioneered in geostatistics (e.g. Kriging, Point Pattern Analysis) and are often rooted in some form of Multi-Relational Data Mining.
In this tutorial, we will first clarify the various data types relevant to geographic data mining and work out the specific characteristics and challenges of geographic data. Next, we discuss several examples of algorithms that take advantage of these data types. Finally, we present a wide range of applications to illustrate the potential, successes and shortcomings of current Spatial Data Mining approaches. We conclude by pointing out some future challenges and directions.