Hosted by Carleton University's Geomatics and Cartographic Research Centre (GCRC), this symposium focuses on high performance geoprocessing technologies and big data.
Our deep gratitude to Dr. Fraser Taylor and the GCRC for hosting this event and to LocationTech Tour sponsors for their generous financial support.
Technological change has created an inflection point for geodata. Mobile devices, social media, retail transactions, and more generate a tremendous amount of data. The volume, variety, and velocity of data are increasing. New technologies are being developed to handle the huge amounts of data. The problem is more complex than simply having a big relational database. This event features research and development in this area.
10:00am (30 minutes) - Welcome: Andrew Ross & Dr. Fraser Taylor
10:30am (90 minutes) - Presentations
12:00pm (60 minutes) - Catered Lunch
1:00pm (90 minutes) - Presentations
2:30pm (30 minutes) - Break
3:00pm (120 minutes) - Workshop
5:00pm - Adjourn
Fast, Distributed Geoprocessing with Scala and GeoTrellis
by Robert Cheetham, CEO, Azavea
What got you hooked on geospatial? For me it was more than just being able to see stuff on a map – it was the ability to transform geographic data in ways that enabled me to see something new, make a better decision or shed new light on some aspect of my environment.
Whether you use GDAL, ArcGIS ModelBuilder, GRASS or IDRISI, we have usually done this type of data transformation with a variety of desktop software tools. So why have these types of capabilities been relatively rare in web and mobile applications? Speed and scalability. It has generally required too much time to calculate a viewshed, combine a pile of raster data into a weighted overlay, compute a watershed or generate slope and aspect from elevation data.
Azavea has been working on this problem – fast, scalable geoprocessing for the web – for the past few years. In 2012 we released GeoTrellis (http://geotrellis.io/), an open source framework for high-performance (low latency), distributed geoprocessing. Built using the Scala programming language and based on the Akka and Spark frameworks, GeoTrellis is designed to create scalable, fast geoprocessing applications as well as to parallelize geoprocessing operations for large geospatial datasets in order to take full advantage of distributed, multi-core architectures.
This talk will give an overview of the GeoTrellis framework; how it leverages features of Scala, Akka, Spark and other frameworks; and how it can be integrated with conventional web mapping tools to create apps that are more than just dots on a map. The talk will also give an overview of applications for online geoprocessing in several domains including: stormwater modeling, education games, infrastructure prioritization, climate change and transportation.
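As a rough illustration of one operation named in the abstract, the weighted overlay, the following Python sketch combines small rasters cell by cell. It is not GeoTrellis code and does not use its API: the function name `weighted_overlay`, the list-of-rows raster layout, and the sample scores are all assumptions made for the example.

```python
def weighted_overlay(layers, weights):
    """Combine equally sized rasters (lists of rows) into one weighted-average raster.

    Hypothetical helper for illustration only; not part of GeoTrellis.
    """
    if len(layers) != len(weights):
        raise ValueError("need one weight per layer")
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    rows, cols = len(layers[0]), len(layers[0][0])
    for layer in layers:
        if len(layer) != rows or any(len(row) != cols for row in layer):
            raise ValueError("all layers must share the same dimensions")
    # Each output cell is the weight-normalized sum of the matching input cells.
    return [
        [sum(w * layer[r][c] for layer, w in zip(layers, weights)) / total
         for c in range(cols)]
        for r in range(rows)
    ]


# Made-up 2x2 suitability scores; slope counts three times as much as land use.
slope_score = [[1, 2], [3, 4]]
landuse_score = [[4, 3], [2, 1]]
suitability = weighted_overlay([slope_score, landuse_score], [3, 1])
```

Because every output cell depends only on the matching input cells, work of this kind can be split into tiles and processed in parallel, which is the property a distributed framework can exploit.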
Spatial Data processing with Hadoop
by Ahmed Eldawy, PhD Candidate at the University of Minnesota
This talk describes GeoJinni (formerly SpatialHadoop), an open source, full-fledged MapReduce framework with native support for spatial data [http://spatialhadoop.cs.umn.edu]. GeoJinni handles large scale spatial data by injecting spatial data awareness into each layer of Hadoop, namely, the language, storage, MapReduce, and operations layers. In the language layer, a new high level language, termed Pigeon, is proposed to work with standard spatial data types and operations. In the storage layer, GeoJinni supports standard spatial indexes (Grid File, R-tree and R+-tree), which are adapted to work in a distributed environment. The MapReduce layer contains new components to utilize the spatial indexes. The operations layer encapsulates many spatial operations such as range query, spatial join, computational geometry, and visualization operations. The extensibility and efficiency of GeoJinni allowed it to be used as a backbone in three real systems: SHAHED, a system for satellite data analysis and visualization; TAREEG, a MapReduce extractor for OpenStreetMap data; and MNTG, a web-based traffic generator.
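To make the idea of a grid index serving a range query concrete, here is a minimal single-process Python sketch. It is not GeoJinni or Pigeon code: the `GridIndex` class, its methods, and the sample points are hypothetical, and each grid cell merely stands in for what would be a separate partition in a distributed setting.

```python
from collections import defaultdict


class GridIndex:
    """Uniform grid over 2D points (hypothetical, for illustration only)."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.cells = defaultdict(list)  # (col, row) -> [(id, x, y), ...]

    def _cell(self, x, y):
        # Map a coordinate to the index of the grid cell that contains it.
        return (int(x // self.cell_size), int(y // self.cell_size))

    def insert(self, point_id, x, y):
        self.cells[self._cell(x, y)].append((point_id, x, y))

    def range_query(self, min_x, min_y, max_x, max_y):
        # Visit only the cells overlapping the query rectangle, then filter
        # the points in those cells against the exact bounds.
        cx0, cy0 = self._cell(min_x, min_y)
        cx1, cy1 = self._cell(max_x, max_y)
        hits = []
        for cx in range(cx0, cx1 + 1):
            for cy in range(cy0, cy1 + 1):
                for point_id, x, y in self.cells.get((cx, cy), ()):
                    if min_x <= x <= max_x and min_y <= y <= max_y:
                        hits.append(point_id)
        return sorted(hits)


index = GridIndex(cell_size=10.0)
for pid, x, y in [("a", 1, 1), ("b", 12, 3), ("c", 25, 25), ("d", 9.5, 9.5)]:
    index.insert(pid, x, y)
result = index.range_query(0, 0, 13, 10)
```

The point of the index is pruning: the query above never touches the cell holding point `c`, just as a spatially aware job can skip partitions that cannot contain matches.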
GeoMesa: Scalable Geospatial Analytics
by Chris Eichelberger, Systems Engineer at CCRi
GeoMesa is an open-source platform that manages large-scale geo-time records in the Accumulo key-value data store, and does so in a way that is open and accessible. In this talk, we briefly introduce GeoMesa's approach and capabilities, and then walk through a few examples of at-scale analytics that GeoMesa enables, including: densities (heat maps); interpolated space-time queries (tube selection); K-nearest neighbors; streaming analytics and near-real-time visualization.
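For readers unfamiliar with the density (heat map) analytic mentioned above, the core computation is a count of records per grid cell. The Python sketch below shows that counting step only. It does not use GeoMesa or Accumulo, and the function `density_grid`, the one-degree cell size, and the sample coordinates are assumptions chosen for illustration.

```python
from collections import Counter


def density_grid(points, cell_size):
    """Count (lon, lat) points per grid cell; keys are (column, row) indices.

    Hypothetical helper for illustration only; not part of GeoMesa.
    """
    counts = Counter()
    for lon, lat in points:
        # Floor division maps each coordinate to its cell, including negatives.
        counts[(int(lon // cell_size), int(lat // cell_size))] += 1
    return counts


# Made-up sample coordinates in decimal degrees.
points = [(-75.70, 45.38), (-75.69, 45.39), (-75.20, 45.10), (-79.38, 43.65)]
heat = density_grid(points, cell_size=1.0)
```

Per-cell counts computed on separate partitions can simply be added together, which is why this kind of analytic lends itself to at-scale execution.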
Big Spatial and Spatio-temporal Query Processing in Main Memory
by Suprio Ray, PhD Candidate at the University of Toronto
Spatial join is a crucial operation in many spatial analysis applications in scientific and geographical information systems. Due to the compute-intensive nature of spatial predicate evaluation, spatial join queries can be slow even with a moderately sized dataset. Another key problem with spatial join queries is processing skew. Efficient parallelization of spatial join is therefore essential to achieve acceptable performance for many spatial applications. We introduce SPINOJA, a skew-resistant parallel in-memory spatial join infrastructure. SPINOJA introduces a declustering technique which partitions the spatial dataset such that the amount of computation demanded by each partition is equalized and the processing skew is minimized.
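The goal of equalizing computation across partitions can be illustrated with a standard greedy scheduling heuristic (largest cost first, assigned to the least-loaded worker). This Python sketch is not SPINOJA's declustering technique, whose details are not given here. The function `balance_partitions`, the cell names, and the cost estimates are all hypothetical.

```python
import heapq


def balance_partitions(cell_costs, num_workers):
    """Greedily assign cells to workers so estimated loads stay even.

    Hypothetical helper for illustration only; not SPINOJA's algorithm.
    """
    # Heap entries are (current load, worker index); the lightest pops first.
    heap = [(0, w) for w in range(num_workers)]
    heapq.heapify(heap)
    assignment = {w: [] for w in range(num_workers)}
    # Place the most expensive cells first, each onto the lightest worker.
    for cell, cost in sorted(cell_costs.items(), key=lambda kv: (-kv[1], kv[0])):
        load, worker = heapq.heappop(heap)
        assignment[worker].append(cell)
        heapq.heappush(heap, (load + cost, worker))
    loads = {w: sum(cell_costs[c] for c in cells)
             for w, cells in assignment.items()}
    return assignment, loads


# Made-up per-cell join cost estimates; one dense cell dominates the workload.
cell_costs = {"downtown": 90, "suburb_a": 30, "suburb_b": 30,
              "rural_a": 20, "rural_b": 10}
assignment, loads = balance_partitions(cell_costs, num_workers=2)
```

Splitting these five cells evenly by count would leave one worker with far more computation than the other; balancing by estimated cost gives both workers the same load in this example.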
Whereas spatial join queries are characterized by complexity, when it comes to spatio-temporal query processing the issues are volume and velocity. Key characteristics of Location-Based Services (LBS) applications include a high rate of time-stamped location updates and many concurrent historical, present and predictive queries. Traditional databases are unable to cope with the growing demands of many LBS systems. We present several key ideas to support high performance commercial LBS by exploiting in-memory database techniques. We introduce a novel spatio-temporal index called PASTIS. With extensive evaluation, we demonstrate that our system supports high insert and query throughputs and that it outperforms the leading LBS system by a significant margin.
When & Where
LocationTech is a vendor-neutral community for individuals and organizations who wish to collaborate on commercially friendly, location-aware open source software.
LocationTech hosts technology projects and helps cultivate both an open source community and an ecosystem of complementary products and services.