👋 Welcome back to the latest edition of the Spatial Intelligence Newsletter! We’ve been busy brewing up some exciting things here at Wherobots, so we have plenty of new updates and content to share!
Don’t Let Messy GPS Slow You Down. The Fastest Way to Clean Up Messy GPS Data – And Save Money
Raw GPS data is messy. 😵‍💫 Noisy signals, lost connections, and inaccuracies make it hard to extract valuable insights. Imagine using your GPS to get to your location, only to find it telling you to drive over water instead of the road (personally, I’ve even had the map tell me to walk on water 🌊🚶🏻‍♀️).
Wherobots’ map matching corrects trajectories by aligning them with real-world road networks (❌ no more walking on water!), all while delivering unmatched accuracy and performance (and saving money!).
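To make the idea concrete, here is a minimal sketch of the simplest form of map matching: snapping each noisy point to the closest spot on the nearest road segment. This is not how Wherobots implements map matching (production matchers also weigh the order of points and the connectivity of the road network). The function names, the road segments, and the track are all made up for illustration, and coordinates are treated as planar.

```python
import math

def project_onto_segment(p, a, b):
    """Return the point on segment a-b that is closest to p (planar coordinates)."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return a  # degenerate segment: both endpoints coincide
    # Fraction along the segment of the perpendicular projection, clamped to [0, 1]
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return (ax + t * dx, ay + t * dy)

def snap_to_roads(points, segments):
    """Snap every point to the nearest location on any road segment."""
    snapped = []
    for p in points:
        candidates = [project_onto_segment(p, a, b) for a, b in segments]
        snapped.append(min(candidates, key=lambda q: math.dist(p, q)))
    return snapped

# Hypothetical L-shaped road and a noisy GPS track that drifts off it
roads = [((0.0, 0.0), (10.0, 0.0)), ((10.0, 0.0), (10.0, 10.0))]
noisy_track = [(1.0, 0.4), (5.0, -0.3), (9.8, 3.0)]
matched = snap_to_roads(noisy_track, roads)
```

Each point in `matched` lies on one of the two road segments, so the drift away from the road is gone. Nearest-segment snapping can still jump between parallel roads, which is the gap that full map matching closes.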
Apache Iceberg and Parquet now support GEO: A Huge Step Forward for Cloud Native Geo
Geospatial data has long been treated as a second-class citizen: the innovations that modernized today’s data ecosystem left it mostly behind. But that’s no longer the case. Thanks to the efforts of the Apache Iceberg and Parquet communities, both Iceberg and Parquet now support geometry and geography (collectively, GEO) data types! 🎉
What does this mean? With native geospatial data type support in Apache Iceberg and Parquet, you can seamlessly run query and processing engines like Wherobots, DuckDB, Apache Sedona, Apache Spark, Databricks, Snowflake, and BigQuery on your data, all while benefiting from the faster queries and lower storage costs of Parquet-formatted data. 💨
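For a feel of what a GEO value looks like at the byte level: to our understanding, the Parquet geometry type carries its values as Well-Known Binary (WKB). The sketch below encodes and decodes a single 2D point using only the Python standard library. The helper names are ours, for illustration only, and real engines handle this encoding for you.

```python
import struct

WKB_POINT = 1  # geometry type code for a 2D Point in WKB

def encode_point_wkb(x, y):
    """Encode a 2D point as little-endian WKB: 1 + 4 + 8 + 8 = 21 bytes."""
    # byte order flag (1 = little-endian), geometry type, then x and y as doubles
    return struct.pack("<BIdd", 1, WKB_POINT, x, y)

def decode_point_wkb(data):
    """Decode a 2D WKB point, honoring the byte-order flag in the first byte."""
    prefix = "<" if data[0] == 1 else ">"
    geom_type, x, y = struct.unpack(prefix + "Idd", data[1:21])
    if geom_type != WKB_POINT:
        raise ValueError("not a 2D WKB point")
    return (x, y)

wkb = encode_point_wkb(1.0, 2.0)
point = decode_point_wkb(wkb)
```

Because every engine reads the same bytes the same way, a table written by one engine can be queried by another without a conversion step.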
Exploring design and key features to enhance spatial data workloads with Iceberg GEO
With Apache Iceberg and Parquet now supporting GEO types, the economics of using geospatial data in end solutions improve. This advancement allows organizations to create higher-value, lower-cost products and achieve faster results over time.
Let’s take a closer look at these GEO data types in Iceberg, exploring their design, key features, and implementation considerations. Learn how leveraging these features with Apache Sedona and Wherobots can enhance cost performance and data governance, ensuring the best possible experience for spatial data workloads. 📈
Optimizing Earth Observation Models for Production with ML Model Extension
What are the challenges of applying AI to geospatial problems? 🤖 Join panel speakers from Wherobots, Radiant Earth, CRIM, and Terradue as they discuss how these challenges led to the development of an open, portable solution for describing computer vision models trained on overhead imagery.
Learn about the MLM STAC Extension, its use cases, and why model developers should adopt it, along with Raster Inference, a serverless computer vision solution that extracts valuable insights from aerial imagery. 🌎
Interested in getting started with Wherobots, but unsure of where to begin? Here are some helpful resources. 👇
Wherobots 101: Mastering Scalable Geospatial Data Processing
Want to take your geospatial analytics to the next level? Whether you’re just starting out or already working with spatial data, learn how to leverage valuable tools and workflows in Wherobots Cloud to analyze, visualize and interpret geospatial datasets. From setting up your account to mastering advanced analytics, this session is a helpful guide to set you up for success!
Wherobots 102: Reading and Processing Cloud Native Geospatial Data
Learn how to efficiently load, manage and analyze raster and vector data in Wherobots’ hosted environment. Whether you’re working with massive geospatial datasets or looking for optimized workflows to write and query GeoParquet and Cloud-Optimized GeoTIFFs (COGs), this video will equip you with the tools and techniques to scale your geospatial analysis.
Working with Foursquare Places Data
Which neighborhood in San Francisco has the most coffee shops? Dive into the Foursquare Open Places dataset, a free and open dataset providing 100M+ global places of interest, with our latest tutorial. ☕
You’ll be able to query using Spatial SQL, subset the data for a specific region, search for specific businesses or places, and aggregate locations by geography. By the end of this tutorial, you’ll have a choropleth map showing the number of coffee shops, sorted by neighborhood.
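The tutorial does this with Spatial SQL, but the aggregation itself is easy to picture. Here is a small pure-Python sketch of the same idea: test which polygon each point falls in, then count per polygon. The neighborhood shapes, names, and shop locations are invented for illustration (they are not Foursquare data), and the ray-casting test is a textbook technique, not the tutorial's code.

```python
from collections import Counter

def point_in_polygon(point, polygon):
    """Ray-casting test: True if point is inside polygon (a list of (x, y) vertices)."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point can cross the ray
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def count_by_area(points, areas):
    """Count points per named polygon; each point is assigned to at most one area."""
    counts = Counter({name: 0 for name in areas})
    for p in points:
        for name, polygon in areas.items():
            if point_in_polygon(p, polygon):
                counts[name] += 1
                break
    return counts

# Hypothetical neighborhoods (two adjacent squares) and coffee shop locations
neighborhoods = {
    "west_square": [(0, 0), (2, 0), (2, 2), (0, 2)],
    "east_square": [(2, 0), (4, 0), (4, 2), (2, 2)],
}
coffee_shops = [(0.5, 0.5), (1.5, 1.0), (1.0, 1.8), (3.0, 1.0), (5.0, 5.0)]
ranked = count_by_area(coffee_shops, neighborhoods).most_common()
```

`ranked` lists the neighborhoods from most to fewest shops, which is the table a choropleth map would be colored from. The shop at (5.0, 5.0) falls outside both squares and is not counted.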
Sedona Success Story: Optimizing ETL pipelines at scale with Comcast
🚀 Is scaling your ETL pipeline a priority? Discover how Comcast successfully achieved this by using Apache Sedona, all while boosting productivity and improving the quality of their network operations. 🌐
O’Reilly: Cloud Native Geospatial Analytics with Apache Sedona – Navigating Large-Scale Spatial Data
We know that handling large-scale spatial data can be daunting, which is why we’ve designed this guide to simplify geospatial data. This will help boost your spatial analytics expertise and transform the way you work with geospatial data! 💪
Our newest chapter, focusing on vector data analysis using spatial SQL, is now available. If you’ve already accessed the previous chapters, be sure to check your inbox (in a separate email) for the latest one! 📧
Engage with the Community Through Sedona Office Hours
We host monthly office hours to bring you the latest news and updates on Apache Sedona. Mark your calendars for the next one. Even if you can’t make it, we’ll send you the recording and slides to make sure you don’t miss anything that might be helpful to you. 🤝
Spatial Joins at Scale: Unlocking Advanced Geospatial Analytics
If you’ve ever struggled with Spatial Joins (you know who you are), then this is the one to join (pun intended, courtesy of Matt Forrest 😎)! Learn how to seamlessly integrate Python and Wherobots to perform advanced spatial joins and analyses on geospatial data.
Gain practical skills and best practices for processing and visualizing spatial data at scale. Don’t miss this opportunity to boost your spatial analytics expertise and transform how you work with geospatial data.
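Why are spatial joins hard at scale? Comparing every row against every other row grows quadratically, so engines partition space first. The sketch below shows one common way to do that, a grid-bucketed distance join in plain Python. It illustrates the general technique only and is not Wherobots' or Apache Sedona's implementation. The function names and sample points are hypothetical.

```python
import math
from collections import defaultdict

def build_grid(points, cell_size):
    """Bucket point indexes by the grid cell that contains each point."""
    grid = defaultdict(list)
    for idx, (x, y) in enumerate(points):
        grid[(math.floor(x / cell_size), math.floor(y / cell_size))].append(idx)
    return grid

def distance_join(left, right, radius):
    """Return sorted (left_idx, right_idx) pairs that are at most radius apart."""
    # With cell size equal to radius, every match sits in the 3x3 block of cells
    # around the left point, so only those cells need to be scanned.
    grid = build_grid(right, radius)
    pairs = []
    for i, (x, y) in enumerate(left):
        cx, cy = math.floor(x / radius), math.floor(y / radius)
        for gx in (cx - 1, cx, cx + 1):
            for gy in (cy - 1, cy, cy + 1):
                for j in grid.get((gx, gy), ()):
                    if math.dist((x, y), right[j]) <= radius:
                        pairs.append((i, j))
    return sorted(pairs)

# Hypothetical data: two stores and four customers, planar coordinates
stores = [(0.0, 0.0), (10.0, 10.0)]
customers = [(0.5, 0.5), (3.0, 0.0), (10.2, 9.9), (-0.4, 0.3)]
matches = distance_join(stores, customers, radius=1.0)
```

Each store is compared only with customers in neighboring cells rather than with the whole customer list. Distributed engines apply the same principle across machines, with more sophisticated partitioning.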
Fireside Chat with Overture Maps and Dotlas on Cloud-Native Geospatial: More Than Just Big Data
How is cloud-native geospatial reshaping the way organizations interact with spatial data? ☁️🌎 It prioritizes flexibility, changes how data consumers connect, removes friction, and unlocks new possibilities.
Join us, alongside Amy Rose from the Overture Maps Foundation and Eshwaran Venka from Dotlas, as we explore how modern approaches enable scalability across various compute infrastructures, eliminate the need to move massive datasets, and allow users to work with data wherever they are—whether locally or in the cloud. Hear about where geospatial technology is headed. This is a conversation you definitely don’t want to miss!
🆓 Getting started with Wherobots is easy. If you haven’t already, create a free account and dive in. If you’re looking to take your geospatial analytics to the next level—whether it’s full access to open datasets, map matching, or raster inference—try the Pro tier for free.