Welcome to This Month In Wherobots, the monthly developer newsletter for the Wherobots & Apache Sedona community! In this edition we look at the latest Wherobots Cloud release, how the Overture Maps Foundation uses Apache Sedona to generate its data releases, processing a billion aircraft observations, building spatial data lakehouses with Iceberg Havasu, the new Apache Sedona 1.6.0 release, and more!
Wherobots announced significant new features in Wherobots Cloud: machine learning inference on satellite imagery via SQL, new Python and Java database drivers for interacting with WherobotsDB from your own analytics applications or data orchestration tooling, and a scalable vector tiles generator. These enhancements are available now in Wherobots Cloud.
Read The Blog Post or Register For The Webinar
The Overture Maps Foundation publishes an open, comprehensive global map dataset with layers for transportation, places, 3D buildings, and administrative boundaries. The data comes from multiple sources and is published in the cloud-native GeoParquet format, publicly available for download from cloud object storage. To wrangle a dataset of this planetary scale, the Overture team uses Apache Sedona to prepare, process, and generate partitioned GeoParquet files. This blog post dives into the benefits of GeoParquet, how Overture uses Sedona to generate GeoParquet (including a dual Geohash partitioning and sorting method), and how to query and analyze the Overture Maps dataset using Wherobots Cloud.
Read the article: Making Overture Maps Data More Efficient With GeoParquet And Apache Sedona
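To give a feel for the Geohash partitioning and sorting idea mentioned above, here is a minimal pure-Python sketch. It is not Overture's pipeline and does not use Sedona: the helper names (`geohash_encode`, `partition_and_sort`) and the precision values are assumptions chosen for illustration. Rows are grouped into partitions by a short Geohash prefix and then sorted within each partition by a longer Geohash, so that features that are close on the map tend to end up close together in the output.

```python
from collections import defaultdict

# Standard Geohash base32 alphabet (digits plus letters, without a, i, l, o).
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat, lon, precision=6):
    """Encode a latitude/longitude pair as a Geohash string."""
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    chars = []
    bits, bit_count = 0, 0
    use_lon = True  # Geohash interleaves bits, starting with longitude
    while len(chars) < precision:
        rng, value = (lon_range, lon) if use_lon else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if value >= mid:
            bits = (bits << 1) | 1
            rng[0] = mid  # keep the upper half of the interval
        else:
            bits <<= 1
            rng[1] = mid  # keep the lower half of the interval
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base32 character
            chars.append(BASE32[bits])
            bits, bit_count = 0, 0
    return "".join(chars)


def partition_and_sort(points, partition_precision=2, sort_precision=6):
    """Group (name, lat, lon) rows by a coarse Geohash prefix, then sort
    each group by a finer Geohash. Both precisions are illustrative."""
    partitions = defaultdict(list)
    for name, lat, lon in points:
        gh = geohash_encode(lat, lon, sort_precision)
        partitions[gh[:partition_precision]].append((gh, name))
    return {key: sorted(rows) for key, rows in sorted(partitions.items())}


# Hypothetical sample rows: two nearby points in Denmark and one in Sydney.
sample = [
    ("a", 57.64911, 10.40744),
    ("b", 57.65, 10.41),
    ("c", -33.86, 151.21),
]
layout = partition_and_sort(sample)
```

In a real pipeline each partition would be written out as its own GeoParquet file, and the sort order inside each file helps a reader skip row groups that fall outside the area it is querying.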
Our featured Apache Sedona and Wherobots Community Member this month is Feng Jiang, a Senior Software Engineer at Microsoft, where he works with map and geospatial data at scale. Through his involvement with the Overture Maps Foundation he also helps maintain and publish the public Overture Maps dataset. In the blog post “Making Overture Maps Data More Efficient With GeoParquet And Apache Sedona” he shared insights gained from using Apache Sedona at Overture in the pipeline that generates GeoParquet files from planetary-scale map data. Thanks for your contributions and for being a part of the Apache Sedona community!
An important factor to consider when analyzing aircraft data is the potential impact of weather, especially severe weather events, on flights. This tutorial combines public ADS-B aircraft trace data with weather data to identify which flights have the highest potential to be impacted by severe weather events. We also see how to bring in real-time Doppler radar raster data and explore the performance of spatial operations such as point-in-polygon searches and spatial joins on a billion-row dataset.
Read The Tutorial: Processing A Billion Aircraft Observations With Apache Sedona In Wherobots Cloud
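For readers new to the point-in-polygon searches and spatial joins mentioned above, the following is a small brute-force sketch in plain Python. It is not how the tutorial or Apache Sedona implements these operations (a distributed engine relies on spatial partitioning and indexes rather than nested loops); the function names and sample data are hypothetical and only illustrate what the operations compute.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: count how many polygon edges a horizontal ray
    starting at (x, y) crosses. An odd count means the point is inside.
    `polygon` is a list of (x, y) vertices; points exactly on an edge
    are not handled specially in this sketch."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the ray's y coordinate can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def naive_spatial_join(observations, zones):
    """Pair every (obs_id, x, y) observation with each zone polygon that
    contains it. Cost is O(observations * zones), fine for a demo only."""
    matches = []
    for obs_id, x, y in observations:
        for zone_id, polygon in zones.items():
            if point_in_polygon(x, y, polygon):
                matches.append((obs_id, zone_id))
    return matches


# Hypothetical data: one square "storm" zone and two aircraft positions.
zones = {"storm": [(0, 0), (10, 0), (10, 10), (0, 10)]}
observations = [("flight_1", 5, 5), ("flight_2", 15, 5)]
pairs = naive_spatial_join(observations, zones)
```

With a billion observations the nested loop above is impractical, which is exactly why the tutorial looks at how these operations perform at scale.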
Choosing the right tool for the job is an important aspect of data science, and equally important is understanding how the tools fit together and can be used alongside each other. This hands-on workshop shows how to leverage the scale of Apache Sedona with Wherobots Cloud for geospatial data processing, alongside common Python tooling like GeoPandas, and how to add graph analytics using Neo4j to our analysis toolkit. Using a dataset of species observations, we build a species interaction graph to find which species have overlapping habitat, a common workflow for conservation use cases.
Watch The Workshop Recording: Large Scale Geospatial Analytics With Graphs And The PyData Ecosystem
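As a rough illustration of the species interaction graph described above, here is a plain-Python sketch that treats two species as sharing habitat when they are observed in the same grid cell. The grid-cell approach, the function name `habitat_overlap_edges`, and the sample observations are all assumptions made for this example; the workshop itself works with Apache Sedona, GeoPandas, and Neo4j and may define overlap differently.

```python
import math
from collections import defaultdict
from itertools import combinations


def habitat_overlap_edges(observations, cell_size_deg=1.0):
    """Build weighted edges between species seen in the same grid cell.

    `observations` is an iterable of (species, lat, lon) tuples. The
    returned dict maps an alphabetically ordered (species_a, species_b)
    pair to the number of grid cells the two species share."""
    species_by_cell = defaultdict(set)
    for species, lat, lon in observations:
        # Snap each observation to a square cell of cell_size_deg degrees.
        cell = (math.floor(lat / cell_size_deg), math.floor(lon / cell_size_deg))
        species_by_cell[cell].add(species)

    edge_weights = defaultdict(int)
    for species_set in species_by_cell.values():
        for a, b in combinations(sorted(species_set), 2):
            edge_weights[(a, b)] += 1
    return dict(edge_weights)


# Hypothetical observations: three species in one cell, one far away.
observations = [
    ("elk", 44.5, -110.5),
    ("wolf", 44.6, -110.4),
    ("bison", 44.7, -110.2),
    ("puffin", 63.4, -20.3),
]
edges = habitat_overlap_edges(observations)
```

Each key in `edges` could then be loaded into a graph database as a relationship between two species nodes, with the shared-cell count as a relationship property.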
Version 1.6.0 of Apache Sedona is now available! This version includes support for Shapely 2.0 and GeoPandas 0.11.1+, enhanced support for geography data, new vector and raster functions, and tighter integration with Python raster data workflows through support for Rasterio and NumPy User Defined Functions (UDFs). You can learn more about this release in the release notes.
Read The Apache Sedona 1.6 Release Notes
This talk from Subsurface 2024 introduces the Havasu spatial table format, an extension of Apache Iceberg used to build spatial data lakehouses. We learn about the motivation for adding spatial functionality to Iceberg, how Havasu Iceberg enables efficient spatial queries for both vector and raster data, and how to use a familiar SQL table interface when building large-scale geospatial analytics applications.
Watch The Recording: Building Spatial Data Lakehouses With Iceberg Havasu