Planetary-scale answers, unlocked.
A Hands-On Guide for Working with Large-Scale Spatial Data. Learn more.
Welcome to This Month In Wherobots the monthly developer newsletter for the Wherobots & Apache Sedona community! In this edition we have a look at the latest Wherobots Cloud release, how the Overture Maps Foundation uses Apache Sedona to generate their data releases, processing a billion aircraft observations, building spatial data lakehouses with Iceberg Havasu, the new Apache Sedona 1.6.0 release, and more!
Wherobots announced significant new features in Wherobots Cloud to enable machine learning inference on satellite imagery via SQL, new Python and Java database drivers for interacting with WherobotsDB in your own analytics applications or data orchestration tooling, and a scalable vector tiles generator. These new enhancements are available now in Wherobots Cloud.
Read The Blog Post or Register For The Webinar
The Overture Maps Foundation publishes an open comprehensive global map dataset with layers for transportation, places, 3D buildings, and administrative boundaries. This data comes from multiple sources and is published in cloud-native GeoParquet format made publicly available for download in cloud object storage. In order to wrangle such a large planetary-scale dataset the Overture team uses Apache Sedona to prepare, process, and generate partitioned GeoParquet files. This blog post dives into the benefits of GeoParquet, how Overture uses Sedona to generate GeoParquet (including a dual Geohash partitioning and sorting method), and how to query and analyze the Overture Maps dataset using Wherobots Cloud.
Read the article: Making Overture Maps Data More Efficient With GeoParquet And Apache Sedona
Our featured Apache Sedona and Wherobots Community Member this month is Feng Jiang, a Senior Software Engineer at Microsoft where he works with map and geospatial data at scale. Through his involvement with the Overture Maps Foundation he also helps maintain and publish the public Overture Maps dataset. In the blog post “Making Overture Maps Data More Efficient With GeoParquet And Apache Sedona” he shared some insights gained from working with Apache Sedona at Overture in the pipeline used to create and generate GeoParquet data of planetary-scale map data. Thanks for your contributions and being a part of the Apache Sedona community!
An important factor to consider when analyzing aircraft data is the potential impact of weather and especially severe weather events on aircraft flights. This tutorial uses public ADS-B aircraft trace data combined with weather data to identify which flights have the highest potential to be impacted by severe weather events. We also see how to combine real-time Doppler radar raster data as well as explore the performance of working with a billion row dataset for spatial operations like point-in-polygon searches and spatial joins.
Read The Tutorial: Processing A Billion Aircraft Observations With Apache Sedona In Wherobots Cloud
Choosing the right tool for the job is an important aspect of data science, and equally important is understanding how the tools fit together and can be used alongside each other. This hands-on workshop shows how to leverage the scale of Apache Sedona with Wherobots Cloud for geospatial data processing, alongside common Python tooling like Geopandas, and how to add graph analytics using Neo4j to our analysis toolkit. Using a dataset of species observations we build a species interaction graph to find which species share habitat overlap, a common workflow for conservation use cases.
Watch The Workshop Recording: Large Scale Geospatial Analytics With Graphs And The PyData Ecosystem
Version 1.6.0 of Apache Sedona is now available! This version includes support for Shapely 2.0 and GeoPandas 0.11.1+, enhanced support for geography data, new vector and raster functions, and tighter integration Python raster data workflows with support for Rasterio and NumPy User Defined Functions (UDFs). You can learn more about this release in the release notes.
Read The Apache Sedona 1.6 Release Notes
This talk from Subsurface 2024 introduces the Havasu spatial table format, an extension of Apache Iceberg used to build spatial data lakehouses. We learn about the motivation for adding spatial functionality to Iceberg, how Havasu Iceberg enables efficient spatial queries for both vector and raster data, and how to use familiar SQL table interface when building large-scale geospatial analytics applications.
Watch The Recording: Building Spatial Data Lakehouses With Iceberg Havasu
Want to receive this monthly update in your inbox? Sign up for the The Spatial Intelligence Newsletter:
How We Delivered “Fields of The World” with RasterFlow: A Planetary-Scale GeoAI Pipeline
See how we used RasterFlow to run a 100TB+ global GeoAI pipeline, from feature mosaics to predictions and vectors, with reproducible workflows.
Change Detection Using AlphaEarth Foundations (Part 2)
Continue exploring how Alpha Earth Embeddings reveal change over time using scores.
AlphaEarth Embeddings, Zonal Statistics, and PCA
Aggregate AlphaEarth embeddings over Iowa fields and visualize them with PCA.
Introducing the Wherobots Python SDK
What is the Wherobots Python SDK? The Wherobots Python SDK is a typed Python client for submitting, monitoring, and managing Wherobots job runs. It ships on PyPI as wherobots-python-sdk. One install, one API key, and you’re running spatial jobs from any Python environment: CI/CD pipelines, notebooks, a local shell. The SDK is built for three […]
share this article
Awesome that you’d like to share our articles. Where would you like to share it to: