In 2026 we’re bridging the gap between AI and data from the physical world. Entering 2025, we knew we needed to prove Wherobots is fundamentally the best place to create and run spatial data workloads at scale. Last year we directed the vast majority of our energy at strengthening the core fundamentals: ease of use, cost, performance, and reliability. We knew this focus would resonate with customers, and that its value would be amplified by what we build later on.
We knew we needed to bring spatial data into the modern data architecture, which, from our vantage point, is the data lakehouse. If we did nothing, much of this data would remain siloed, “special”, and out of reach of the modern analytics engines that could put it to work. This is why we led contributions of GEO type support to Iceberg and Parquet.
Off-platform through the open source Apache Sedona project, we saw an opportunity to develop a lightweight query engine that would appeal to developers because it would provide the support they need out of the box and accelerate iterations with spatial data. And so SedonaDB was born.
This post is a high-level summary of these and other accomplishments from our team in 2025. We are now actively building on top of this improved foundation to enable AI and data practitioners across industries and use cases to operate with a heightened understanding of the physical world, for any area of interest.
This is what matters most. Everything else written here is just supporting evidence that shows how we made our customers more successful with spatial data this year. And what better proof than their own words?
Founder & CEO at Aarden.ai
Principal Research Scientist, Microsoft AI for Good
CTO, SatSure
Founder and CEO, ParGo
It’s no surprise that many of these customers are AWS customers. In late 2024, we launched Wherobots Cloud as a product AWS customers could subscribe to directly through the AWS Marketplace. This activated a key value distribution channel between Wherobots and AWS customers. It also started our partnership in earnest with AWS. We continue to work closely with the AWS team to bring world-class spatial capabilities into the hands of their customers so they can better realize their objectives with physical world data.
We led the introduction of GEO type support to Iceberg and Parquet, which in turn led to the incorporation of GEO types in the Databricks Delta Lake project. Because these projects form foundational components of the modern data architecture, GEO type support means a significant portion of spatial data can now be interpreted and safely operated on by common compute engines. It also means this data no longer has to live in siloed architectures. It can thrive in the common data estate, the data lake, processed using engines like Spark, Snowflake, BigQuery, or Wherobots. If you squint, there is now a clear path for making spatial data look just like “data” in the eyes of developers and AI systems, particularly when capable engines like Wherobots can crunch it without a problem.
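To make "interpreted by common compute engines" a little more concrete, here is a minimal sketch of the kind of value a geometry column carries. It assumes the Well-Known Binary (WKB) encoding that the Parquet geospatial types use for geometry values, and it handles only a 2D point. The function names `encode_wkb_point` and `decode_wkb_point` are hypothetical helpers written for this illustration; they are not part of Wherobots, Iceberg, or Parquet.

```python
import struct
from typing import Tuple


def encode_wkb_point(x: float, y: float) -> bytes:
    """Encode a 2D point as little-endian WKB (illustrative helper)."""
    # Layout: 1-byte byte-order flag (1 = little endian),
    # 4-byte unsigned geometry type (1 = Point), then x and y as doubles.
    return struct.pack("<BIdd", 1, 1, x, y)


def decode_wkb_point(buf: bytes) -> Tuple[float, float]:
    """Decode a 2D WKB point produced by any engine (illustrative helper)."""
    # The first byte tells the reader which byte order the rest uses.
    endian = "<" if buf[0] == 1 else ">"
    geom_type, x, y = struct.unpack(endian + "Idd", buf[1:21])
    if geom_type != 1:
        raise ValueError(f"expected WKB Point (type 1), got type {geom_type}")
    return x, y


if __name__ == "__main__":
    payload = encode_wkb_point(-122.33, 47.61)
    print(payload.hex())
    print(decode_wkb_point(payload))
```

Because the bytes follow a shared, documented layout and the column is typed as a geometry, any engine that understands the type can decode the same value without a side agreement about what the binary blob contains.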
Wherobots Cloud evolved into a full-fledged spatial intelligence platform designed not just for querying geospatial data very efficiently at scale, but for building production-grade workflows that integrate high-value derivatives of physical world data into customers’ existing data architectures. Our native integration with Amazon S3 makes it possible for customers to run Wherobots on spatial data in their own storage. In 2025 we announced our integration with Unity Catalog, enabling Databricks customers to activate the value of Wherobots on spatial data in their Databricks lakehouse. We will continue to add integrations so that our customers can simply add Wherobots’ magic to the data infrastructure they already have.
The Apache Sedona community developed and launched SedonaDB, the first open-source, single-node analytical database engine that treats spatial data as a first-class citizen. They also made it significantly easier to compare query performance across engines using SpatialBench, the first benchmarking framework for spatial queries. SedonaDB makes spatial data significantly more useful and analytically accessible for a wider range of use cases and personas. SpatialBench streamlines the decision-making process for users choosing an engine based on spatial query price-performance and capability. Here’s a link to the announcement for both.
We’re continuously improving the WherobotsDB engine, raising the bar we set for ourselves on spatial query price-performance and capability. In 2025 we announced multiple new functions, tools, and compatibility with GeoPandas. We also announced a preview of a new runtime version, 2.x, which contains the latest optimizations for spatial range queries, spatial filtering, and spatial joins. Rust is at its core, and it leverages vectorized execution. Compared to the first major version, 2.x accelerates spatial queries by up to 3.3x.
While Wherobots is generally known for its spatial data capability, 2.x is also significantly more performant for general-purpose query operations, to the degree that it is competitive on TPC-H with alternative managed Spark engines in the market. We will share benchmarking results when we announce general availability of version 2 soon.
We launched RasterFlow in private preview, the first serverless workflow purpose-built to prepare and perform inference on large-scale Earth observation (EO) datasets. RasterFlow is an Earth Intelligence solution that addresses the infrastructure challenges and high costs that prevented companies from utilizing raw EO data in the first place. We’ve packaged years of GeoAI expertise into a serverless, easy-to-use product. RasterFlow digests large, unprepared imagery datasets into inference-ready mosaics, then runs model inference on these mosaics with custom or open PyTorch models to perform tasks such as change detection, classification, and segmentation. Results are delivered as geometries in Iceberg tables in a customer’s S3 bucket or as tables in Databricks Unity Catalog, to be processed by WherobotsDB or alternative engines.
Physical world data is noisy, large, and generally semi-structured or unstructured. It also needs more context to be useful, which requires spatial joins to other datasets. It can be publicly available, or reside as private assets within an organization’s data estate.
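For readers new to the term, a spatial join matches rows by geometric relationship rather than by a shared key. The sketch below is a deliberately naive, pure-Python illustration that joins points to the polygons containing them using a ray-casting test. It is not how WherobotsDB or any production engine implements joins (those rely on spatial indexing and partitioning to avoid comparing every pair), and the names `point_in_polygon` and `spatial_join` are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float]
Polygon = Sequence[Point]  # ring of vertices; closing edge is implied


def point_in_polygon(pt: Point, polygon: Polygon) -> bool:
    """Ray-casting test: count edge crossings of a ray going in +x from pt."""
    x, y = pt
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through pt can cross.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def spatial_join(
    points: Dict[str, Point], zones: Dict[str, Polygon]
) -> List[Tuple[str, str]]:
    """Return (point_id, zone_id) pairs where the zone contains the point."""
    matches = []
    # Nested loop: every point is checked against every zone (O(n * m)).
    for point_id, pt in points.items():
        for zone_id, polygon in zones.items():
            if point_in_polygon(pt, polygon):
                matches.append((point_id, zone_id))
    return matches


if __name__ == "__main__":
    zones = {
        "zone_a": [(0, 0), (4, 0), (4, 4), (0, 4)],
        "zone_b": [(5, 0), (9, 0), (9, 4), (5, 4)],
    }
    sensors = {"s1": (1, 1), "s2": (6, 2), "s3": (10, 10)}
    print(spatial_join(sensors, zones))
```

The nested loop is the part that stops working at scale: millions of points against millions of polygons is far too many pairwise checks, which is why a scalable engine matters for this kind of contextualization.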
Teams and AI systems alike need this data to be processed and contextualized with other data before it becomes useful. At scale, Wherobots provides arguably the best tools for spatial processing and contextualization at the most fundamental level, for the modern data architecture. Now Wherobots is ready to be wired up to AI systems.
Late in 2025 we announced the availability of the Wherobots MCP server to give your LLM access to Wherobots’ tools. Now, LLMs can use the MCP server to efficiently design queries by understanding the spatial and non-spatial data in your data estate (via the S3 and Unity Catalog integrations), and run those queries on a high-performance engine to answer questions about the physical world.
Soon we’ll integrate the MCP server with RasterFlow. That way, an AI agent can design and trigger a workflow in Wherobots that starts with fresh EO data, prepares it for inference, generates predictions using a collection of PyTorch machine learning models, and performs additional enrichment or transforms if needed to produce the result. Join the upcoming office hour on the MCP server to learn more.
You can reach out to the product team at product@wherobots.com, or me directly at damian@wherobots.com to share the challenges you’re facing and see how we can solve them now or with capabilities we’ll add. Here are a few discrete roadmap items we are working on, and categories of investment planned in 2026.