In the world of data architecture, there is a dangerous myth that you have to choose “one tool to rule them all.” We often see organizations paralyzed by the debate: “Should we use a Database or a Data Lake?”
A spatial data pipeline architecture built for both large-scale analytics and operational queries is one of the harder infrastructure decisions a geospatial team makes. Most organizations try to solve it with a single tool. That is where the problems start.
The teams that get this right do not choose between a data lake and a spatial database. They use both, in a defined sequence known as the Geospatial Medallion Architecture. The medallion architecture, a data pipeline pattern established in the data lakehouse ecosystem, organizes data into three progressive quality layers: Bronze, Silver, and Gold. Wherobots and PostGIS each own a distinct role in that pipeline.
Wherobots is a cloud-native spatial analytics platform, built by the original creators of Apache Sedona, that processes large-scale geospatial datasets using distributed compute. PostGIS is an open-source spatial extension for PostgreSQL that adds support for geographic objects and enables location-based queries in SQL.
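The spatial medallion architecture organizes geospatial data into three layers: Bronze, the raw landing zone; Silver, where data is cleaned, standardized, and enriched; and Gold, where it is aggregated and ready for business.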
Think of your data not as static files, but as raw material (like crude oil or iron ore) that must be refined through a series of stages before it is valuable to your business.
Before we look at the solution, let’s look at the problem.
The Medallion Architecture solves this by organizing your data into three distinct layers of quality: Bronze, Silver, and Gold.
This approach allows you to use the right tool for the right job: Wherobots for heavy industrial refining, and PostGIS for precision delivery.
The “Landing Zone”
In the spatial medallion architecture, the Bronze layer is the raw ingestion zone where all incoming data lands without transformation. This is the entry point for all your data. Whether it's real-time telemetry from 10,000 delivery trucks, daily dumps of satellite imagery, or messy spreadsheets from a partner, it all lands here first.
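To make "lands without transformation" concrete, here is a minimal sketch of a Bronze landing step in plain Python. It is an illustration, not Wherobots' or PostGIS's API: the `land_raw` function, the source labels, and the sample payloads are all hypothetical. The point it shows is that the payload is stored byte-for-byte, and only ingestion metadata is added around it.

```python
import hashlib
from datetime import datetime, timezone


def land_raw(payload: bytes, source: str) -> dict:
    """Wrap an incoming payload for the Bronze layer without parsing or altering it."""
    return {
        "source": source,  # hypothetical label for where the data came from
        "ingested_at": datetime.now(timezone.utc).isoformat(),
        "sha256": hashlib.sha256(payload).hexdigest(),  # lets later layers spot exact duplicates
        "payload": payload,  # kept byte-for-byte, even if it is malformed
    }


# Hypothetical inputs: one well-formed telemetry ping and one messy spreadsheet row.
bronze = [
    land_raw(b'{"truck_id": 17, "lon": -122.33, "lat": 47.61}', "truck-telemetry"),
    land_raw(b"store_id,sales\n42,not-a-number", "partner-spreadsheet"),
]
```

Nothing is validated or rejected at this stage. The broken spreadsheet row is kept as-is, so the original input can always be replayed if a later cleaning rule turns out to be wrong.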
Clean, Standardize, & Enrich
This is where the magic happens and where the heavy lifting is required. Raw data is rarely ready for business. It has duplicates, missing fields, or invalid geometries (like a building polygon that twists into itself).
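The three defects named above can be sketched in a few lines of plain Python. This is a toy, single-machine illustration under stated assumptions: the record shape (`building_id`, `footprint`) and the `to_silver` function are hypothetical, and the geometry check only catches edges that properly cross each other, not edges that merely touch or overlap. At real scale this work would normally be delegated to an engine's built-in validity functions, such as `ST_IsValid` in Apache Sedona or PostGIS.

```python
def _orient(a, b, c):
    """Sign of the cross product (b - a) x (c - a): 1 left turn, -1 right turn, 0 collinear."""
    v = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return (v > 0) - (v < 0)


def _properly_cross(p1, p2, p3, p4):
    """True when segments p1-p2 and p3-p4 cross at a point interior to both."""
    return (_orient(p1, p2, p3) * _orient(p1, p2, p4) < 0
            and _orient(p3, p4, p1) * _orient(p3, p4, p2) < 0)


def ring_self_intersects(ring):
    """Check an open ring (no repeated closing vertex) for crossing edges."""
    n = len(ring)
    edges = [(ring[i], ring[(i + 1) % n]) for i in range(n)]
    for i in range(n):
        for j in range(i + 2, n):  # skip neighbouring edges, which share a vertex
            if i == 0 and j == n - 1:
                continue  # first and last edges are neighbours too
            if _properly_cross(*edges[i], *edges[j]):
                return True
    return False


def to_silver(records):
    """Drop duplicates, rows with missing fields, and self-intersecting footprints."""
    seen, clean = set(), []
    for rec in records:
        ring = rec.get("footprint")
        if rec.get("building_id") is None or not ring or len(ring) < 3:
            continue  # missing fields
        if rec["building_id"] in seen or ring_self_intersects(ring):
            continue  # duplicate or invalid geometry
        seen.add(rec["building_id"])
        clean.append(rec)
    return clean


square = [(0, 0), (2, 0), (2, 2), (0, 2)]
bowtie = [(0, 0), (2, 2), (2, 0), (0, 2)]  # a polygon that twists into itself

records = [
    {"building_id": 1, "footprint": square},
    {"building_id": 1, "footprint": square},     # duplicate
    {"building_id": 2, "footprint": bowtie},     # invalid geometry
    {"building_id": None, "footprint": square},  # missing field
    {"building_id": 3, "footprint": [(5, 5), (6, 5), (6, 6)]},
]
silver = to_silver(records)
```

Here only buildings 1 and 3 survive. The pairwise edge check is quadratic in the number of vertices, which is fine for a sketch but is one reason this layer is described as the heavy lifting.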
AddressCloud runs property-level perils models for insurers, processing flood, fire, and climate risk data across millions of addresses. John Powell, Senior Geospatial Data Engineer at AddressCloud, describes what changed when they moved that workload into the Silver layer: “From a developer perspective, having data, algorithms and compute (and to be presented with a Spark/Sedona context in a Jupyter notebook on startup) combined in one platform is extremely powerful, comparable in many respects to Google Earth Engine, but with much greater guarantees of, and control over, job completion.”
The result: operations that previously took hours or days now complete in minutes, with no preprocessing step required to combine raster and vector data.
Aggregated & Ready for Business
This is the “Showroom” layer. This data is highly polished, aggregated, and formatted for specific business questions. For example: “Total Sales by Zip Code” or “Active Drivers by City.”
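As a rough illustration of what a Gold table such as "Total Sales by Zip Code" contains, the sketch below rolls cleaned Silver rows up into one value per zip code. The row shape (`zip_code`, `sales`) and the sample values are hypothetical, and in practice the aggregation would run in the processing engine before the result is loaded into the serving database.

```python
from collections import defaultdict


def total_sales_by_zip(silver_rows):
    """Aggregate cleaned rows into one total per zip code, sorted by zip code."""
    totals = defaultdict(float)
    for row in silver_rows:
        totals[row["zip_code"]] += row["sales"]
    return dict(sorted(totals.items()))


# Hypothetical Silver rows: already deduplicated and validated upstream.
silver_rows = [
    {"zip_code": "98101", "sales": 120.0},
    {"zip_code": "98101", "sales": 80.5},
    {"zip_code": "97301", "sales": 42.0},
]
gold = total_sales_by_zip(silver_rows)
```

The Gold result is small and shaped around a single business question, so an application can read it with a simple lookup instead of scanning the underlying rows.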
By adopting this strategy, you create a data supply chain that maximizes the strengths of every tool: Wherobots for large-scale refining and PostGIS for precision delivery.
The spatial medallion architecture is not a tool choice. It is a pipeline pattern that assigns the right tool to the right job. If you are a leader looking to modernize your geospatial stack, don’t look for a “PostGIS replacement.” Look for a partner.
This hybrid approach, the Spatial Medallion Architecture, is how modern organizations turn location data into competitive advantage.

This is part three of a series. The prior posts cover PostGIS vs Wherobots for spatial data lakehouses and spatial database cost comparisons.