In the world of data architecture, there is a dangerous myth that you have to choose “one tool to rule them all.” We often see organizations paralyzed by the debate: “Should we use a Database or a Data Lake?”
A spatial data pipeline architecture built for both large-scale analytics and operational queries is one of the harder infrastructure decisions a geospatial team makes. Most organizations try to solve it with a single tool. That is where the problems start.
The teams that get this right do not choose between a data lake and a spatial database. They use both, in a defined sequence known as the Geospatial Medallion Architecture. The medallion architecture, a data pipeline pattern established in the data lakehouse ecosystem, organizes data into three progressive quality layers: Bronze, Silver, and Gold. Wherobots and PostGIS each own a distinct role in that pipeline.
Wherobots is a cloud-native spatial analytics platform, built by the original creators of Apache Sedona, that processes large-scale geospatial datasets using distributed compute. PostGIS is an open-source spatial extension for PostgreSQL that adds support for geographic objects and enables location-based queries in SQL.
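The spatial medallion architecture organizes geospatial data into three layers: Bronze (the raw landing zone), Silver (cleaned, standardized, and enriched), and Gold (aggregated and ready for business).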
Think of your data not as static files, but as raw material (like crude oil or iron ore) that must be refined through a series of stages before it is valuable to your business.
Before we look at the solution, recall the problem: a single tool is asked to serve both large-scale analytics and operational queries.
The Medallion Architecture solves this by organizing your data into three distinct layers of quality: Bronze, Silver, and Gold.
This approach allows you to use the right tool for the right job: Wherobots for heavy industrial refining, and PostGIS for precision delivery.
The “Landing Zone”
In the spatial medallion architecture, the Bronze layer is the raw ingestion zone where all incoming data lands without transformation. This is the entry point for all your data. Whether it's real-time telemetry from 10,000 delivery trucks, daily dumps of satellite imagery, or messy spreadsheets from a partner, it all lands here first.
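To make "lands without transformation" concrete, here is a minimal sketch of a Bronze landing step. It is an illustration, not the Wherobots ingestion API: the `land_raw` helper, the in-memory list standing in for object storage, and the field names `source`, `ingested_at`, and `payload` are all assumptions made for this example.

```python
from datetime import datetime, timezone

def land_raw(bronze, raw_text, source):
    """Append one incoming record to the Bronze store exactly as received.

    `bronze` is a plain list standing in for object storage (an assumption).
    The payload stays an unparsed string so nothing is lost or altered.
    """
    bronze.append({
        "source": source,  # where the record came from
        "ingested_at": datetime.now(timezone.utc).isoformat(),
        "payload": raw_text,  # untouched raw content
    })
    return bronze

bronze = []
land_raw(bronze, '{"truck_id": 17, "lat": 47.61, "lon": -122.33}', "telemetry")
land_raw(bronze, "store,zip,sales\nA,98101,1200", "partner_csv")
# Malformed input still lands; validation is deferred to the Silver layer.
land_raw(bronze, '{"truck_id": 17, "lat": "bad"', "telemetry")
```

The design choice worth noting is that nothing is parsed or rejected here. Keeping the original payload means a later stage can be rerun against the same raw input if the cleaning rules change.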
Clean, Standardize, & Enrich
This is where the magic happens and where the heavy lifting is required. Raw data is rarely ready for business. It has duplicates, missing fields, or invalid geometries (like a building polygon that twists into itself).
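The three defects named above (duplicates, missing fields, and a polygon that twists into itself) can be sketched in plain Python. This is a toy illustration under stated assumptions: the record shape (`id`, `ring`) and the `refine` helper are hypothetical, and the check only detects edges that properly cross, not rings that merely touch or overlap. In practice this work would normally be done with engine functions such as `ST_IsValid` and `ST_MakeValid`, which both PostGIS and Apache Sedona provide.

```python
def _orient(a, b, c):
    """Sign of the cross product (b - a) x (c - a): 1 left, -1 right, 0 collinear."""
    v = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return (v > 0) - (v < 0)

def _properly_cross(p1, p2, q1, q2):
    """True when segments p1-p2 and q1-q2 cross at a single interior point."""
    return (_orient(p1, p2, q1) * _orient(p1, p2, q2) < 0
            and _orient(q1, q2, p1) * _orient(q1, q2, p2) < 0)

def ring_self_intersects(ring):
    """Check a closed ring (first point equals last point) for crossing edges."""
    edges = list(zip(ring[:-1], ring[1:]))
    n = len(edges)
    for i in range(n):
        for j in range(i + 2, n):  # skip adjacent edges, which share a vertex
            if i == 0 and j == n - 1:
                continue  # first and last edges also share a vertex
            if _properly_cross(*edges[i], *edges[j]):
                return True
    return False

def refine(records):
    """Return Silver records: complete, unique by id, and free of crossing edges."""
    seen, silver = set(), []
    for rec in records:
        if rec.get("id") is None or not rec.get("ring"):
            continue  # missing required field
        if rec["id"] in seen:
            continue  # duplicate
        if ring_self_intersects(rec["ring"]):
            continue  # invalid geometry; a real pipeline might repair it instead
        seen.add(rec["id"])
        silver.append(rec)
    return silver

square = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]
bowtie = [(0, 0), (2, 2), (2, 0), (0, 2), (0, 0)]  # twists into itself

raw = [
    {"id": "b1", "ring": square},
    {"id": "b1", "ring": square},   # duplicate of the first record
    {"id": "b2", "ring": bowtie},   # self-intersecting polygon
    {"id": None, "ring": square},   # missing identifier
    {"id": "b3", "ring": square},
]
silver = refine(raw)
```

Only `b1` and `b3` survive. The pairwise edge check is quadratic in the number of vertices, which is part of why this stage benefits from distributed compute once it has to run over millions of geometries.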
AddressCloud runs property-level perils models for insurers, processing flood, fire, and climate risk data across millions of addresses. John Powell, Senior Geospatial Data Engineer at AddressCloud, describes what changed when they moved that workload into the Silver layer: “From a developer perspective, having data, algorithms and compute (and to be presented with a Spark/Sedona context in a Jupyter notebook on startup) combined in one platform is extremely powerful, comparable in many respects to Google Earth Engine, but with much greater guarantees of, and control over, job completion.”
The result: operations that previously took hours or days now complete in minutes, with no preprocessing step required to combine raster and vector data.
Aggregated & Ready for Business
This is the “Showroom” layer. This data is highly polished, aggregated, and formatted for specific business questions. For example: “Total Sales by Zip Code” or “Active Drivers by City.”
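As a rough sketch of what a Gold table such as “Total Sales by Zip Code” amounts to, the snippet below rolls cleaned rows up to one figure per zip code. The row shape (`zip`, `sales`), the sample values, and the `total_sales_by_zip` helper are assumptions made for illustration; a real pipeline would express this as a grouped query in the engine of choice.

```python
from collections import defaultdict

def total_sales_by_zip(silver_rows):
    """Aggregate cleaned Silver rows into one total per zip code."""
    totals = defaultdict(float)
    for row in silver_rows:
        totals[row["zip"]] += row["sales"]
    # Sort by zip code so the Gold output has a stable order.
    return dict(sorted(totals.items()))

# Illustrative Silver rows, not real data.
silver_rows = [
    {"zip": "98101", "sales": 1200.0},
    {"zip": "98101", "sales": 300.0},
    {"zip": "10001", "sales": 450.0},
]
gold = total_sales_by_zip(silver_rows)
```

The output is small and shaped around one business question, which is what makes it a good fit for an operational database that serves dashboards and applications.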
By adopting this strategy, you create a data supply chain that maximizes the strengths of every tool: Wherobots for heavy industrial refining, and PostGIS for precision delivery.
The spatial medallion architecture is not a tool choice. It is a pipeline pattern that assigns the right tool to the right job. If you are a leader looking to modernize your geospatial stack, don’t look for a “PostGIS replacement.” Look for a partner.
This hybrid approach, the Spatial Medallion Architecture, is how modern organizations turn location data into competitive advantage.
This is part three of a series. The prior posts cover PostGIS vs Wherobots for spatial data lakehouses and spatial database cost comparisons.