
Wherobots Spatial Intelligence Engine Integrates with Databricks Unity Catalog for Spatial Data


TL;DR: Wherobots now integrates with Databricks Unity Catalog, enabling users to process spatial data up to 20x faster with 60% cost savings. This integration supports raster/vector data, 300+ spatial functions, and enterprise security—all while maintaining full compatibility with Apache Sedona and Spark.

Databricks Geospatial Performance with Wherobots

  • 5-20x faster spatial query performance
  • 60% cost reduction on spatial workloads
  • 300+ spatial SQL/Python/Scala functions
  • 100% Apache Sedona compatibility (zero code changes)

Databricks users can now enhance their geospatial analytics capabilities with Wherobots, a spatial intelligence engine purpose-built for processing data from the physical world. Wherobots brings advanced raster processing, computer vision ML inference, and industry-leading performance to your existing Databricks environment.

With Wherobots Data Federation for Unity Catalog, you can:

  • Expand spatial coverage to grow revenues, improve margins, and make better decisions with complete spatial intelligence
  • Create innovative spatial data products that leverage aerial imagery, IoT sensors, and mobility data at planetary scale
  • Build with raster, vector, and tabular data using familiar SQL, Python, or Scala interfaces
  • Run computer vision ML models on geo-imagery and sensor datasets from local to continental scale
  • Migrate existing workflows with zero code changes—WherobotsDB is fully compatible with Apache Sedona and Spark

Customers like Dotlas, Leaf Agriculture, and Overture are achieving step-function improvements in performance, cost efficiency, and innovation by integrating Wherobots with their Databricks platforms.

Wherobots Data Federation connects directly to Unity Catalog, allowing you to read from and write to Iceberg or Delta tables with Databricks service principal and OAuth or PAT token authentication—no data migration required.


What is Spatial Intelligence?

Spatial intelligence is the understanding of features of interest and their relationships across space and time in multi-dimensional environments. It forms a bridge between the digital and physical worlds, improves decision making, and lets you create better products or services with higher returns.

  • Simple spatial intelligence: Customer visits grouped by location or area. Simplified formats like aggregations inherently lack precision and force tradeoffs between cost and resolution, both of which limit their usefulness. These aggregations are typically grouped by cells in a grid (like H3) and are most commonly used to create visualizations. Visualizations can support a decision, intuition, or analysis, but they are generally less actionable because precision or other context is missing.
  • Complete spatial intelligence: A forecast, or a composite risk, opportunity, or value score, associated with potentially millions of specific assets or features across a continent and derived from any valuable combination of IoT, location, building, weather, road network, terrain, crop, parcel, aerial imagery, BI, or mobility datasets. Processing features of interest from multiple perspectives forms a complete picture that is highly actionable. But even the most popular data platforms still don’t make it easy to create.
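
To make the cost and resolution tradeoff of the simple form concrete, here is a minimal sketch in plain Python. It is not Wherobots or H3 code: it swaps in a square latitude/longitude grid for H3's hexagonal cells, and the function names and visit coordinates are hypothetical. It shows how a coarse grid merges two nearby stores into one count, while a finer grid separates them at the price of more cells to store and process.

```python
from collections import Counter
from math import floor

def grid_cell(lon, lat, cell_size_deg):
    """Map a point to the (column, row) index of the square cell containing it."""
    return (floor(lon / cell_size_deg), floor(lat / cell_size_deg))

def aggregate_visits(points, cell_size_deg):
    """Count points per grid cell: the 'simple' aggregated form."""
    return Counter(grid_cell(lon, lat, cell_size_deg) for lon, lat in points)

# Hypothetical visit locations (lon, lat); stores A and B sit about 100 m apart.
visits = [
    (-122.4013, 37.7903),  # store A
    (-122.4014, 37.7904),  # store A
    (-122.4004, 37.7908),  # store B
    (-122.3052, 37.8052),  # elsewhere in the city
]

coarse = aggregate_visits(visits, 0.01)    # cells of roughly 1 km
fine = aggregate_visits(visits, 0.0005)    # cells of roughly 50 m

print(len(coarse), "coarse cells:", dict(coarse))
print(len(fine), "fine cells:", dict(fine))
```

With the coarse grid, stores A and B fall into the same cell, so the aggregate can no longer tell them apart. The fine grid recovers the distinction, but the number of cells grows quickly as resolution increases.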

Wherobots enables complete spatial intelligence at scale with Databricks Unity Catalog integration.

Why Do Traditional Data Platforms Struggle with Geospatial Data?

Many cloud data engines and warehouses support the simple form described above, including BigQuery, Snowflake, and now Databricks with its Spatial SQL support in preview. However, they still lack the feature and data type support, the query price-performance at scale, and the solution expertise you may need to create the complete form of spatial intelligence. Here’s why.

Shaped by demand, most data platforms were first designed to handle the structured data exhaust from the web and devices connected to it, not data collected from or about the physical world.

Physical world data is inherently complex, unstructured, and doesn’t fit neatly into key-based joins. It takes a specialized compute engine to make it easy to create solutions from spatial data. Here, easy means the engine can fuse and transform various spatial and non-spatial data types with high accuracy, scale, and performance at low cost, while keeping development productive for the teams you have.
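
As an illustration of why spatial data resists key-based joins, the sketch below joins hypothetical sensors to hypothetical parcels that share no key, only geometry. It is plain Python with a textbook ray-casting point-in-polygon test and a naive nested loop, not how Wherobots or Apache Sedona implement spatial joins. All names and coordinates are made up for the example.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon.

    polygon is a list of (x, y) vertices; the last vertex connects to the first.
    Points exactly on an edge are not handled specially in this sketch.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray from (x, y) toward +x cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def spatial_join(points, parcels):
    """Pair each point id with the ids of parcels containing it (nested loop)."""
    return [
        (point_id, parcel_id)
        for point_id, (x, y) in points.items()
        for parcel_id, polygon in parcels.items()
        if point_in_polygon(x, y, polygon)
    ]

# Hypothetical data: there is no shared column to join on, only geometry.
parcels = {
    "parcel_1": [(0, 0), (4, 0), (4, 4), (0, 4)],
    "parcel_2": [(5, 0), (9, 0), (7, 5)],
}
sensors = {"s1": (1, 1), "s2": (7, 2), "s3": (10, 10)}

matches = spatial_join(sensors, parcels)
print(matches)
```

A hash join on a key touches each row roughly once, whereas this naive version evaluates a geometric predicate for every point and polygon pair. Spatial engines avoid that cost with spatial partitioning and indexes, which is part of what a specialized engine contributes.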

The spatial extensions and APIs for today’s big data engines and warehouses provide limited support for simple workloads. But because of design bottlenecks, missing features, and limited technical support, spatial solutions on these platforms can be expensive and difficult to build, and ambitious ideas stay out of reach.

What Makes a Modern Spatial Intelligence Solution Effective?

Ideally, the solution for creating complete spatial intelligence fits into your existing software development workflows, already supports your future needs and the data you want to use, is accessible to the teams you have, and performs at the right scale, on demand, at a cost that encourages innovation. It’s lakehouse-ready, so you don’t need to move your data or adopt proprietary formats or data types to use it. You also have dedicated expertise within reach to unblock innovation.

With this capability at your fingertips, ideas can flow and innovation takes place. Your business can reach higher levels of efficiency, reducing costs, carbon footprint, and risk. You can speed up deliveries or pickups, increase the effectiveness of CAPEX, improve consistency, grow revenue, and build in ways that were thought to be impossible.

This capability is Wherobots, and it’s directly available to Databricks users via data federation with Databricks Unity Catalog.

How Wherobots Solves Geospatial Data Challenges

Using Wherobots, you can build a complete picture of what’s happened over space and time, and integrate this intelligence into your Databricks data platform to drive growth, faster and at a lower cost than ever. Our mission is to make spatial data easy to use, and it’s all we focus on. The results of that focus speak for themselves.


Databricks Unity Catalog Geospatial Integration: Key Wherobots Features

Wherobots makes it easy and economical to produce local to planetary scale data solutions that rely on any combination of aerial and overhead imagery, IoT and mobility data, ground truth datasets, and your own business context. And using Wherobots Data Federation with Unity Catalog, you can integrate the data products you build with Wherobots into your Databricks data platform while retaining custody and governance of your data.

What Types of Spatial Data Does Wherobots Support? (Raster & Vector)

Wherobots supports two main classes of spatial data: raster and vector. You also get the support and scale you’d expect for tabular data operations from Wherobots’ Spark-compatible engine.

  • Raster data is typically a collection of sensor or imagery data, where each pixel in the image represents information about what is being captured, such as temperature, elevation, or infrared spectrum. File formats include GeoTIFF, Zarr, NetCDF, and more. Raster datasets commonly range from GBs to TBs in scale.
  • Vector data is a collection of multi-dimensional geometries or geographies that represent the trajectory, shape, elevation, and location of things. They can be trips, points, and outlines of features like buildings or parcel and crop boundaries. File formats include GeoParquet (with geospatial types soon to be supported natively in Parquet), Shapefiles, and GeoJSON.
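
The following plain-Python sketch illustrates how the two classes differ and how they combine. The raster is a small grid of hypothetical elevation values, the vector feature is a hypothetical field boundary simplified to an axis-aligned box, and `zonal_mean` is an illustrative helper, not a Wherobots function. It averages the pixels whose centres fall inside the boundary, a simplified form of what is often called a zonal statistic.

```python
# A tiny raster: each cell holds one measurement (hypothetical elevation in metres).
# Row 0 is the top of the image; each pixel covers a 1 x 1 unit square.
raster = [
    [10, 12, 14, 16],
    [11, 13, 15, 17],
    [12, 14, 16, 18],
]

# A vector feature: a hypothetical field boundary given as an axis-aligned box
# (min_x, min_y, max_x, max_y) in the same pixel coordinate space.
field_box = (1, 0, 3, 2)

def zonal_mean(raster, box):
    """Average the pixels whose centres fall inside the box, or None if none do."""
    min_x, min_y, max_x, max_y = box
    values = [
        value
        for row_idx, row in enumerate(raster)
        for col_idx, value in enumerate(row)
        if min_x <= col_idx + 0.5 <= max_x and min_y <= row_idx + 0.5 <= max_y
    ]
    return sum(values) / len(values) if values else None

mean_elevation = zonal_mean(raster, field_box)
print(mean_elevation)
```

Real workloads replace the box with arbitrary polygons and the nested list with multi-band imagery tiles, but the shape of the operation is the same: vector geometry selects which raster pixels contribute to a per-feature value.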

Performance Benchmarks: Up to 20x faster queries, 60% cost savings

Customers like Leaf Agriculture, Dotlas, Overture, and others have compared the price-performance of Wherobots for their spatial data workloads against managed Spark services and other leading data platforms. On the Professional Edition, they self-report up to 20x better performance (5x-20x is typical) with on-demand savings reaching as high as 60%, and even higher savings on the Enterprise Edition.

Data teams are also less limited by scaling bottlenecks. This becomes apparent when workloads finish faster on smaller WherobotsDB runtimes and customers realize they have significant headroom to scale well past their existing needs.

“Previously, our data volumes and processing requirements were increasing faster than we could keep up with, burdening our team with costly rebuilds. Now with Wherobots, not only can we easily scale to millions of acres, we also can rest assured that our costs won’t spiral out of control.”

– G. Bailey Stockdale, CEO Leaf Agriculture

These results are a function of specialization and a company-wide focus: WherobotsDB was built first for processing spatial data. This intentional design avoids the typical performance bottlenecks and complexities present in leading data platforms and warehouses, which were first designed for purposes unrelated to processing spatial data.

While the quotes from our customers matter the most, we also know how important open performance benchmarks are. Currently there are no reputable spatial query benchmarks, which makes query performance hard to compare across platforms without trials. It’s also hard to claim progress on performance when standards have not been established. We’re working on this too: we will soon release a new open source spatial query benchmarking framework for Apache Sedona, along with spatial query performance results for query engines, data warehouses, and data platforms.

We already have preliminary results that compare WherobotsDB to Apache Sedona on various managed Spark engines, along with query performance from engines with Spatial SQL APIs. Contact us and we can share these results.

Security & Compliance: Enterprise Grade Data Protection

Wherobots is serverless and built for data security first. There’s no infrastructure to manage, although customers can also choose to run Wherobots in their AWS VPC for maximum control.

Apache Sedona Expertise: Built by the Original Creators

Wherobots was founded by the original creators of Apache Sedona, the most widely used geospatial extension for Apache Spark and Databricks. With decades of research and experience with spatial data, open source, and cloud-scale systems, our product and team are ready to support Databricks customers’ solutions on the lakehouse.

Our team also leads geospatial modernization efforts in open source. Wherobots has supported GEO types for years with our Havasu table format. But rather than keeping this support in-house, we decided these types would better serve the physical world in the open, so we proactively drove support for them in Apache Iceberg and Parquet.

When to Use Databricks Native Spatial SQL vs. Wherobots

Choose Databricks Native Spatial SQL when you need:

  • Exploratory geospatial analysis
  • Basic spatial joins (ST_Intersects, ST_Contains, ST_Distance)
  • Standard point-in-polygon queries
  • Simple location-based aggregations
  • No raster or satellite imagery processing

Choose Wherobots for Databricks when you need:

  • Production spatial intelligence workloads at scale
  • Advanced raster and vector data processing
  • Computer vision ML inference on satellite/aerial imagery
  • Performance optimization (5-20x faster queries)
  • Cost reduction (up to 60% savings on spatial compute)
  • Planetary-scale datasets (TB to PB range)
  • Complex spatial ETL pipelines
  • Apache Sedona compatibility for existing workflows

Get Started with Wherobots x Databricks Geospatial Analytics

The lakehouse gives you the ability to choose the product best suited for the job. Don’t settle for the simple form of spatial intelligence or what the default provider offers when the complete form is within reach, with better economics, scale, capability, performance, and support.

By integrating Wherobots into your Databricks workflows, you can reduce costs, improve operations, and realize new innovations powered by data from the physical world.

Ready to enhance your Databricks geospatial capabilities?

Processing spatial data in Databricks? Check out this spatial query benchmark on Databricks with SpatialBench.

FAQ

What is the difference between simple and complete spatial intelligence?

Simple spatial intelligence aggregates location data into grid-based summaries such as H3 cells. These are useful for visualization but limited in precision and actionability because simplification requires tradeoffs between cost and resolution. Complete spatial intelligence derives composite risk scores, forecasts, or opportunity scores for specific assets across a region by fusing multiple data sources including imagery, IoT, mobility, weather, and parcel data. Wherobots is built to produce the complete form at scale inside your existing Databricks environment.

Why do traditional data platforms struggle with geospatial data?

Most data platforms were designed for structured web and device data, not physical world data. Spatial data is inherently complex and does not fit neatly into key-based joins. The spatial extensions in platforms like BigQuery, Snowflake, and Databricks cover basic spatial SQL but lack the raster processing, advanced spatial functions, and query planner optimizations needed for production-scale spatial workloads.

When should I use Databricks native spatial SQL instead of Wherobots?

Databricks native spatial SQL is a good fit for exploratory analysis, basic spatial joins, standard point-in-polygon queries, and simple location-based aggregations where no raster or imagery processing is needed. Wherobots is the better choice for production spatial workloads at scale, raster and vector data processing, computer vision ML inference on satellite imagery, complex spatial ETL pipelines, and workloads where query performance and cost matter at the terabyte to petabyte range.

What is Wherobots Data Federation with Unity Catalog?

Wherobots Data Federation is the integration layer that connects Wherobots Cloud directly to Databricks Unity Catalog. It allows Wherobots to read from and write to Iceberg or Delta tables governed by Unity Catalog using Databricks service principal and OAuth or PAT token authentication, without requiring data migration or duplicating governance controls.

Who built Wherobots and what is the connection to Apache Sedona?

Wherobots was founded by the original creators of Apache Sedona, the most widely used geospatial extension for Apache Spark and Databricks. Wherobots is the managed cloud platform built on top of Apache Sedona, extending it with enterprise performance optimizations, raster processing, and GeoAI capabilities while maintaining full API compatibility so existing Sedona workloads run on WherobotsDB without code changes.