TCO Savings vs Databricks — Pricing Matrix
Controls
Price per core / hr: $0.04
Speedup vs bare Sedona: 2.0×
Cluster size (cores): 30
Include engineering overhead (+$0.01/core/hr)
DBRX TCO/hr
WBDB total/hr
Savings
Breakeven speedup
Savings % — speedup (rows) × price/core (cols) ■ 40%+ ■ 25–40% ■ 10–25% ■ 0–10% ■ Negative
Price sensitivity at 2.0x speedup
EC2 M7i + EBS
$0.21/hr per 4-vCPU VM · US-West-2
Databricks Standard DBU
$0.126 + $0.0315/hr per VM
Click any heatmap cell
Sets price and speedup sliders to that cell's values
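The matrix above can be approximated with a small model. This is a sketch under stated assumptions, not Wherobots' published formula: it assumes both options run on the same Databricks cluster (EC2 + DBU per core-hour), that WherobotsDB adds its license fee per core-hour but finishes the same work in 1/speedup of the time, and that the ~$0.039/core/hr DBU rate follows from the $0.126 + $0.0315 per-VM figures divided by 4 vCPUs. All function names are illustrative.

```python
def hourly_costs(cores, ec2, dbu, wb_license, speedup, eng_overhead=0.0):
    """Per-hour cost of the same workload on bare Sedona vs WherobotsDB.

    Assumption: a speedup of S means the job consumes 1/S of the billed
    core-hours, so the effective hourly spend shrinks by S.
    """
    dbrx = cores * (ec2 + dbu)                                  # bare Sedona on Databricks
    wbdb = cores * (ec2 + dbu + wb_license + eng_overhead) / speedup
    return dbrx, wbdb

def savings_pct(cores, ec2, dbu, wb_license, speedup, eng_overhead=0.0):
    dbrx, wbdb = hourly_costs(cores, ec2, dbu, wb_license, speedup, eng_overhead)
    return 100.0 * (1.0 - wbdb / dbrx)

def breakeven_speedup(ec2, dbu, wb_license, eng_overhead=0.0):
    # Savings hit zero when (ec2 + dbu + license) / S == ec2 + dbu.
    return (ec2 + dbu + wb_license + eng_overhead) / (ec2 + dbu)

# Default sliders: 30 cores, $0.053 EC2/core/hr, ~$0.039 DBU/core/hr,
# $0.04 WherobotsDB/core/hr, 2.0x speedup.
print(round(savings_pct(30, 0.053, 0.039, 0.04, 2.0), 1))   # -> 28.3
print(round(breakeven_speedup(0.053, 0.039, 0.04), 2))      # -> 1.43
```

Clicking a heatmap cell is equivalent to calling `savings_pct` with that cell's speedup and price values.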
Feature Boundary — BYOS vs Cloud
Policy: Every feature request must pass one question — does this belong on-prem or in cloud? The default is cloud. The runtime gets a feature only when it has no cloud equivalent or is a prerequisite for the burst pathway. The feature boundary owner has authority to reject any runtime feature that reduces cloud migration incentive.
On-Prem BYOS
  • 350+ spatial functions (vector + raster)
    Standard Spark UDFs — ST_, RS_ functions, spatial joins, distance, network analysis
  • OutDB raster processing
    Query large rasters in S3/STAC without full ingestion. Unique BYOS capability. Minutes vs hours.
  • 2x speedup over bare Sedona
    On Databricks Runtime (DBRX). Validated via SpatialBench.
  • Sedona API compatibility
    Drop-in replacement. Existing workloads run without code changes.
  • Spatial join optimization
    Optimizer rule injection via injectOptimizerRule, injectPlannerStrategy
  • GeoParquet + Havasu format
    Data Source V2 connectors, spatial indexing, format support
  • Embedded metering agent
    Cryptographic telemetry, per-core consumption billing via Wherobots backend
  • Pro subscription token auth
    License management, periodic phone-home validation
Wherobots Cloud Only
  • RasterFlow — satellite imagery ML inference
    Planetary-scale raster AI. No equivalent exists on-prem or in Databricks.
  • Wherobots Cloud — Rust engine
    3.3x faster than v1. Requires Spark internal modifications. Cloud-only by architecture.
  • MCP Server (GA)
    Natural language dataset discovery, query execution, code generation. Works with Claude, Cursor.
  • VS Code Extension (Preview)
    In-editor spatial AI. Local code, cloud execution. Single environment from exploration to prod.
  • CLI (Preview)
    Autonomous agent execution. Near-full API access: SQL sessions, job runs, logs, debugging.
  • Burst compute
    Elastic scale beyond on-prem cluster capacity. Callable via single API from runtime pipeline.
  • Spatial Catalog
    Dataset discovery, managed geospatial assets, trusted spatial datasets
  • Enterprise SSO, SLA, dedicated environments
    Compliance, audit logging, dedicated compute, guaranteed uptime
Wherobots spatial compute platform · Last updated April 2026

Spatial Compute Options — Three Paths to Enterprise Scale

Apache Sedona (OSS) is where most enterprise spatial workloads begin. As requirements grow — joins at scale, raster processing, KNN queries, production SLAs — teams face a choice: extend the existing Sedona environment with a drop-in enterprise runtime, or move to a fully managed spatial compute platform.

This tool compares three options across capabilities, performance, and total cost of ownership — so you can evaluate the right fit for your workloads, data residency requirements, and operational model.

The three options
Option A
Apache Sedona (OSS)
Widely used across enterprises. Free, Apache 2.0. Where most teams start.
338 spatial functions on any Spark environment. Built by the same team that builds WherobotsDB. Extends native capabilities significantly — but no automatic optimization, no out-of-database raster, no enterprise SLAs, and completes only 6 of 12 SpatialBench SF-1000 patterns.
338 functions · Free
Option B
WherobotsDB on Your Cluster
BYOS (Bring Your Own Spark) — a .jar on your existing Spark clusters. Built by the creators of Apache Sedona.
353 spatial functions, automatic query optimization, out-of-database raster and raster algebra, approximate KNN, and enterprise SLAs — all on your existing infrastructure. Governance unchanged. No data movement required. ~2× faster than Apache Sedona on identical infrastructure.
353 functions · 2× faster · Licensed
Option C
Wherobots Cloud
Fully managed spatial compute. Rust-native, Arrow-columnar engine.
Everything in Option B on the next-generation runtime — 3× faster, 20–30% better price-performance — plus RasterFlow managed Earth observation pipeline, spatial AI developer tools (VS Code Extension, MCP Server, CLI), and the complete spatial platform. Completes all 12 SpatialBench SF-1000 patterns.
353 functions · Fully managed
Need a specific ST_ or RS_ function? Wherobots can add new spatial functions — including custom ST_ vector and RS_ raster functions — quickly and on request for paying customers. If your workload depends on a function not currently in the library, reach out to discuss.
How to think about the options
The options represent a spatial maturity curve. Many enterprises today run Apache Sedona (Option A). As workloads grow, the question is which option removes the next bottleneck without unnecessary operational overhead.
Option B (BYOS) is the right step when Sedona's ceiling is a problem but managed cloud isn't an option yet. Same code, better performance, no infrastructure change — with a clear path to Option C.
Option C (Wherobots Cloud) is the full platform: managed infrastructure, RasterFlow for Earth observation at scale, purpose-built spatial AI developer tools, and the next-generation Rust runtime.
Key decision factors
Infrastructure control & data residency
If you must keep compute on your own infrastructure — for compliance, procurement, or architecture reasons — Option B (BYOS) delivers enterprise spatial without requiring a managed platform.
Raster and Earth observation workloads
Out-of-database raster is available in Options B and C. RasterFlow — managed end-to-end raster ingestion, mosaicking, and ML inference — is exclusive to Option C (Wherobots Cloud).
Spatial AI developer tools
VS Code Extension, MCP Server, and CLI for spatial AI coding are exclusive to Wherobots Cloud (Option C). These lower the skill bar for the whole team, not just GIS specialists.
What's in this tool
Feature Comparison
Side-by-side capability matrix across all three options, grouped by capability area. Includes SpatialBench SF-1000 results and a final summary of each option.
TCO Calculator
Interactive monthly cost comparison: Apache Sedona OSS vs WherobotsDB BYOS vs Wherobots Cloud. Adjustable for cluster size, hours, EC2 cost, engineering overhead, and licensing.
Function Reference
Searchable table of all 363 spatial functions across 32 categories, with per-platform availability for Apache Sedona, WherobotsDB BYOS, and Wherobots Cloud.
Feature Comparison — Apache Sedona · WherobotsDB · Wherobots Cloud · April 2026
Option A — Apache Sedona (OSS)
338 functions
Widely used across enterprises. Free, Apache 2.0.
Option B — WherobotsDB BYOS
353 · 2× faster
Bring Your Own Spark .jar on your existing clusters.
Option C — Wherobots Cloud
353 · 3× faster
Fully managed. All of Option B plus RasterFlow & AI tools.
Side-by-side comparison
Capability Option A — Apache Sedona (OSS) Widely used across enterprises Option B — WherobotsDB BYOS (Bring Your Own Spark) Option C — Wherobots Cloud ★ Greatest capability
Fit
Best for Teams starting with spatial analytics; proof-of-concept before licensing; Spark-native expertise and tolerance for manual tuning Enterprise spatial workloads where self-managed infrastructure must remain the compute platform; data residency requirements Maximum spatial performance; Earth observation workloads; teams wanting managed infrastructure and spatial AI developer tools
Economics & Infrastructure
Cost model Free (Apache 2.0) WherobotsDB license + your compute Wherobots Cloud pay-as-you-go
Infrastructure Customer-managed Customer-managed (drop-in .jar) Fully managed by Wherobots
Enterprise support Community only Dedicated SLAs Dedicated SLAs
Core Spatial Capabilities
Spatial functions 338 — full Apache Sedona function set 353 — all of Apache Sedona plus Wherobots-exclusive capabilities 353 — Cloud-optimized runtime
Query optimization
Auto join strategy, skew prevention, spatial acceleration
Manual tuning required Automatic join selection, dynamic optimization, spatial relationship acceleration Automatic + purpose-built engine optimizations
Spatial Joins at Scale: SpatialBench SF-1000 Results
See full SpatialBench results table below ↓
Completes Q1–Q5, Q7 (6 of 12).
Fails Q6, Q8–Q12 ↓
Completes all 12 SpatialBench patterns at SF-1000 Completes all 12 SpatialBench patterns at SF-1000
Approximate KNN
Find the N nearest features to each record
Not available — requires comparing every record against every other, which becomes impractical at enterprise data volumes Available Available + optimized runtime
Sedona API compatibility Is Sedona (the baseline) 100% compatible 100% compatible
Raster & Out-of-Database Processing
Out-of-database raster
On-demand pixel loading from S3/STAC — no full ingestion required
Must load full file into executor memory On-demand loading, intelligent caching, Cloud-Optimized GeoTIFF Full + managed runtime optimizations
Raster processing & map algebra
Zonal stats, raster algebra, raster-to-vector, multi-band
In-memory raster processing Full suite: zonal stats, raster algebra, raster-to-vector, spatial filter push-down Full suite + distributed raster tile generation
Earth Observation & Raster AI (RasterFlow) — Exclusive to Wherobots Cloud
Mosaic building from satellite / aerial imagery
Sentinel-2, NAIP, and custom sources across an area of interest
Cloud-only Private Preview
Computer vision model inference
Semantic segmentation, regression, patch-based processing at planetary scale
Cloud-only Private Preview
Built-in models (Model Hub)
Agricultural field mapping (Fields of the World), urban infrastructure (Tile2Net), canopy height (Meta CHM v1), rural roads (ChesapeakeRSC)
Cloud-only 4 built-in + BYOM
Bring your own PyTorch model Cloud-only
Vectorize model outputs → spatial analysis
Convert raster predictions to vector geometries for further spatial analysis
Cloud-only
Analytics & Intelligence
Geostatistics
DBSCAN clustering, Getis-Ord Gi* hotspot detection, outlier analysis
Available Available Built-in, distributed, purpose-built for scale
Location intelligence
Reverse geocoding (Overture Maps), isochrones, map matching
Optional add-on Included + PMTiles generation
Spatial AI Coding Tools — Exclusive to Wherobots Cloud
VS Code Extension (Spatial AI Coding Assistant)
AI-assisted spatial notebook development, job submission, workspace and cost management from within VS Code
Cloud-only All editions
MCP Server for Spatial AI
Natural language spatial data discovery and SQL generation. Works with Claude, Cursor, VS Code Copilot.
Cloud-only GA — Pro + Enterprise
CLI for spatial AI coding
Command-line interface for spatial workflow automation and AI coding support
Cloud-only
Performance
Performance vs Apache Sedona
Measured on identical infrastructure vs bare Sedona
Baseline ~2× faster on identical infrastructure ~3× faster (Rust-native, Arrow-columnar runtime)
Cloud-optimized runtime
Rust-native, Arrow-columnar, geometry as a first-class type
Not applicable Not applicable 3× faster queries, 20–30% better price-performance
Sources: docs.wherobots.com · wherobots.com/pricing · SpatialBench — April 2026
SpatialBench SF-1000 — 520 GB · 6 billion records · 12 OGC query patterns
Both WherobotsDB options complete all 12 patterns. Apache Sedona completes Q1–Q5 and Q7 (6 of 12) — the failing queries map directly to revenue-critical workloads: territory assignment, proximity scoring, cross-zone tracking, KNN routing, and boundary reconciliation.
Query Pattern description Wherobots Cloud (Option C) WherobotsDB BYOS (Option B) Apache Sedona OSS (Option A)
Q1–Q5, Q7 Standard filtering, aggregation, convex hull, basic joins Pass Pass Pass
Q6 Zone statistics aggregated within a search radius Pass Pass ✗ Fail
Q8 Radius-based spatial join — count nearby features per building Pass Pass ✗ Fail
Q9 Polygon-on-polygon overlap detection and IoU calculation Pass Pass ✗ Fail
Q10 Zone statistics computed via spatial join Pass Pass ✗ Fail
Q11 Cross-zone trip counting — two spatial joins per record Pass Pass ✗ Fail
Q12 KNN join — 5 nearest neighbors per record Pass Pass ✗ Fail
Q8/Q10 = territory and zone assignment · Q11 = cross-zone movement tracking · Q12 = nearest-facility routing · Q9 = boundary reconciliation and planned-vs-actual overlap.
Final summaries
Option A — Apache Sedona (OSS)
338 spatial functions. Free, Apache 2.0. Widely used across enterprises — built by the same team that builds WherobotsDB. Completes Q1–Q5 and Q7 of SpatialBench SF-1000 (6 of 12 patterns).
Where it stops
No auto optimization · No out-of-DB raster · No KNN · Fails Q6, Q8–Q12 · Community support only
Option B — WherobotsDB BYOS
A .jar on your existing clusters. 353 functions, automatic optimization, out-of-DB raster, enterprise SLAs. ~2× faster than bare Sedona on identical infrastructure. Completes all 12 SpatialBench SF-1000 patterns.
When to move to Option C
RasterFlow managed ML pipeline · Rust runtime performance · Spatial AI coding tools · Fully managed infrastructure
Option C — Wherobots Cloud
All of Option B on the next-generation Rust-native, Arrow-columnar runtime. 3× faster. 20–30% better price-performance. Fully managed. Complete spatial AI toolset and RasterFlow. All 12 SpatialBench patterns.
Exclusive capabilities
RasterFlow (mosaic, CV inference, Model Hub, BYOM) · VS Code Extension · MCP Server · CLI · Spatial Data Catalog
Case Studies — OSS-to-Cloud Strategy Precedents
Kong
Apache 2.0 gateway → Enterprise runtime + Konnect cloud
Most relevant
ARR at $100M
Dec 2023
Valuation
$2B
Total raised
$345M
Employees
~1,000
The model: Kong Gateway Enterprise = same binary as OSS with enterprise features unlocked by license key. Konnect cloud = hosted control plane, customer data planes run anywhere. Self-managed Enterprise first, cloud second — 3 years of self-managed revenue funded the cloud build.

Wherobots parallel: Direct analog. WherobotsDB BYOS = Kong Gateway Enterprise. Wherobots Cloud = Konnect. The hybrid deployment model is proven at scale.

Caution: Kong had 13× Wherobots' funding. Build the minimum viable version of this model, not Kong's full product surface.
Grafana Labs
OSS agent (Alloy) → cloud-only monetization
Hybrid model
ARR
$400M+
Users
20M
Monetized
~1%
Growth
69% YoY
The model: Grafana Alloy (on-prem agent) is free and open source. Zero direct revenue. All monetization through Grafana Cloud: metrics, logs, traces billed on consumption. The agent creates adoption gravity; the cloud captures revenue.

Wherobots difference: WherobotsDB BYOS is not free — it is metered. This is better than Grafana's model: Wherobots captures revenue at both surfaces, not only cloud. Structurally closer to Kong.

Key lesson: The cloud wins on operations cost, not features alone. Self-hosting the LGTM stack at scale requires 1–2 dedicated SREs. Operational burden drives migration.
Confluent
Kafka self-managed → cloud via Cluster Linking bridge
Bridge model
Cloud revenue share
55%+
Total subscription
~$1.1B
Cloud growth
44% YoY
Self-managed
Still growing
The bridge mechanism: Cluster Linking mirrors topics bidirectionally between on-prem Kafka and Confluent Cloud. Customers migrate gradually, consumers first, without full re-architecture. Cloud revenue grew without requiring migration as a hard prerequisite.

Wherobots parallel: The burst pathway from WherobotsDB BYOS to Wherobots Cloud is the equivalent of Cluster Linking — workloads call cloud for what on-prem cannot do (RasterFlow inference), creating progressive cloud dependency.

Key lesson: After a decade of effort, 45% of revenue is still self-managed. The bridge reduces migration risk; it does not guarantee migration speed.
MongoDB / Atlas
Self-managed → cloud via Atlas-only features
Feature gap model
Atlas revenue share
75%+
Growth in 3 years
50% → 75%
Self-managed trend
Declining
Driver
Feature gap
What drove migration: Atlas-exclusive features — Atlas Search, Vector Search, Stream Processing, App Services — had no self-managed equivalent. The migration tools (Live Migration, mongomirror) were free and frictionless, but they were pull mechanisms. The reason customers migrated was Atlas-only capabilities, not the bridge.

Wherobots parallel: RasterFlow, MCP Server, VS Code Extension, CLI — these are the Atlas-equivalent features. Every developer hour saved by the Spatial AI toolset is a reason to move workloads to cloud.

Key lesson: The feature gap is the real migration driver. The bridge lowers friction, but cloud-exclusive capabilities are what make customers want to cross.
Synthesis: The companies that successfully grew cloud revenue from OSS bases combined three things: a metered or structured commercial self-managed option (Kong, Confluent), a frictionless bridge mechanism (Confluent Cluster Linking, MongoDB Live Migration), and cloud-exclusive features that created a widening capability gap (Atlas Search/Vector Search, Confluent Flink). Wherobots has all three components designed in. The execution risk is Kong's: doing this at 41 employees instead of 1,000.
Break-Even Calculator — FTE Cost vs Customer Revenue
Engineering cost inputs
Engineers allocated: 2.0
Fully loaded cost / FTE: $215K
Revenue inputs
Base fee / customer: $300
Avg consumption / mo: $1.2K
Gross margin %: 80%
Monthly churn %: 2.5%
Annual eng. cost
Monthly ARPU / customer
Customers to break even
LTV (monthly GM ÷ churn)
Model notes: Break-even = annual engineering cost ÷ annual gross margin per customer. The chart x-axis scales to 1.5× the break-even point so the crossing is always visible. LTV = monthly gross margin ÷ monthly churn rate (simple churn model — does not account for expansion revenue). Scenario table scales consumption by a multiplier per scenario to show how blended ARPU shifts.
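The model notes translate directly into code. A minimal sketch using only the formulas stated above; `breakeven_customers` is an illustrative name, not part of any Wherobots tool.

```python
import math

def breakeven_customers(engineers, cost_per_fte, base_fee, consumption,
                        gross_margin, monthly_churn):
    """Break-even = annual eng cost / annual gross margin per customer.
    LTV = monthly gross margin / monthly churn (simple churn model,
    no expansion revenue)."""
    annual_eng_cost = engineers * cost_per_fte
    arpu = base_fee + consumption               # monthly revenue per customer
    monthly_gm = arpu * gross_margin            # monthly gross margin per customer
    customers = math.ceil(annual_eng_cost / (monthly_gm * 12))
    ltv = monthly_gm / monthly_churn
    return customers, ltv

# Defaults from the inputs above: 2.0 FTE at $215K, $300 base + $1.2K
# consumption, 80% gross margin, 2.5% monthly churn.
customers, ltv = breakeven_customers(2.0, 215_000, 300, 1_200, 0.80, 0.025)
print(customers, round(ltv))   # -> 30 48000
```

With the defaults, each customer contributes $1,200/mo of gross margin, so two engineers at $215K each require 30 customers to break even, and the simple-churn LTV is $48K per customer.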
Revenue needed vs customers (annual)
Scenario comparison
Scenario FTEs Eng Cost/yr Blended ARPU Customers needed
TCO Calculator — Apache Sedona vs WherobotsDB Bring Your Own Spark (BYOS) vs Wherobots Cloud
Model scope: Total cost of ownership across three deployment options for a production spatial workload. Includes infrastructure, licensing, and engineering time. Capital costs are one-time or annualized setup costs. Operational costs are recurring monthly. All figures are estimates — adjust inputs to match your environment.
Workload inputs
Cluster size (cores): 200
Hours running / month: 160
EC2 cost / core / hr (your infra): $0.053
Default: EC2 M7i ($0.21/hr ÷ 4 vCPUs ≈ $0.053/core/hr)
BYOS workload model
Engineering cost inputs
Fully loaded eng cost / yr: $215K
Sedona eng overhead (FTE equiv): 0.25
Spark upgrades, compatibility debugging, raster workarounds. Default 0.25 FTE.
BYOS eng overhead (FTE equiv): 0.05
Wherobots handles compatibility. Minimal internal overhead remains.
WherobotsDB Bring Your Own Spark (BYOS) pricing
Base subscription / mo: $300
BYOS price / core / hr: $0.04
Wherobots Cloud pricing
Source: wherobots.com/pricing
1 SU = 32 vCPUs (EC2 m7i.8xlarge equivalent = $1.61/hr EC2 on-demand).
Wherobots Cloud is all-in: no separate EC2 bill, and data transfer and managed storage are free.
Adjust the SU rate below to match your contracted or AWS Marketplace rate.
AWS Region (sets published SU rate)
Source: wherobots.com/pricing · 1 SU = 32 vCPU · Data transfer + storage free
Wherobots discount 0%
Applied to list SU rate. Enter any negotiated or promotional discount you have received from Wherobots.
AI tool impact — two separate effects
Time saved / existing developer 50%
MCP Server, VS Code Extension, and CLI reduce per-developer boilerplate time. Default 50%, based on user discussions.
New users enabled (workforce multiplier) 2x
AI tools lower the skill bar — non-spatial developers can now do spatial work. 2x = double the team that can contribute.
Platform substitution savings
Est. tools / services replaced / mo $0
Cloud replaces external tools Sedona teams typically pay for separately: ML model management platforms, catalog/governance tooling, observability, curated spatial data subscriptions, and user/access management. Default $0 — enter your estimate.
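One way to sanity-check the OSS and BYOS columns below is to reproduce them from the inputs above. This sketch holds cluster hours fixed across options (the ~2× BYOS speedup is not modeled as reduced core-hours, though it could be); parameter names and defaults mirror the stated inputs, not Wherobots' exact calculator.

```python
def monthly_tco(cores=200, hours=160, ec2_core_hr=0.053,
                eng_cost_yr=215_000, sedona_fte=0.25, byos_fte=0.05,
                byos_base=300, byos_core_hr=0.04):
    """Monthly TCO for Sedona OSS vs WherobotsDB BYOS on the same cluster."""
    core_hours = cores * hours
    infra = core_hours * ec2_core_hr                      # EC2 bill, both options
    sedona = infra + sedona_fte * eng_cost_yr / 12        # infra + 0.25 FTE overhead
    byos = (infra                                          # same EC2 bill
            + byos_base + core_hours * byos_core_hr        # licensing
            + byos_fte * eng_cost_yr / 12)                 # 0.05 FTE overhead
    return round(sedona, 2), round(byos, 2)

print(monthly_tco())   # -> (6175.17, 4171.83)
```

With the defaults, BYOS licensing ($1,580/mo) costs less than the engineering overhead it removes (~$3,583/mo of the 0.25 FTE Sedona burden), which is why the BYOS column comes out lower despite adding a license fee.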
Adjust inputs to see your recommendation
Apache Sedona Apache Sedona OSS BASELINE
Infra (EC2) / mo
Licensing / mo
$0
Eng overhead / mo
Total / mo
WherobotsDB BYOS ON-RAMP ~2× FASTER
Infra (EC2) / mo
Licensing / mo
Eng overhead / mo
Total / mo
Wherobots Cloud RECOMMENDED
SU compute / mo
Data transfer / storage
Free
Eng saved (AI tools)
Platform tools replaced
Output capacity vs OSS
Net cost / mo
AI tools (MCP Server, VS Code Extension, CLI) create two distinct effects: Existing developers spend less time writing boilerplate — that is the time-saved slider. More importantly, non-spatial developers can now do spatial work that previously required a geospatial specialist. That is the workforce multiplier: the total team that can contribute to spatial pipelines expands. AI assistants like Claude do not replace spatial developers. They remove the prerequisite of being one.
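One plausible way to combine the two effects described above, stated as an assumption rather than the calculator's published formula: time saved raises per-developer throughput as 1/(1 − saved), and the workforce multiplier scales the number of contributors independently.

```python
def combined_capacity(time_saved=0.50, workforce_mult=2.0):
    """Combined output capacity vs the OSS baseline team.

    Assumption: saving 50% of each developer's time doubles their
    throughput (same work in half the time); the workforce multiplier
    then scales how many people can contribute at all.
    """
    per_dev = 1.0 / (1.0 - time_saved)   # 50% saved -> 2x output per developer
    return per_dev * workforce_mult      # 2x developers * 2x output = 4x

print(combined_capacity())   # -> 4.0
```

Under this assumption the default sliders imply a 4× capacity multiplier, which is the figure the "OSS team required to match" panel would scale against.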
Combined capacity vs OSS team
Eng time saved / mo
Capacity-adjusted value / yr
Wherobots Cloud vs Sedona OSS
Direct monthly saving
Direct annual saving
Cost delta
Verdict
To match Wherobots Cloud output capacity, OSS would need:
FTE-equivalents of engineering + infra cost
vs. your current OSS team × combined capacity multiplier
Annual cost to replicate Wherobots Cloud capacity with OSS
OSS equivalent cost / yr
vs
Actual Cloud cost / yr
Capacity-adj. saving
BYOS vs Sedona OSS
Monthly saving
Annual saving
Cost delta
Verdict
Monthly cost breakdown — stacked by category
Infra (EC2) Licensing Eng overhead
What each option includes and hides
Sedona OSS
You manage all Spark infrastructure. Compatibility breakage on new cluster runtime versions requires internal debugging. Large raster workloads require full ingestion pipelines that add hours of preparation time. Engineering time spent on Spark maintenance is the hidden cost most teams underestimate.
BYOS
You still own the EC2 infrastructure. Wherobots handles compatibility across Spark runtime versions, eliminating most internal maintenance overhead. Out-of-database raster removes ingestion pipelines for raster workloads. Licensing cost is $300/mo base plus per-core consumption.
Wherobots Cloud
No EC2 to manage. SU billing is all-in: one rate covers compute, with data transfer and storage free. Spark maintenance overhead drops to zero. The Global Hub (Data Hub, Model Hub, Spatial Catalog) replaces external catalog, model management, and data procurement tools your Sedona team would otherwise source separately. MCP Server, VS Code Extension, and CLI lower the skill bar so non-spatial developers can contribute directly. Enter an estimate of platform substitution savings above to see the full cost picture.
Competitive analysis · 17 factors · 5 groups

BYOS vs. cloud-only — Moat Scorecard

Each factor is scored across 6 weighted dimensions (moat strength 25%, revenue quality 20%, strategic leverage 20%, execution risk 15%, IP defensibility 10%, cloud conversion 10%). Click any row to expand; click a group to filter.
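The weighted scoring can be sketched as follows. Only the dimension names and weights come from the description above; the per-dimension scores in the example are hypothetical.

```python
# Weights from the scorecard description (sum to 1.0).
WEIGHTS = {
    "moat_strength": 0.25, "revenue_quality": 0.20, "strategic_leverage": 0.20,
    "execution_risk": 0.15, "ip_defensibility": 0.10, "cloud_conversion": 0.10,
}

def factor_score(scores):
    """Weighted 1-5 score for one factor given per-dimension scores."""
    assert set(scores) == set(WEIGHTS)
    return sum(WEIGHTS[d] * s for d, s in scores.items())

# Hypothetical dimension scores for one factor under the BYOS model:
byos = {"moat_strength": 4, "revenue_quality": 4, "strategic_leverage": 5,
        "execution_risk": 2, "ip_defensibility": 2, "cloud_conversion": 4}
print(round(factor_score(byos), 2))   # -> 3.7
```

The deltas shown in the panels below would then be `factor_score(byos) - factor_score(cloud_only)` per factor, averaged per group for the group chart.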

Runtime avg
out of 5.00
Cloud-only avg
out of 5.00
Runtime leads on
factors by >0.25
Cloud-only leads on
factors by >0.25
Score by group
BYOS deployment
Cloud-only
4.0–5.0 Strong advantage
3.0–3.9 Moderate
2.0–2.9 Weakness
1.0–1.9 Critical
Factor
BYOS
Cloud-only
Delta
Verdict
BYOS wins on
Factors for the current fundraising objective
+1.30 Design partner acquisition
+1.25 Customer acquisition friction
+1.15 Sedona OSS control
+1.00 TAM addressability
+0.90 Gross margin profile
BYOS produces more logos faster, at better margin, from the Sedona community Wherobots already dominates. Right model for the fundraise.
Cloud-only wins on
Factors for long-term IP preservation
−1.95 Metering agent feasibility
−1.75 Feature boundary governance
−1.40 Reverse engineering exposure
−1.20 Managed Spark provider risk
−1.05 Engineering bandwidth
Cloud-only is cleaner and more defensible. Right model if the objective shifts to platform dominance and IP preservation over near-term adoption velocity.
The tie factor
Out-db raster capability itself
0.00 Identical under both models
The technical differentiation is the same regardless of deployment. The debate is how the capability reaches customers and what that exposes.

BYOS wins now. Cloud-only wins in 18–36 months.
A time-specific answer tied to the fundraising objective — not a permanent one.
Spatial Function Reference — 363 Functions Across 32 Categories
363 functions
Function Category Sedona BYOS Cloud Notes