23 Oct 2023

Wherobots Cloud: The Cloud-Native Spatial Analytics Data Platform

According to Gartner, 97% of the data collected by enterprises sits on the shelf without ever being put to use. That is a shockingly large number, especially given that the data industry got its hopes up a few years back when The Economist published its article “The most valuable resource is no longer oil, it’s data”. It is also surprising given the hundreds of billions of dollars invested in database and analytics platforms over the past two decades.

One main reason is that data professionals often struggle to connect data to use cases. A natural way to make that connection is to link data and insights to data about the physical world, a.k.a. “spatial data”, and then ask physical-world questions of that data, a.k.a. “spatial analytics”. This spatial approach can be an indispensable asset for businesses worldwide. Use cases range from determining optimal delivery routes, to making informed decisions about property investments, to climate and agricultural technology. For instance, commercial real estate data makes more sense when connected to spatial data about nearby objects (e.g., buildings, POIs), man-made events (e.g., crimes, traffic), and natural events such as wildfires and floods. The importance of understanding the ‘where’ cannot be overstated, especially when it influences operational efficiency, customer satisfaction, and strategic planning.

The significance of spatial analytics underscores the pressing need for its efficient management within the enterprise data stack. Incumbent data platforms, often not built to handle the intricacies and scale of spatial analytics, fall short in meeting these demands. Recognizing this gap, we introduce Wherobots Cloud, a novel spatial analytics database platform. Here is a summary of features supported:

Diagram titled WherobotsDB with five boxes, a large one spanning the bottom half and four smaller ones in a line across it. The top four boxes are labeled Spatial SQL, Spatial Python, Spatial R, and Java/Scala. The lower box contains two boxes labeled Havasu and DB Connectors.

Wherobots Key Features

Linking Enterprise Data to the Spatial World

Wherobots seamlessly incorporates spatial analytics in the enterprise data stack to bring data many steps closer to use cases. Using a scalable spatial join technology, Wherobots can link customer data stored anywhere to tens of terabytes of spatial data such as maps, roads, buildings, natural events, and man-made events in a few minutes. Users can then apply spatial data processing, analytics, and AI tasks using SQL and Python on their data with unparalleled efficiency and adaptability.

To get started with Wherobots, please visit the Wherobots website.

Scalability

With its scalable, distributed architecture, Wherobots is redefining the way businesses handle geometry and raster spatial data processing and analytics in the cloud. Wherobots achieves that in two main ways:

Separating Compute/Storage: Wherobots uniquely separates the spatial processing and analytics layer from the data storage layer. This approach allows for optimal performance and scalability.
Distributed System Architecture: By employing a distributed system architecture, Wherobots ensures scalable out-of-core spatial computation, catering to massive datasets without compromising speed or accuracy.
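
Spatial partitioning is one common building block of distributed spatial processing: data is split by location so that each piece can be handled by a separate worker. The plain-Python sketch below illustrates the idea with a uniform grid. It is a toy illustration under our own assumptions (the function names, the grid scheme, and the sample points are invented for this example) and does not describe how Wherobots partitions data internally.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Point = Tuple[float, float]  # (x, y), e.g. (longitude, latitude)
Cell = Tuple[int, int]       # (column, row) of a grid cell


def cell_of(point: Point, cell_size: float) -> Cell:
    """Return the grid cell that contains the point."""
    x, y = point
    return (int(x // cell_size), int(y // cell_size))


def partition_points(points: Iterable[Point], cell_size: float) -> Dict[Cell, List[Point]]:
    """Group points by grid cell; each group could be handed to a separate worker."""
    partitions: Dict[Cell, List[Point]] = defaultdict(list)
    for point in points:
        partitions[cell_of(point, cell_size)].append(point)
    return dict(partitions)


# Hypothetical sample data: four points spread over three grid cells.
points = [(0.5, 0.5), (1.5, 0.2), (0.1, 0.9), (-0.5, 0.5)]
for cell, members in sorted(partition_points(points, cell_size=1.0).items()):
    print(cell, len(members))
```

Because every point lands in exactly one cell, the partitions can be processed independently and their results combined afterwards. Geometries that span several cells (lines, polygons) need extra care that this sketch leaves out.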

Openness

Wherobots builds upon and amplifies the capabilities seen in the open-source Apache Sedona (OSS Sedona). While OSS Sedona provides foundational spatial analytics functions using spatial SQL and Python, Wherobots takes it to the next level with its faster query processing, lakehouse architecture, and its self-service yet fully-managed provisioning on Wherobots Cloud. This makes Wherobots a more comprehensive and streamlined solution for businesses. Based on our benchmarks, Wherobots is up to 10x faster than OSS Sedona for geometry data processing, and up to 20x faster than OSS Sedona for raster data processing.

Spatial Lakehouse Solution

One of the standout features of Wherobots is its support for an Apache Iceberg-compatible spatial table format, dubbed “Havasu”. This format enables efficient querying and updating of geometry and raster columns stored in Parquet files on cloud object stores such as AWS S3. It opens up spatial analytics on the sheer volume of data that has been dumped in cloud object stores and that, to this day, is seldom put to use. Details about the Havasu spatial table format are available here.
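
One general technique for querying many Parquet files efficiently is to keep a bounding box per file and skip every file whose box cannot overlap the query window. The sketch below shows that idea in plain Python. It is an illustration under our own assumptions, not a description of Havasu's internals: the `DataFile` class, the file names, and the coordinates are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple

# (min_x, min_y, max_x, max_y)
BBox = Tuple[float, float, float, float]


@dataclass
class DataFile:
    path: str   # hypothetical file name
    bbox: BBox  # bounding box of all geometries stored in the file


def bboxes_intersect(a: BBox, b: BBox) -> bool:
    """True if the two axis-aligned boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def prune_files(files: List[DataFile], query: BBox) -> List[DataFile]:
    """Keep only the files that could contain geometries inside the query window."""
    return [f for f in files if bboxes_intersect(f.bbox, query)]


# Hypothetical table made of three files covering different regions.
files = [
    DataFile("part-0.parquet", (-125.0, 32.0, -114.0, 42.0)),
    DataFile("part-1.parquet", (-80.0, 25.0, -70.0, 45.0)),
    DataFile("part-2.parquet", (0.0, 40.0, 10.0, 50.0)),
]
query_window = (-123.0, 37.0, -121.0, 38.5)
to_scan = prune_files(files, query_window)
print([f.path for f in to_scan])
```

Only the files that survive pruning need to be read, which is where most of the savings come from when a table spans thousands of files.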

Self-service

Wherobots is provisioned as a fully-managed service within the Wherobots Cloud, ensuring that users don’t have to delve into the intricacies of managing cloud or compute resources.

By delegating resource management to Wherobots, businesses can concentrate on their core spatial analytics tasks and achieve their objectives faster, more efficiently, and more cost-effectively.

To use Wherobots, you first need to create an account on Wherobots Cloud. To get started, please visit the Wherobots website.

Connectivity

Wherobots comes equipped with connectors for major data storage platforms and databases. These include cloud object stores, data warehouses like Snowflake and Redshift, lakehouses such as Databricks, and OLTP databases including Postgres / PostGIS.

Wherobots example usage

Using Wherobots, users can perform a plethora of spatial queries and analytics operations on their data. Here are some common operations users can invoke in Wherobots. For more details on these examples, please refer to the Wherobots documentation.

Insert geometry data

sedona.sql("""
INSERT INTO wherobots.test_db.test_table
VALUES (1, 'a', ST_GeomFromText('POINT (1 2)')), (2, 'b', ST_Point(2, 3))
""")

Insert external raster data

# Parentheses let the method chain span several lines in Python.
(
    sedona
    .sql("SELECT RS_FromPath('s3a://XXX.tif') as rast")
    .writeTo("wherobots.test_db.test_table")
    .append()
)

Create a spatial index

sedona.sql("CREATE SPATIAL INDEX FOR wherobots.db.test_table USING hilbert(geom, 10)")
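
For intuition about what a Hilbert-based index buys you, the plain-Python sketch below maps grid cells to their position along a Hilbert curve and sorts points by that position, so that points close in space tend to end up close in storage order. It is illustrative only and is not Havasu's implementation. The function names, the `order` parameter, and the longitude/latitude bounds are assumptions made for this example, and this post does not specify what the `10` in `hilbert(geom, 10)` means.

```python
from typing import List, Tuple


def hilbert_index(order: int, x: int, y: int) -> int:
    """Position of grid cell (x, y) along a Hilbert curve on a 2^order x 2^order grid."""
    n = 1 << order
    d = 0
    s = n >> 1
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve has the standard orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s >>= 1
    return d


def to_cell(value: float, low: float, high: float, order: int) -> int:
    """Map a coordinate in [low, high] to a grid cell number in [0, 2^order - 1]."""
    n = 1 << order
    cell = int((value - low) / (high - low) * n)
    return min(max(cell, 0), n - 1)


def sort_by_hilbert(points: List[Tuple[float, float]], order: int = 10) -> List[Tuple[float, float]]:
    """Sort (lon, lat) points by the Hilbert index of the grid cell they fall in."""
    def key(point: Tuple[float, float]) -> int:
        lon, lat = point
        return hilbert_index(order, to_cell(lon, -180.0, 180.0, order), to_cell(lat, -90.0, 90.0, order))
    return sorted(points, key=key)
```

Sorting rows this way before writing them means that a query over a small region touches a small, mostly contiguous range of rows, which pairs well with file-level or row-group-level filtering.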

Read data from PostGIS

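from pyspark.sql import functions as f

(
    sedona.read
    .format("jdbc")
    # A JDBC "url" option (plus credentials) is also required; it is omitted here.
    .option("query", "SELECT id, ST_AsBinary(geom) as geom FROM my_table")
    .load()
    .withColumn("geom", f.expr("ST_GeomFromWKB(geom)"))
)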
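
The query above ships geometries out of PostGIS as well-known binary (WKB) and rebuilds them with ST_GeomFromWKB. For readers unfamiliar with the format, here is a small standard-library sketch of how a 2D point is laid out in WKB: a byte-order flag, a geometry type code, then the two coordinates as doubles. The helper names are hypothetical and only the plain 2D POINT case is handled. In practice you should rely on the PostGIS and Sedona functions rather than hand-rolled parsing.

```python
import struct
from typing import Tuple

WKB_POINT = 1  # geometry type code for a 2D point


def point_to_wkb(x: float, y: float) -> bytes:
    """Encode a 2D point as little-endian WKB (21 bytes)."""
    # B = byte-order flag (1 means little-endian), I = uint32 type code, dd = x and y.
    return struct.pack("<BIdd", 1, WKB_POINT, x, y)


def wkb_to_point(data: bytes) -> Tuple[float, float]:
    """Decode a 2D point from WKB, honoring the byte-order flag."""
    endian = "<" if data[0] == 1 else ">"
    geom_type, x, y = struct.unpack(endian + "Idd", data[1:21])
    if geom_type != WKB_POINT:
        raise ValueError(f"expected a 2D point (type 1), got type {geom_type}")
    return x, y


encoded = point_to_wkb(1.0, 2.0)
print(encoded.hex())
print(wkb_to_point(encoded))
```

Moving geometries as binary avoids the cost and precision loss of formatting coordinates as text and parsing them again on the other side.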

Read data from CSV on AWS S3

sedona.read.format("csv").load("s3a://data.csv")

Read data from a Havasu table

sedona.table("wherobots.test_db.test_table").filter("ST_Contains(city, location) = true")
sedona.sql("SELECT * FROM wherobots.test_db.test_table WHERE ST_Contains(city, location) = true")
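
To show what a predicate like ST_Contains(city, location) decides for each row, here is a minimal ray-casting sketch in plain Python. It assumes `city` is a simple polygon without holes and `location` is a point. The function name and the sample coordinates are hypothetical, and points lying exactly on the boundary are not treated with the precise semantics that ST_Contains defines.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def polygon_contains_point(ring: List[Point], point: Point) -> bool:
    """Ray casting: count how many polygon edges a ray going right from the point crosses."""
    x, y = point
    inside = False
    j = len(ring) - 1
    for i in range(len(ring)):
        xi, yi = ring[i]
        xj, yj = ring[j]
        # Only edges that straddle the horizontal line through the point can be crossed.
        if (yi > y) != (yj > y):
            x_cross = xi + (y - yi) * (xj - xi) / (yj - yi)
            if x < x_cross:
                inside = not inside
        j = i
    return inside


# Hypothetical data: a square "city" and three candidate locations.
city = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
locations = [(2.0, 2.0), (5.0, 2.0), (1.0, 3.5)]
print([p for p in locations if polygon_contains_point(city, p)])
```

A spatial engine runs the equivalent test for every candidate pair, which is why indexing and pruning matter so much at scale.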

Spatial join query to find zones prone to wild fires

fire_zone = sedona.sql(
    """
    SELECT
        z.geometry as zipgeom,
        z.ZCTA5CE10 as zipcode,
        f.FIRE_NAME
    FROM
        wherobots_open_data.us_census.zipcode z,
        wherobots_open_data.weather.wild_fires f
    WHERE
        ST_Intersects(z.geometry, f.geometry)
    """
)

Visualize geometry data in a notebook

from sedona.maps.SedonaKepler import SedonaKepler

SedonaKepler.create_map(geometryDf)

Visualize raster data in a notebook

from sedona.raster_utils.SedonaUtils import SedonaUtils

SedonaUtils.display_image(rasterDF.selectExpr("RS_AsImage(raster)"))

Performance Benchmark

To better showcase Wherobots’s performance, we conducted a comprehensive performance benchmark on some commonly seen spatial data processing tasks. Please download the full report of our Wherobots Performance Benchmark.
