Recently we introduced WherobotsAI Raster Inference, which unlocks analytics on satellite and aerial imagery using SQL or Python, powered by open-source machine learning models. Below we’ll dig into the popular computer vision tasks that Raster Inference supports, describe how it works, and show how you can use it to run batch inference to find and map electricity infrastructure.
Watch the live demo of these capabilities here.
Petabytes of satellite imagery are generated each day all over the world, across a dizzying variety of sensor types and image resolutions. The applications for satellite imagery and other remote sensing data sources are broad and diverse. For example, satellites with consistent, continuous orbits are ideal for monitoring forest carbon stocks to validate carbon credits, or for estimating agricultural yields.
However, this data has been inaccessible to most analysts and even seasoned ML practitioners, because extracting insight from it required specialized skills. We’ve done the work to make insight extraction simple and accessible to more people. Raster Inference abstracts that complexity and scales to support planetary-scale imagery datasets, so you don’t need ML expertise to derive insights. In this blog, we explore the key features that make Raster Inference effective for land cover classification, solar farm mapping, and marine infrastructure detection. And, in the near future, you will be able to use Raster Inference with your own models!
Raster Inference supports the three most common kinds of computer vision models applied to imagery: classification, object detection, and semantic segmentation. Instance segmentation, which combines object localization and semantic segmentation, is another common type of model that is not currently supported; if you need it, contact us and we can add it to the roadmap.
The figure above illustrates these tasks. Image classification is when an image is assigned one or more text labels. In image (a), the scene is assigned the labels “person”, “sheep”, and “dog”. Image (b) is an example of object localization (or object detection). Object localization creates bounding boxes around objects of interest and assigns labels. In this image, five sheep are localized separately along with one human and one dog. Finally, semantic segmentation is when each pixel is given a category label, as shown in image (c). Here we can see all the pixels belonging to sheep are labeled blue, the dog is labeled red, and the person is labeled teal.
While these examples show everyday photographs, the same computer vision models can be applied to raster-formatted imagery, the most common format for satellite and aerial data. When objects of interest in raster imagery are localized, their bounding boxes can be georeferenced, meaning each pixel is mapped to spatial coordinates such as latitude and longitude. Georeferencing is what makes object localization useful for spatial analytics.
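To make georeferencing concrete, here is a minimal sketch using rasterio, a common open-source library for reading georeferenced rasters. The file name and pixel box are hypothetical, and this illustrates the general idea rather than how Raster Inference works internally.

import rasterio

# Open a georeferenced raster (hypothetical file name).
with rasterio.open("scene.tif") as src:
    transform = src.transform  # affine map from (col, row) pixels to (x, y) coordinates

    # A detector's bounding box in pixel coordinates (hypothetical values).
    col_min, row_min, col_max, row_max = 120, 340, 180, 410

    # Map pixel corners to spatial coordinates; rows grow downward, so the
    # top-left pixel corner yields the maximum y value.
    x_min, y_max = transform * (col_min, row_min)
    x_max, y_min = transform * (col_max, row_max)

    print(src.crs, (x_min, y_min, x_max, y_max))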
The example above shows various applications of object detection for localizing and classifying features in high resolution satellite and aerial imagery. This example comes from DOTA, a 15-class dataset of different objects in RGB and grayscale satellite imagery. Public datasets like DOTA are used to develop and benchmark machine learning models.
There are many publicly available object detection models, and just as many semantic segmentation models.
No two machine learning models perform alike, and each comes with its own tradeoffs. You can see the difference between the ground truth image (human-annotated buildings representing the real world) and segmentation results from two models (Scale-MAE and Vanilla MAE). These results are derived from the same image at two different resolutions (referred to as GSD, or Ground Sample Distance, the ground distance covered by a single pixel).
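For intuition on why GSD matters, here is a quick back-of-the-envelope illustration with hypothetical numbers: the same object spans far fewer pixels at coarser resolution, which is why model quality can diverge across GSDs.

# GSD is meters of ground per pixel: a 20 m-wide building at 0.5 m GSD
# spans 40 pixels, but only 10 pixels at 2.0 m GSD (hypothetical values).
building_width_m = 20.0
for gsd_m_per_px in (0.5, 2.0):
    pixels = building_width_m / gsd_m_per_px
    print(f"GSD {gsd_m_per_px} m/px -> {pixels:.0f} px across")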
Without Raster Inference, a team looking to extract insights from overhead imagery with ML would typically need to build and operate a complex pipeline of specialized tooling on their own.
Raster Inference and its SQL and Python APIs abstract this complexity so you and your team can easily perform inference on massive raster datasets.
Raster Inference offers APIs in both SQL and Python to run inference tasks. These APIs are designed to be easy to use, even if you’re not a machine learning expert. RS_CLASSIFY can be used for scene classification, RS_BBOXES_DETECT for object detection, and RS_SEGMENT for semantic segmentation. Each function produces tabular results that can be georeferenced at the scene, object, or segment level, depending on the function. The records can be joined or visualized with other data (geospatial or traditional) to curate enriched datasets and insights. Here are SQL and Python examples for RS_SEGMENT.
SELECT outdb_raster,
       RS_SEGMENT('{model_id}', outdb_raster) AS segment_result
FROM df_raster_input
from pyspark.sql.functions import col

# rs_segment is the Python counterpart of the RS_SEGMENT SQL function.
df = df_raster_input.withColumn("segment_result", rs_segment(model_id, col("outdb_raster")))
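The other two functions follow the same call pattern. As a sketch, assuming classification and detection model IDs available in your environment (the '{model_id}' placeholders and the df_raster_input view name are illustrative, not fixed names):

# Register the raster DataFrame so SQL can reference it.
df_raster_input.createOrReplaceTempView("df_raster_input")

# Scene classification: one or more labels per raster scene.
df_classified = sedona.sql("""
    SELECT outdb_raster,
           RS_CLASSIFY('{model_id}', outdb_raster) AS classify_result
    FROM df_raster_input
""")

# Object detection: georeferenceable bounding boxes per scene.
df_detected = sedona.sql("""
    SELECT outdb_raster,
           RS_BBOXES_DETECT('{model_id}', outdb_raster) AS detect_result
    FROM df_raster_input
""")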
Imagine you want to optimize the locations of new EV charging stations by targeting sites based on the availability of green energy sources, such as local solar farms. You can use Raster Inference to detect and locate solar farms, then cross-reference those locations with internal data or other vector geometries that capture demand for EV charging. This use case will be demonstrated in our upcoming release webinar on July 10th.
Let’s walk through how to use Raster Inference for this use case.
First, we run predictions on rasters to find solar farms. The following code block, which calls RS_SEGMENT, shows how easy this is.
CREATE OR REPLACE TEMP VIEW segment_fields AS (
    SELECT
        outdb_raster,
        RS_SEGMENT('{model_id}', outdb_raster) AS segment_result
    FROM
        az_high_demand_with_scene
)
The confidence_array column produced by RS_SEGMENT can be assigned the same geospatial coordinates as the raster input and converted to vector geometries, ready for spatial joins and further processing with WherobotsDB, using RS_SEGMENT_TO_GEOMS. We select a confidence threshold of 0.65 so that we only georeference high-confidence detections.
WITH t AS (
    SELECT RS_SEGMENT_TO_GEOMS(outdb_raster, confidence_array, array(1), class_map, 0.65) AS result
    FROM predictions_df
)
SELECT result.* FROM t
+----------+--------------------+--------------------+
|     class|avg_confidence_score|            geometry|
+----------+--------------------+--------------------+
|Solar Farm|  0.7205783606825462|MULTIPOLYGON (((-...|
|Solar Farm|  0.7273308333550763|MULTIPOLYGON (((-...|
|Solar Farm|  0.7301468510823231|MULTIPOLYGON (((-...|
|Solar Farm|  0.7180177244988899|MULTIPOLYGON (((-...|
|Solar Farm|   0.728077805771141|MULTIPOLYGON (((-...|
|Solar Farm|     0.7264981572898|MULTIPOLYGON (((-...|
|Solar Farm|  0.7044100126912517|MULTIPOLYGON (((-...|
|Solar Farm|  0.7137283466756343|MULTIPOLYGON (((-...|
+----------+--------------------+--------------------+
This allows us to integrate the vectorized model predictions with other spatial datasets and easily visualize the results with SedonaKepler.
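As a minimal sketch, assuming a recent Apache Sedona release and the predictions_polys DataFrame produced above, an interactive map is a one-liner with SedonaKepler:

from sedona.spark import SedonaKepler

# Build an interactive Kepler.gl map from the vectorized predictions.
map_viz = SedonaKepler.create_map(df=predictions_polys, name="Solar Farm Detections")
map_viz  # renders inline in a Jupyter notebook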
Here Raster Inference runs on an 85 GiB dataset of 2,200 raster scenes covering Arizona. Using a Sedona (tiny) runtime, Raster Inference completed in 430 seconds, roughly 0.2 GiB per second, predicting solar farms across all low-cloud-cover satellite images of Arizona for the month of October. Scaling up to a San Francisco (small) runtime nearly doubles the inference speed. In general, the average bytes processed per second by Wherobots increases as datasets grow, because startup costs are amortized over longer runs; processing speed also increases as runtimes scale up.
We use the output of Raster Inference to derive insights about which zip codes have the most solar farms, as shown below. This statement joins predicted solar farms with zip codes by location, then ranks zip codes by the pre-computed solar farm area within each zip code. We skip the area computation step here for brevity, but you can see it and others in the example notebook.
az_solar_zip_codes = sedona.sql("""
    SELECT
        solar_area,
        any_value(az_zta5.geometry) AS geometry,
        ZCTA5CE10
    FROM predictions_polys
    JOIN az_zta5
    WHERE ST_Intersects(az_zta5.geometry, predictions_polys.geometry)
    GROUP BY ZCTA5CE10
    ORDER BY solar_area DESC
""")
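To peek at the result, a standard Spark action works (the exact values depend on your data):

# Show the five zip codes with the largest detected solar farm area.
az_solar_zip_codes.show(5)

One design note: because the join condition is the spatial predicate ST_Intersects, Sedona plans this as a spatial join rather than a full cross product, which is what keeps the query tractable at scale.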
These predictions are made possible by SATLAS, a family of machine learning models released under the Apache 2.0 license by Allen AI. The solar model demonstrated above was derived from the SATLAS foundation model, which can be used as a building block for models that address specific detection challenges like solar farm detection. There are also many other open-source machine learning models available for deriving insights from satellite imagery, many of them provided by the TorchGeo project. We are just beginning to explore what these models can achieve for planetary-scale monitoring.
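As an illustration of what the open-source ecosystem offers, here is a hedged sketch of loading a pretrained remote-sensing backbone from TorchGeo. The weights enum follows TorchGeo 0.4+ conventions and may differ in your version, and this is independent of how Raster Inference loads models.

import torch
from torchgeo.models import resnet50, ResNet50_Weights

# ResNet-50 pretrained on Sentinel-2 imagery with MoCo self-supervision.
model = resnet50(weights=ResNet50_Weights.SENTINEL2_ALL_MOCO)
model.eval()

# These weights expect 13 spectral bands; run one fake 224x224 chip through.
x = torch.randn(1, 13, 224, 224)
with torch.no_grad():
    features = model.forward_features(x)  # spatial feature maps from the backbone
print(features.shape)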
If you have a specific model you would like to see made available, please contact us to let us know.
For detailed instructions on using Raster Inference, please refer to our example Jupyter notebooks in the documentation.
Here is a link to get you started:
https://docs.wherobots.com/latest/tutorials/wherobotsai/wherobots-inference/segmentation/
Getting started with WherobotsAI Raster Inference is easy. We’ve provided three models in Wherobots Cloud that can be used with our GPU optimized runtimes. Sign up for Wherobots Pro via the AWS Marketplace and get up to $400 in free credits. You can test out the Pro features, including Raster Inference.
Stay tuned for updates on improvements to Raster Inference that will make it possible to run more models, including your own custom models. We’re excited to hear what models you’d like us to support, or the integrations you need to make running your own models even easier with Raster Inference. We can’t wait for your feedback and to see what you’ll create!