
How Agricultural Fields Change in AlphaEarth Foundations

AlphaEarth Foundations is a geospatial AI model from Google DeepMind. It compresses a year of satellite observations into a 64-dimensional embedding vector for every 10-meter pixel on Earth’s land surfaces and shallow coastal waters, refreshed annually. In this notebook, we use AEF to examine how agricultural fields evolve across several regions and ask a practical question: what kinds of field-level change become visible when we inspect the embeddings directly rather than relying only on imagery?
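To make the data model concrete: each pixel-year is a 64-dimensional vector, and the AEF release describes these embeddings as unit-length, so dot products behave like cosine similarity. The sketch below uses random synthetic vectors (not real AEF data) to illustrate that structure; the unit-norm property is an assumption we carry through the rest of the notebook.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for AEF embeddings: one 64-D vector per pixel-year,
# normalized to unit length (AEF embeddings are described as unit-norm).
def random_embedding(dim: int = 64) -> np.ndarray:
    v = rng.normal(size=dim).astype(np.float32)
    return v / np.linalg.norm(v)

pixel_2017 = random_embedding()
pixel_2018 = random_embedding()

# For unit vectors, the dot product is the cosine similarity, and
# 1 - dot is the cosine distance we will meet later in this notebook.
similarity = float(pixel_2017 @ pixel_2018)
distance = 1.0 - similarity
```

Two identical embeddings would give a distance of 0; orthogonal embeddings give 1, and the maximum possible distance is 2.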

We begin by defining a few agricultural areas of interest and building Zarr mosaics over each one from COGs available in Source Cooperative. From there, we compare several views of temporal change: the first three embedding bands rendered as RGB, a Principal Component Analysis (PCA) projection that makes broader structure easier to interpret, and an experimental distance-over-time workflow that highlights where embedding values remain stable or shift over time.

The goal is not to claim a single definitive interpretation of AEF. Instead, this notebook shows a practical workflow for exploring seasonal cycles, repeated interventions, and other field-scale dynamics in embedding space.

RasterFlow Tools Used

RasterFlow is Wherobots’ serverless inference engine for Earth Observation data. It builds inference-ready mosaics from satellite imagery and runs distributed workflows at scale. This notebook uses three RasterFlow capabilities:

  • build_gti_mosaic for constructing Zarr mosaics from AlphaEarth Foundations data
  • an experimental PCA workflow for dimensionality reduction
  • and an experimental distance-over-time workflow for measuring embedding change between timesteps.

Selecting Agricultural AOIs Across Iowa, California, and Japan

We start with three small AOIs that contain agricultural fields in different regions. This gives us a compact but varied set of examples for exploring how AlphaEarth Foundations embeddings change over time.

import geopandas as gpd
from shapely.geometry import box
from folium import Map
iowa1 = [-94.732361, 41.911475, -94.237289, 42.112996]
california1 = [-120.11095, 36.63082, -120.018082, 36.683839]
japan1 = [141.018019, 39.226668, 141.067801, 39.249803]
aois = [iowa1, california1, japan1]

gdf = gpd.GeoDataFrame({"geometry": [box(*aoi) for aoi in aois]}, crs="EPSG:4326")

# start on the first AOI
m = Map(location=gdf.iloc[0].geometry.centroid.coords[0][::-1], zoom_start=8)
gdf.explore(height="120%", width="120%", m=m)
# store this out to parameterize which Zarr mosaics we want to build for AEF
gdf.to_parquet("s3://union-sandbox-unionai-wherobots/aois/field_change_aef.parquet")

AlphaEarth Foundations Zarr Mosaics by AOI

The next step is to create an AlphaEarth Foundations embeddings Zarr mosaic for each AOI. This gives us a time-indexed raster store that we can query efficiently for visualization and downstream analysis. The output here is from the build_gti_mosaic RasterFlow workflow.

import geopandas as gpd
mosaic_index = gpd.read_parquet(
    "s3://union-sandbox-unionai-wherobots/ha/wherobots/wherobots-mosaics/development/rnmkrr4sbxsdf24t288p/d8cq2y2k3b1r4p9gr0k5n1ldh/1/7q/fdjxpi5q/61a6e8f637cce7ddf177ff7a6637aceb/"
)
mosaic_index
geometry location
0 POLYGON ((-94.23729 41.91148, -94.23729 42.113… s3://sandbox-wherobots-mosaics-tmp/mosaics/rnm…
1 POLYGON ((-120.01808 36.63082, -120.01808 36.6… s3://sandbox-wherobots-mosaics-tmp/mosaics/rnm…
2 POLYGON ((141.0678 39.22667, 141.0678 39.2498,… s3://sandbox-wherobots-mosaics-tmp/mosaics/rnm…

The result is one Zarr mosaic per AOI, which we can load independently for inspection and comparison.

Inspect an Example AEF Zarr Mosaic

import xarray as xr
ds = xr.open_zarr(mosaic_index.iloc[0]["location"])
ds
<xarray.Dataset> Size: 38GB
Dimensions:      (time: 9, band: 64, y: 3020, x: 5512)
Coordinates:
  * time         (time) datetime64[ns] 72B 2017-01-01 2018-01-01 ... 2025-01-01
  * band         (band) object 512B 'band_0' 'band_1' ... 'band_62' 'band_63'
  * y            (y) float64 24kB 5.178e+06 5.178e+06 ... 5.148e+06 5.148e+06
  * x            (x) float64 44kB -1.055e+07 -1.055e+07 ... -1.049e+07
    spatial_ref  int64 8B ...
Data variables:
    variables    (time, band, y, x) float32 38GB dask.array<chunksize=(1, 1, 1024, 1024), meta=np.ndarray>
# pull this into memory once to reuse
subset = ds["variables"][:, :3, :, :].compute()
import imageio.v3 as iio
import numpy as np


def to_uint8_robust(array: xr.DataArray, low: xr.DataArray, high: xr.DataArray) -> np.ndarray:
    arr = array.transpose("y", "x", ...).astype("float32")
    scale = np.where((high - low) == 0, 1.0, (high - low))
    arr = (arr - low) / scale
    arr = np.clip(arr, 0, 1)
    return (arr * 255).astype(np.uint8)


def robust_params(da: xr.DataArray) -> tuple[xr.DataArray, xr.DataArray]:
    q_low, q_high = 1, 99
    lo = da.quantile(q_low / 100, dim=("time", "y", "x"))
    hi = da.quantile(q_high / 100, dim=("time", "y", "x"))
    return lo, hi
low, high = robust_params(subset)
frames = [to_uint8_robust(subset.sel(time=t), low, high) for t in subset.time]
iio.imwrite("assets/rgb_animation.gif", frames, duration=1000, loop=0)

To establish a baseline view, we render the first three embedding bands as an RGB animation over time. The color mapping is not directly semantic, but it provides a quick way to spot recurring seasonal patterns and abrupt changes.

Animation of the first three AlphaEarth Foundations embedding bands over time

Using PCA to Identify Seasonal Patterns in Embeddings

A more interpretable view comes from projecting the 64-dimensional embeddings into their first three principal components. PCA concentrates the dominant variance into three channels, which often makes field-level change easier to see.

# training sample for PCA using a spatial slice across all timesteps
sample = ds["variables"][:, :, :1024, :1024].compute()
sample = sample.transpose("band","time","y","x").data.reshape(64, -1)
sample = sample[:, ~np.isnan(sample).any(axis=0)]
from sklearn.decomposition import PCA

model = PCA(n_components=3)
model.fit(sample.T)
PCA(n_components=3)

To keep the notebook lightweight, we pull all time steps and bands for a small spatial subset rather than the full mosaic. AlphaEarth Foundations embeddings contain 64 bands, so this smaller window reduces data transfer while still preserving the temporal behavior we want to inspect. The same workflow can be scaled up for larger analyses.

# can take 30+ seconds
subset = ds["variables"][:, :, :1024, :2048].compute()
# Transform with PCA
flattened = subset.transpose("band", "time", "y", "x").data.reshape(64, -1).T
transformed = model.transform(flattened)
img = transformed.T.reshape(3, subset.shape[0], subset.shape[2], subset.shape[3]).transpose(
    1, 0, 2, 3
)

# Use the existing xarray data structure to build a nice home for PCA data.
subset_copy = subset.copy()
array = subset_copy.isel(band=slice(0, 3))

# insert the transformed PCA data
array.data = img
array["band"] = ["pca1", "pca2", "pca3"]

# wrap the array back into a dataset for a cleaner summary view
array.to_dataset(name='variables')
<xarray.Dataset> Size: 227MB
Dimensions:      (time: 9, x: 2048, y: 1024, band: 3)
Coordinates:
  * time         (time) datetime64[ns] 72B 2017-01-01 2018-01-01 ... 2025-01-01
  * x            (x) float64 16kB -1.055e+07 -1.055e+07 ... -1.053e+07
  * y            (y) float64 8kB 5.178e+06 5.178e+06 ... 5.168e+06 5.168e+06
  * band         (band) <U4 48B 'pca1' 'pca2' 'pca3'
    spatial_ref  int64 8B 0
Data variables:
    variables    (time, band, y, x) float32 226MB 105.8 106.5 ... -27.13 -4.077
low, high = robust_params(array)
frames = [to_uint8_robust(array.sel(time=t), low, high) for t in array.time]
iio.imwrite("assets/rgb_animation_pca.gif", frames, duration=1000, loop=0)

The PCA animation makes temporal structure easier to see than the raw RGB view. Field boundaries and repeated change patterns stand out more clearly, while nearby urban areas show a different signature of structural change.

Animation of the first three PCA components of AlphaEarth Foundations embeddings over time

Compared with the first three raw bands, the PCA view surfaces a more coherent pattern of cyclical change. In agricultural areas, the alternating shifts between green and brown tones likely correspond to crop rotations, harvest cycles, irrigation, or other recurring management practices.

PCA is still a simplification, and it comes with tradeoffs:

  • PCA is a linear transformation, so it may miss more complex relationships in the embedding space.
  • PCA can be sensitive to outliers and to the sample used to fit the model.
  • PCA adds computational overhead because fitting and applying the transform requires additional matrix operations.
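The outlier sensitivity in particular is easy to demonstrate. Below is a small synthetic sketch (not using AEF data): fitting PCA with and without a single extreme sample can rotate the leading component away from the direction it would otherwise find.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(42)

# Clean sample: 500 points in 64-D, with variance decreasing across axes.
clean = rng.normal(size=(500, 64)) * np.linspace(3.0, 0.1, 64)

# The same sample plus one extreme outlier along the last (low-variance) axis.
outlier = np.zeros((1, 64))
outlier[0, -1] = 500.0
contaminated = np.vstack([clean, outlier])

pc_clean = PCA(n_components=1).fit(clean).components_[0]
pc_dirty = PCA(n_components=1).fit(contaminated).components_[0]

# Cosine similarity between the two leading components: a value well
# below 1 means the single outlier rotated the direction of maximum
# variance toward the axis the outlier lives on.
cos = abs(float(pc_clean @ pc_dirty))
```

The same caution applies to the spatial slice we used as a training sample above: if it happens to contain unusual pixels, the fitted components will tilt toward them.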

Another option is to stay in the original embedding space and measure distance between embeddings over time. That gives us a direct way to ask how similar or dissimilar each pixel is from one timestep to the next.

Measure Change Directly in Embedding Space

Here we use an experimental RasterFlow workflow to compute distance over time in the original embedding space.

This workflow is not yet publicly available, but it is useful for exploring whether similarity metrics can reveal patterns that PCA may smooth over. If you are interested in this approach, please reach out to the Wherobots team (support@wherobots.com).
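Since the workflow itself is not public, here is a rough stand-in for what we assume it computes, based on the output it produces (a single cosine_distance band with one fewer timestep than the input): the cosine distance between each pixel's embedding at consecutive timesteps. This sketch runs on a small synthetic array; the real implementation likely differs in details such as masking and chunked execution.

```python
import numpy as np
import xarray as xr

rng = np.random.default_rng(0)

# Synthetic stand-in for an AEF mosaic: (time, band, y, x) with 64 bands.
data = rng.normal(size=(3, 64, 4, 5)).astype(np.float32)
da = xr.DataArray(
    data,
    dims=("time", "band", "y", "x"),
    coords={"time": [2017, 2018, 2019]},
)

def cosine_distance_over_time(da: xr.DataArray) -> xr.DataArray:
    """Per-pixel cosine distance between consecutive timesteps (a sketch)."""
    # Drop the time coordinate so the shifted pair aligns positionally.
    a = da.isel(time=slice(None, -1)).drop_vars("time")
    b = da.isel(time=slice(1, None)).drop_vars("time")
    dot = (a * b).sum("band")
    norm = np.sqrt((a**2).sum("band")) * np.sqrt((b**2).sum("band"))
    dist = 1.0 - dot / norm
    # Label each distance with the later year of the pair, matching the
    # workflow output above (2018 onward for a 2017-2025 input).
    return dist.assign_coords(time=da["time"].values[1:])

dist = cosine_distance_over_time(da)
print(dist.shape)  # (2, 4, 5)
```

Identical embeddings at consecutive timesteps produce a distance of 0; the more a pixel's embedding moves on the unit sphere between years, the closer the distance gets to 2.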

index = gpd.read_parquet(
    "s3://union-sandbox-unionai-wherobots/dw/wherobots/wherobots-rasterflow/development/rfzhkxdjv782gtq9xtzz/a2dwp687c37a3kc1z6593pdax/1/ae/famfddly/0fbb239f05eb805bf4128b8be91316ce/"
)

# we look at an example mosaic for now
ex_store = index.iloc[0]["location"]
ds = xr.open_zarr(ex_store).compute()
ds
<xarray.Dataset> Size: 533MB
Dimensions:      (time: 8, band: 1, y: 3020, x: 5512)
Coordinates:
  * time         (time) datetime64[ns] 64B 2018-01-01 2019-01-01 ... 2025-01-01
  * band         (band) object 8B 'cosine_distance'
  * y            (y) float64 24kB 5.178e+06 5.178e+06 ... 5.148e+06 5.148e+06
  * x            (x) float64 44kB -1.055e+07 -1.055e+07 ... -1.049e+07
    spatial_ref  int64 8B 0
Data variables:
    variables    (time, band, y, x) float32 533MB 0.188 0.1655 ... 0.02265
array = ds["variables"][:, :, :, :]
low = np.nanpercentile(array[0].values, 0)
high = np.nanpercentile(array[0].values, 50)
frames = [to_uint8_robust(array.sel(time=t)[0], low, high) for t in array.time]
iio.imwrite("assets/distance_over_time_50p.gif", frames, duration=1000, loop=0)

Animation showing distance over time in AlphaEarth Foundations embedding space

In the animation above, darker pixels indicate embeddings that remain more similar over time, while lighter pixels indicate larger temporal differences.

Interpretation is still evolving, but the visualization suggests several useful patterns.

Persistent bright pixels may mark locations with repeated or abrupt change. Alternating light-dark cycles may indicate recurring agricultural activity such as planting, harvest, or irrigation.
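One way to make these qualitative observations queryable is to summarize each pixel's distance time series, for example with its mean (overall change level) and standard deviation (how episodic the change is). The sketch below uses synthetic data and illustrative, uncalibrated thresholds; it is a starting point for exploration, not a validated classification rule.

```python
import numpy as np

# Synthetic per-pixel cosine-distance time series, shape (time, y, x):
# one pixel that barely changes, one that alternates strongly year to year.
stable = np.full((8, 1, 1), 0.02)                       # little change
cyclic = np.tile([[0.05], [0.45]], (4, 1))[:, :, None]  # alternating change
dist = np.concatenate([stable, cyclic], axis=2)

mean = dist.mean(axis=0)  # overall change level per pixel
std = dist.std(axis=0)    # how uneven the change is over time

# Illustrative (uncalibrated) labels: low mean looks stable; high mean
# with high std looks cyclic or episodic, as in recurring field activity.
labels = np.where(
    mean < 0.1,
    "stable",
    np.where(std > 0.1, "cyclic/episodic", "persistent change"),
)
print(labels)  # [['stable' 'cyclic/episodic']]
```

On real distance mosaics, the thresholds would need tuning per region, and the labels should be checked against imagery or ground truth.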

This does not replace domain knowledge, but it provides another lens for understanding how field-scale practices appear in AlphaEarth Foundations embeddings.

What AlphaEarth Foundations Embeddings Show About Field-Level Dynamics

AlphaEarth Foundations embeddings make it possible to examine agricultural change as a time series in embedding space rather than only as raw imagery. In this notebook, we built Zarr mosaics for several AOIs, visualized temporal variation in the first three embedding bands, projected the embeddings with PCA, and then compared that view with a distance-over-time workflow. Together, these views illustrate a few of the many ways AlphaEarth Foundations embeddings (and embedding models like them) can provide insight into field-level dynamics.