AlphaEarth Foundations is a geospatial AI model from Google DeepMind. It compresses a year of satellite observations into a 64-dimensional embedding vector for every 10-meter pixel on Earth's land surfaces and shallow coastal waters, refreshed annually. In this notebook, we use AEF to examine how agricultural fields evolve across several regions and ask a practical question: what kinds of field-level change become visible when we inspect the embeddings directly rather than relying only on imagery?
We begin by defining a few agricultural areas of interest and building Zarr mosaics over each one from COGs available in Source Cooperative. From there, we compare several views of temporal change: the first three embedding bands rendered as RGB, a Principal Component Analysis (PCA) projection that makes broader structure easier to interpret, and an experimental distance-over-time workflow that highlights where embedding values remain stable or shift over time.
The goal is not to claim a single definitive interpretation of AEF. Instead, this notebook shows a practical workflow for exploring seasonal cycles, repeated interventions, and other field-scale dynamics in embedding space.
RasterFlow is Wherobots' serverless inference engine for Earth Observation data. It builds inference-ready mosaics from satellite imagery and runs distributed workflows at scale. This notebook uses three RasterFlow capabilities, including the build_gti_mosaic workflow for building Zarr mosaics and an experimental distance-over-time workflow.
We start with three small AOIs that contain agricultural fields in different regions. This gives us a compact but varied set of examples for exploring how AlphaEarth Foundations embeddings change over time.
import geopandas as gpd
from shapely.geometry import box
from folium import Map
iowa1 = [-94.732361, 41.911475, -94.237289, 42.112996]
california1 = [-120.11095, 36.63082, -120.018082, 36.683839]
japan1 = [141.018019, 39.226668, 141.067801, 39.249803]
aois = [iowa1, california1, japan1]

gdf = gpd.GeoDataFrame({"geometry": [box(*aoi) for aoi in aois]}, crs="EPSG:4326")

# start on the first AOI
m = Map(location=gdf.iloc[0].geometry.centroid.coords[0][::-1], zoom_start=8)
gdf.explore(height="120%", width="120%", m=m)
# store this out to parameterize which Zarr mosaics we want to build for AEF gdf.to_parquet("s3://union-sandbox-unionai-wherobots/aois/field_change_aef.parquet")
The next step is to create an AlphaEarth Foundations embeddings Zarr mosaic for each AOI. This gives us a time-indexed raster store that we can query efficiently for visualization and downstream analysis. The output here is from the build_gti_mosaic RasterFlow workflow.
import geopandas as gpd
mosaic_index = gpd.read_parquet(
    "s3://union-sandbox-unionai-wherobots/ha/wherobots/wherobots-mosaics/development/rnmkrr4sbxsdf24t288p/d8cq2y2k3b1r4p9gr0k5n1ldh/1/7q/fdjxpi5q/61a6e8f637cce7ddf177ff7a6637aceb/"
)
mosaic_index
The result is one Zarr mosaic per AOI, which we can load independently for inspection and comparison.
import xarray as xr
ds = xr.open_zarr(mosaic_index.iloc[0]["location"])
ds
<xarray.Dataset> Size: 38GB
Dimensions:      (time: 9, band: 64, y: 3020, x: 5512)
Coordinates:
  * time         (time) datetime64[ns] 72B 2017-01-01 2018-01-01 ... 2025-01-01
  * band         (band) object 512B 'band_0' 'band_1' ... 'band_62' 'band_63'
  * y            (y) float64 24kB 5.178e+06 5.178e+06 ... 5.148e+06 5.148e+06
  * x            (x) float64 44kB -1.055e+07 -1.055e+07 ... -1.049e+07
    spatial_ref  int64 8B ...
Data variables:
    variables    (time, band, y, x) float32 38GB dask.array<chunksize=(1, 1, 1024, 1024), meta=np.ndarray>
# pull this into memory once to reuse
subset = ds["variables"][:, :3, :, :].compute()
import imageio.v3 as iio
import numpy as np


def to_uint8_robust(array: xr.DataArray, low: xr.DataArray, high: xr.DataArray) -> np.ndarray:
    """Rescale a (band, y, x) slice into uint8 using precomputed low/high bounds."""
    arr = array.transpose("y", "x", ...).astype("float32")
    # guard against divide-by-zero when a band is constant
    scale = np.where((high - low) == 0, 1.0, (high - low))
    arr = (arr - low) / scale
    arr = np.clip(arr, 0, 1)
    return (arr * 255).astype(np.uint8)


def robust_params(da: xr.DataArray) -> tuple[xr.DataArray, xr.DataArray]:
    """Per-band 1st/99th percentile bounds over all times and pixels."""
    q_low, q_high = 1, 99
    lo = da.quantile(q_low / 100, dim=("time", "y", "x"))
    hi = da.quantile(q_high / 100, dim=("time", "y", "x"))
    return lo, hi
low, high = robust_params(subset)
frames = [to_uint8_robust(subset.sel(time=t), low, high) for t in subset.time]
iio.imwrite("assets/rgb_animation.gif", frames, duration=1000, loop=0)
To establish a baseline view, we render the first three embedding bands as an RGB animation over time. The color mapping is not directly semantic, but it provides a quick way to spot recurring seasonal patterns and abrupt changes.
A more interpretable view comes from projecting the 64-dimensional embeddings into their first three principal components. PCA concentrates the dominant variance into three channels, which often makes field-level change easier to see.
# training sample for PCA: a 1024x1024 spatial slice across all years
sample = ds["variables"][:, :, :1024, :1024].compute()
sample = sample.transpose("band", "time", "y", "x").data.reshape(64, -1)
# drop pixels with a NaN in any band
sample = sample[:, ~np.isnan(sample).any(axis=0)]
from sklearn.decomposition import PCA

model = PCA(n_components=3)
model.fit(sample.T)
PCA(n_components=3)
To keep the notebook lightweight, we pull all time steps and bands for a small spatial subset rather than the full mosaic. AlphaEarth Foundations embeddings contain 64 bands, so this smaller window reduces data transfer while still preserving the temporal behavior we want to inspect. The same workflow can be scaled up for larger analyses.
# can take 30+ seconds
subset = ds["variables"][:, :, :1024, :2048].compute()
# Transform with PCA
flattened = subset.transpose("band", "time", "y", "x").data.reshape(64, -1).T
transformed = model.transform(flattened)
img = transformed.T.reshape(3, subset.shape[0], subset.shape[2], subset.shape[3]).transpose(
    1, 0, 2, 3
)

# Reuse the existing xarray structure as a home for the PCA data.
subset_copy = subset.copy()
array = subset_copy.isel(band=slice(0, 3))
# insert the transformed PCA data
array.data = img
array["band"] = ["pca1", "pca2", "pca3"]
array.to_dataset(name="variables")
<xarray.Dataset> Size: 227MB
Dimensions:      (time: 9, x: 2048, y: 1024, band: 3)
Coordinates:
  * time         (time) datetime64[ns] 72B 2017-01-01 2018-01-01 ... 2025-01-01
  * x            (x) float64 16kB -1.055e+07 -1.055e+07 ... -1.053e+07
  * y            (y) float64 8kB 5.178e+06 5.178e+06 ... 5.168e+06 5.168e+06
  * band         (band) <U4 48B 'pca1' 'pca2' 'pca3'
    spatial_ref  int64 8B 0
Data variables:
    variables    (time, band, y, x) float32 226MB 105.8 106.5 ... -27.13 -4.077
low, high = robust_params(array)
frames = [to_uint8_robust(array.sel(time=t), low, high) for t in array.time]
iio.imwrite("assets/rgb_animation_pca.gif", frames, duration=1000, loop=0)
The PCA animation makes temporal structure easier to see than the raw RGB view. Field boundaries and repeated change patterns stand out more clearly, while nearby urban areas show a different signature of structural change.
Compared with the first three raw bands, the PCA view surfaces a more coherent pattern of cyclical change. In agricultural areas, the alternating shifts between green and brown tones likely correspond to crop rotations, harvest cycles, irrigation, or other recurring management practices.
PCA is still a simplification, and it comes with tradeoffs: the three components are chosen to maximize variance, not semantic meaning, so subtle but important changes in low-variance directions can disappear; the projection depends on the sample it was fit on, so a model trained on one region may not transfer cleanly to another; and component signs and ordering are arbitrary, so colors are not comparable across refits.
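One quick check worth doing (ours, not part of the original notebook) is to see how much of the total embedding variance three components actually capture, since a variance-based projection can discard whatever falls outside them. A minimal numpy sketch using SVD on a synthetic stand-in for the (pixels, bands) sample:

```python
import numpy as np

# synthetic stand-in for the (pixels, 64) embedding sample used to fit PCA
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 64)).astype("float32")

Xc = X - X.mean(axis=0)                  # center, as PCA does internally
s = np.linalg.svd(Xc, compute_uv=False)  # singular values of the centered sample
var_ratio = s**2 / (s**2).sum()          # explained variance ratio per component

# fraction of total variance a three-channel view would retain
print(float(var_ratio[:3].sum()))
```

On real AEF samples this fraction tells you how literally to read the PCA animation: the higher it is, the less the three-channel view is hiding. With sklearn already in scope, `model.explained_variance_ratio_` reports the same quantity for the fitted model.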
Another option is to stay in the original embedding space and measure distance between embeddings over time. That gives us a direct way to ask how similar or dissimilar each pixel is from one timestep to the next.
Here we use an experimental RasterFlow workflow to compute distance over time in the original embedding space.
This workflow is not yet publicly available, but it is useful for exploring whether similarity metrics can reveal patterns that PCA may smooth over. If you are interested in this approach, please reach out to the Wherobots team (support@wherobots.com).
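The workflow itself is not public, but the core computation is easy to sketch: treat each pixel's embedding as a vector and take the cosine distance between consecutive years. A minimal numpy version on synthetic data (the function name and array shapes are our assumptions, not RasterFlow's API):

```python
import numpy as np

def cosine_distance_over_time(emb: np.ndarray) -> np.ndarray:
    """Per-pixel cosine distance between consecutive time steps.

    emb has shape (time, band, y, x); the result has shape (time - 1, y, x),
    one distance map per consecutive pair of years.
    """
    a, b = emb[:-1], emb[1:]                          # consecutive year pairs
    dot = (a * b).sum(axis=1)                         # (time-1, y, x)
    norms = np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1)
    return 1.0 - dot / np.clip(norms, 1e-12, None)    # avoid divide-by-zero

# tiny synthetic stand-in: 3 years, 4 bands, 2x2 pixels
rng = np.random.default_rng(0)
emb = rng.normal(size=(3, 4, 2, 2)).astype("float32")
dist = cosine_distance_over_time(emb)
print(dist.shape)  # (2, 2, 2)
```

Note the one-fewer-timestep layout, which matches the distance mosaic below (8 time steps from 9 years of embeddings). At production scale this is exactly the kind of embarrassingly parallel per-chunk computation RasterFlow distributes.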
index = gpd.read_parquet(
    "s3://union-sandbox-unionai-wherobots/dw/wherobots/wherobots-rasterflow/development/rfzhkxdjv782gtq9xtzz/a2dwp687c37a3kc1z6593pdax/1/ae/famfddly/0fbb239f05eb805bf4128b8be91316ce/"
)
# we look at an example mosaic for now
ex_store = index.iloc[0]["location"]
ds = xr.open_zarr(ex_store).compute()
ds
<xarray.Dataset> Size: 533MB
Dimensions:      (time: 8, band: 1, y: 3020, x: 5512)
Coordinates:
  * time         (time) datetime64[ns] 64B 2018-01-01 2019-01-01 ... 2025-01-01
  * band         (band) object 8B 'cosine_distance'
  * y            (y) float64 24kB 5.178e+06 5.178e+06 ... 5.148e+06 5.148e+06
  * x            (x) float64 44kB -1.055e+07 -1.055e+07 ... -1.049e+07
    spatial_ref  int64 8B 0
Data variables:
    variables    (time, band, y, x) float32 533MB 0.188 0.1655 ... 0.02265
array = ds["variables"][:, :, :, :]
# stretch between the 0th and 50th percentiles of the first frame
low = np.nanpercentile(array[0].values, 0)
high = np.nanpercentile(array[0].values, 50)
frames = [to_uint8_robust(array.sel(time=t)[0], low, high) for t in array.time]
iio.imwrite("assets/distance_over_time_50p.gif", frames, duration=1000, loop=0)
In the animation above, darker pixels indicate embeddings that remain more similar over time, while lighter pixels indicate larger temporal differences.
Interpretation is still evolving, but the visualization suggests several useful patterns.
Persistent bright pixels may mark locations with repeated or abrupt change. Alternating light-dark cycles may indicate recurring agricultural activity such as planting, harvest, or irrigation.
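One simple way to make those two patterns quantitative (a heuristic of ours, not part of the workflow) is to summarize each pixel's distance series by its mean and standard deviation: persistently bright pixels have a high mean, while alternating light-dark cycles show up as a high standard deviation. A sketch on three synthetic per-pixel series:

```python
import numpy as np

# synthetic distance-over-time stacks, shape (time, y, x) for a 1x1 "pixel"
stable = np.full((8, 1, 1), 0.05)                   # little change each year
abrupt = np.full((8, 1, 1), 0.60)                   # large change every year
cyclic = np.tile([0.05, 0.60], 4).reshape(8, 1, 1)  # alternating years

for name, dist in [("stable", stable), ("abrupt", abrupt), ("cyclic", cyclic)]:
    # per-pixel summaries; here each stack has a single pixel
    print(name, float(dist.mean(axis=0)[0, 0]), float(dist.std(axis=0)[0, 0]))
```

Mapping the mean and standard deviation as two raster layers would separate permanently changed areas from fields under recurring management, though real series are noisier than these toy values.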
This does not replace domain knowledge, but it provides another lens for understanding how field-scale practices appear in AlphaEarth Foundations embeddings.
AlphaEarth Foundations embeddings make it possible to examine agricultural change as a time series in embedding space rather than only as raw imagery. In this notebook, we built Zarr mosaics for several AOIs, visualized temporal variation in the first three embedding bands, projected the embeddings with PCA, and then compared that view with a distance-over-time workflow. Together, these views illustrate a few of the many ways AlphaEarth Foundations embeddings (and similar representations) can reveal field-level dynamics.