An important requirement for data infrastructure tools like WherobotsDB and Wherobots Cloud is that they integrate well with the technology ecosystems around them. In the world of spatial databases this includes geospatial visualization tooling. Being able to create maps with data from WherobotsDB is an important use case for Wherobots Cloud, so in this blog post I wanted to explore how to create collaborative web maps in Felt, providing the data via the Felt API.
For this map I wanted to integrate the Felt API with Wherobots Cloud so I could do some geospatial analysis using Spatial SQL and WherobotsDB, then publish the results of my analysis to Felt’s beautiful web-based mapping tooling.
I decided to use data from Bird Buddy, which publishes data about bird sightings at its smart bird feeders, to find the range of some of my favorite bird species.
You can follow along by creating a free account on Wherobots Cloud.
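To follow along in a notebook, the code snippets below assume a few Python imports and a configured Apache Sedona session. Here is a minimal setup sketch, assuming the Apache Sedona Python package is available; the Wherobots notebook environment typically provides a ready-to-use context, so treat this as illustrative rather than required:

# Setup sketch (see assumptions above): imports used throughout this post, plus a Sedona context.
import requests
import geopandas

from sedona.spark import SedonaContext
from sedona.maps.SedonaKepler import SedonaKepler

# Create (or reuse) a Spark session with Sedona's spatial functions registered
config = SedonaContext.builder().getOrCreate()
sedona = SedonaContext.create(config)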
Felt is a web-based tool for creating collaborative maps. Felt brings real-time collaborative editing, similar to what you’ve seen in Google Docs or Notion, to the world of mapmaking. You can annotate, comment, and draw on the map, then share the result with anyone on the web via a single link and start collaborating.
Felt has also invested in supporting a wide range of data formats for adding data to maps with its Upload Anything tool, a simple drag-and-drop interface that supports formats including Shapefile, GeoJSON, GeoTIFF, CSV, JPEG, and more. Felt also built a QGIS plugin, so if you’re used to working with desktop GIS tooling you can easily export your project and layers to Felt’s web-based tooling from QGIS.
Felt also enables programmatically creating maps and layers and adding data to them via the Felt API. We’ll use the Felt API to create and publish a map with the results of our Sedona analysis.
Our data comes from Bird Buddy, which makes a smart bird feeder that can identify bird species and (optionally) report their location.
Bird Buddy publishes its data as CSV files, so we’ll download the latest data and then upload the file to our Wherobots Cloud instance via the “Files” tab. The free tier of Wherobots Cloud includes free data storage in AWS S3, which we can access within the Wherobots notebook environment using the S3 URL of the file.
Once you’ve uploaded a file, you can click the copy file icon to copy the file’s S3 path so you can access the file in the Wherobots notebook environment. Note that these files are private to your Wherobots organization, so the S3 URL below won’t be accessible to anyone outside my organization.
S3_URL = "s3://<YOUR_S3_URL_HERE>/birdbuddy/"
Now we’ll load the BirdBuddy CSV data and convert it to a Spatial DataFrame so we can use Spatial SQL to find the range of each species.
bb_df = sedona.read.format('csv').option('header','true').option('delimiter', ',').load(S3_URL)
bb_df.show(5)
Looking at the first few rows of the DataFrame we can see we have latitude and longitude stored as separate fields, as well as information about the bird species.
+-------------------+--------------------+----------+-----------------+----------------+
|anonymized_latitude|anonymized_longitude| timestamp|      common_name| scientific_name|
+-------------------+--------------------+----------+-----------------+----------------+
|          45.441235|          -122.51253|2023-09...|  dark eyed junco|      junco h...|
|           41.75291|            -83.6242|2023-09...|northern cardinal|cardinalis ca...|
|            43.8762|            -78.9261|2023-09...|northern cardinal|cardinalis ca...|
|            33.7657|            -84.2951|2023-09...|northern cardinal|cardinalis ca...|
|            30.4805|            -84.2243|2023-09...|northern cardinal|cardinalis ca...|
+-------------------+--------------------+----------+-----------------+----------------+
only showing top 5 rows
Now we’re ready to use the power of Spatial SQL to analyze our Bird Buddy data. We want to find the range of each species, but first let’s explore the data.
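For example, a quick non-spatial way to explore the data is to count observations per species. This is just an exploratory sketch, not part of the original analysis:

# Count observations per species to see which birds show up most often (exploratory sketch)
bb_df.groupBy('common_name').count().orderBy('count', ascending=False).show(10)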
First we’ll convert our latitude and longitude fields into Point geometries using the ST_Point Spatial SQL function. Because the CSV load brings every column in as a string, we cast the coordinate columns to Decimal before passing them to ST_Point.
bb_df = bb_df.selectExpr(
    'ST_Point(CAST(anonymized_longitude AS Decimal(24,20)), CAST(anonymized_latitude AS Decimal(24,20))) AS location',
    'timestamp', 'common_name', 'scientific_name')
bb_df.createOrReplaceTempView('bb')
bb_df.show(5)
Now the location field is a proper geometry type that our Spatial DataFrame can take advantage of.
+--------------------+--------------------+-----------------+--------------------+
|            location|           timestamp|      common_name|     scientific_name|
+--------------------+--------------------+-----------------+--------------------+
|POINT (-122.51253...|2023-09-01 00:00:...|  dark eyed junco|      junco hyemalis|
|POINT (-83.6242 4...|2023-09-01 00:00:...|northern cardinal|cardinalis cardin...|
|POINT (-78.9261 4...|2023-09-01 00:00:...|northern cardinal|cardinalis cardin...|
|POINT (-84.2951 3...|2023-09-01 00:00:...|northern cardinal|cardinalis cardin...|
|POINT (-84.2243 3...|2023-09-01 00:00:...|northern cardinal|cardinalis cardin...|
+--------------------+--------------------+-----------------+--------------------+
only showing top 5 rows
We have just under 14 million bird observations in our DataFrame.
bb_df.count()

13972003
If we want to find all observations of juncos in the data, we can write a SQL query to filter the results and visualize the observations on a map using SedonaKepler, the Apache Sedona integration for Kepler.gl.
junco_df = sedona.sql("SELECT * FROM bb WHERE common_name LIKE '%junco' ")
junco_df.show(5)
We used the SQL LIKE string comparison operator to find all observations relating to juncos, then stored the results in a new DataFrame junco_df.
+--------------------+--------------------+---------------+---------------+
|            location|           timestamp|    common_name|scientific_name|
+--------------------+--------------------+---------------+---------------+
|POINT (-122.51253...|2023-09-01 00:00:...|dark eyed junco| junco hyemalis|
|POINT (-94.5916 3...|2023-09-01 00:00:...|dark eyed junco| junco hyemalis|
|POINT (-85.643 31...|2023-09-01 00:00:...|dark eyed junco| junco hyemalis|
|POINT (-87.7645 3...|2023-09-01 00:00:...|dark eyed junco| junco hyemalis|
|POINT (-122.16346...|2023-09-01 00:00:...|dark eyed junco| junco hyemalis|
+--------------------+--------------------+---------------+---------------+
only showing top 5 rows
Now we’ll visualize the contents of our new junco_df DataFrame using SedonaKepler.
SedonaKepler.create_map(df=junco_df, name='Juncos')
Based on the map above it looks like Juncos have quite a large range throughout North America.
Next, we’ll filter the overall dataset to a few of my favorite bird species, then use the power of Spatial SQL with a GROUP BY operation to create convex hulls (polygon geometries) from the individual observations (point geometries) of each species.
By creating a convex hull around all point observations grouped by species we will create a new geometry that represents the observed range of each species in our dataset.
range_df = sedona.sql("""
    SELECT common_name, COUNT(*) AS num, ST_ConvexHull(ST_Union_aggr(location)) AS geometry
    FROM bb
    WHERE common_name IN ('california towhee', 'steller’s jay', 'mountain chickadee', 'eastern bluebird')
    GROUP BY common_name
    ORDER BY num DESC
""")
range_df.show()
Note our use of the following Spatial SQL functions:
- ST_ConvexHull: computes the convex hull of a geometry
- ST_Union_aggr: unions the geometries from all rows in each group into a single geometry
+------------------+-----+--------------------+
|       common_name|  num|            geometry|
+------------------+-----+--------------------+
|  eastern bluebird|65971|POLYGON ((-80.345...|
|     steller’s jay|37864|POLYGON ((-110.26...|
| california towhee|22007|POLYGON ((-117.05...|
|mountain chickadee| 4102|POLYGON ((-110.99...|
+------------------+-----+--------------------+
Now we have a new DataFrame range_df with 4 rows, one for each of the species we specified in the query above, and the geometry field is now a polygon that represents the observed range of that species in our dataset. Pretty neat! Let’s visualize these species ranges using Felt.
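Before publishing anything to Felt, you can optionally preview these polygons in the notebook with SedonaKepler, just as we did for the junco observations. This is a quick sanity check, not part of the original workflow:

# Optional: preview the species range polygons locally before publishing to Felt
SedonaKepler.create_map(df=range_df, name='Species Ranges')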
The Felt API supports file uploads in a variety of formats, but we’ll use GeoJSON. We’ll convert our Spatial DataFrame into a GeoPandas GeoDataFrame and then export to a GeoJSON file so we can upload it to the Felt API.
range_gdf = geopandas.GeoDataFrame(range_df.toPandas(), geometry="geometry")
range_gdf.to_file('birdbuddy_range.geojson', driver='GeoJSON')
We’ve now created a GeoJSON file birdbuddy_range.geojson that looks a bit like this (we’ve omitted some lines):
{ "type": "FeatureCollection", "features": [ { "type": "Feature", "properties": { "common_name": "eastern bluebird", "num": 65971 }, "geometry": { "type": "Polygon", "coordinates": [ [ [ -80.3452, 25.6062 ], [ -98.2271, 26.2516 ], ... [ -80.3452, 25.6062 ] ] ] } }, ... ] }
If you haven’t already, create a free Felt account and then in your account settings generate a new access token so you’ll be able to create maps and upload data via the Felt API.
FELT_TOKEN = '<YOUR_TOKEN_HERE>'
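Hard-coding the token in a notebook is fine for a quick demo, but a safer pattern is to read it from an environment variable. Here is a small sketch; the FELT_TOKEN environment variable name is just an example, not something Felt requires:

import os

# Read the Felt access token from an environment variable instead of hard-coding it in the notebook
FELT_TOKEN = os.environ['FELT_TOKEN']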
To create a new map and upload our data we’ll need to make a few requests to the Felt API:
- /maps to create a new map
- /maps/{map_id}/layers to create a new layer on that map (using the map_id returned by the first request) and obtain a presigned URL for uploading our file
- /maps/{map_id}/layers/{layer_id}/finish_upload to tell Felt the upload is complete
The function below will create a new map in Felt, then create a new layer and upload our GeoJSON file to this layer. See the Felt API docs for more examples of what’s possible with the Felt API.
def create_felt_map(access_token, filename, map_title, layer_name):
    # First create a new map using the /maps endpoint
    create_map_response = requests.post(
        f"https://felt.com/api/v1/maps",
        headers={
            "authorization": f"Bearer {access_token}",
            "content-type": "application/json",
        },
        json={"title": map_title},
    )
    create_map_data = create_map_response.json()
    map_id = create_map_data['data']['id']
    map_url = create_map_data['data']['attributes']['url']
    print(create_map_data)

    # Next, we'll create a new layer and get a presigned upload url so we can upload our GeoJSON file
    layer_response = requests.post(
        f"https://felt.com/api/v1/maps/{map_id}/layers",
        headers={
            "authorization": f"Bearer {access_token}",
            "content-type": "application/json",
        },
        json={"file_names": [filename], "name": layer_name},
    )

    # This endpoint will return a pre-signed URL that we use to upload the file to Felt
    presigned_upload = layer_response.json()
    url = presigned_upload["data"]["attributes"]["url"]
    presigned_attributes = presigned_upload["data"]["attributes"]["presigned_attributes"]

    # A 204 response indicates that the upload was successful
    with open(filename, "rb") as file_obj:
        output = requests.post(
            url,
            # Order is important, file should come at the end
            files={**presigned_attributes, "file": file_obj},
        )
    layer_id = presigned_upload['data']['attributes']['layer_id']
    print(output)
    print(layer_id)
    print(presigned_upload)

    # Finally, we call the /maps/:map_id/layers/:layer_id/finish_upload endpoint to complete the process
    finish_upload = requests.post(
        f"https://felt.com/api/v1/maps/{map_id}/layers/{layer_id}/finish_upload",
        headers={
            "authorization": f"Bearer {access_token}",
            "content-type": "application/json",
        },
        json={"filename": filename, "name": layer_name},
    )
    print(finish_upload.json())
Now, to create a new Felt map, we call this function, passing our API token, the name of our GeoJSON file, and the titles we’d like to give the new map and its data layer.
create_felt_map(FELT_TOKEN, "birdbuddy_range.geojson", "North American Bird Ranges", "My Favorite Birds")
We’ve now created a new map in Felt and uploaded our GeoJSON data as a new layer. We can share the URL with anyone on the web to view or collaborate on our map!
We can also embed the map in our Jupyter notebook:
from IPython.display import HTML

HTML('<iframe width="1600" height="600" frameborder="0" title="My Favorite Bird Ranges" src="https://felt.com/embed/map/North-American-Bird-Ranges-a4c5cOCaRMiL64KK5N27TA"></iframe>')
Every year the geospatial data and mapping community comes together for the “30 Day Map Challenge,” a fun and informal challenge to create a new map and share it on social media each day for one month.
This BirdBuddy map was my 30 Day Map Challenge map for Day 3: Polygons. You can find the full Jupyter Notebook with all code on GitHub here as well as some of my other 30 Day Map Challenge maps in this repository.