    An important requirement for data infrastructure tools like WherobotsDB and Wherobots Cloud is that they integrate well with the technology ecosystems around them. In the world of spatial databases, this includes geospatial visualization tooling. Creating maps with data from WherobotsDB is an important use case for Wherobots Cloud, so in this blog post I wanted to explore how to create collaborative web maps with Felt, providing data using the Felt API.

    For this map I wanted to integrate the Felt API with Wherobots Cloud so I could do some geospatial analysis using Spatial SQL and WherobotsDB then publish the results of my analysis to Felt’s beautiful web-based mapping tooling.

    I decided to use data from BirdBuddy, which publishes data about bird sightings at its smart birdfeeders, to find the range of some of my favorite bird species.

    North American bird ranges

    You can follow along by creating a free account on Wherobots Cloud.

    Collaborative Web Maps With Felt

    Felt is a web-based tool for creating collaborative maps. Felt is bringing collaborative real-time editing functionality similar to what you’ve seen in Google Docs or Notion to the world of mapmaking. You can annotate, comment, and draw on the map, then share the results with anyone on the web you want with a single link to start collaborating.

    Felt has also invested in supporting a wide range of data formats for adding data to maps with their Upload Anything tool, a simple drag and drop interface that supports formats including Shapefile, GeoJSON, GeoTIFF, CSV, JPEG, etc. Felt also built a QGIS plugin, so if you’re used to working with desktop GIS tooling you can easily export your project and layers from QGIS to Felt’s web-based tooling.

    Felt enables programmatically creating maps and layers as well as adding data to the map via the Felt API. We’ll be using the Felt API to upload the results of our analysis using Sedona to create and publish a map.

    Wherobots Cloud File Management

    Our data comes from Bird Buddy, which makes a smart bird feeder that can identify bird species and (optionally) report their location.

    BirdBuddy screenshot

    Bird Buddy publishes its data as CSV files so we’ll download the latest data and then upload the file to our Wherobots Cloud instance via the "Files" tab. The free tier of Wherobots Cloud includes free data storage in AWS S3 which we can access within the Wherobots notebook environment using the S3 URL of the file.

    Wherobots Cloud file management

    Once you’ve uploaded a file you can click the copy file icon to copy the file’s S3 path to access the file in the Wherobots notebook environment. Note that these files are private to your Wherobots organization, so the S3 URL below won’t be accessible to anyone outside my organization.

    S3_URL = "s3://<YOUR_S3_URL_HERE>/birdbuddy/"

    Now we’ll load the BirdBuddy CSV data and convert it to a Spatial DataFrame so we can use Spatial SQL to find the range of each species.

    bb_df = sedona.read.format('csv').option('header','true').option('delimiter', ',').load(S3_URL)
    bb_df.show(5)

    Looking at the first few rows of the DataFrame, we can see we have latitude and longitude stored as separate fields, as well as information about the bird species.

    +-------------------+--------------------+----------+-----------------+----------------+
    |anonymized_latitude|anonymized_longitude| timestamp|      common_name| scientific_name|
    +-------------------+--------------------+----------+-----------------+----------------+
    |          45.441235|          -122.51253|2023-09...|  dark eyed junco|      junco h...|
    |           41.75291|            -83.6242|2023-09...|northern cardinal|cardinalis ca...|
    |            43.8762|            -78.9261|2023-09...|northern cardinal|cardinalis ca...|
    |            33.7657|            -84.2951|2023-09...|northern cardinal|cardinalis ca...|
    |            30.4805|            -84.2243|2023-09...|northern cardinal|cardinalis ca...|
    +-------------------+--------------------+----------+-----------------+----------------+
    only showing top 5 rows

    Spatial SQL With WherobotsDB

    Now we’re ready to use the power of Spatial SQL to analyze our Bird Buddy data. We want to find the range of each species, but first let’s explore the data.

    First we’ll convert our latitude and longitude fields into Point geometries using the ST_Point SQL function.

    bb_df = bb_df.selectExpr('ST_Point(CAST(anonymized_longitude AS Decimal(24,20)), CAST(anonymized_latitude AS Decimal(24,20))) AS location', 'timestamp', 'common_name', 'scientific_name')
    bb_df.createOrReplaceTempView('bb')
    bb_df.show(5)

    Now the location field is a proper geometry type that our Spatial DataFrame can take advantage of.

    +--------------------+--------------------+-----------------+--------------------+
    |            location|           timestamp|      common_name|     scientific_name|
    +--------------------+--------------------+-----------------+--------------------+
    |POINT (-122.51253...|2023-09-01 00:00:...|  dark eyed junco|      junco hyemalis|
    |POINT (-83.6242 4...|2023-09-01 00:00:...|northern cardinal|cardinalis cardin...|
    |POINT (-78.9261 4...|2023-09-01 00:00:...|northern cardinal|cardinalis cardin...|
    |POINT (-84.2951 3...|2023-09-01 00:00:...|northern cardinal|cardinalis cardin...|
    |POINT (-84.2243 3...|2023-09-01 00:00:...|northern cardinal|cardinalis cardin...|
    +--------------------+--------------------+-----------------+--------------------+
    only showing top 5 rows
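
    To make the coordinate order explicit, here is a minimal plain-Python sketch of the same idea, without Sedona. The sample rows and the helper names to_wkt_point and rows_to_points are assumptions for illustration; the point is only that longitude is the x coordinate and latitude is the y coordinate, matching the argument order passed to ST_Point above.

```python
import csv
import io

# Hypothetical sample rows that mirror the columns shown earlier in this post.
SAMPLE_CSV = """anonymized_latitude,anonymized_longitude,common_name
45.441235,-122.51253,dark eyed junco
41.75291,-83.6242,northern cardinal
"""


def to_wkt_point(longitude: float, latitude: float) -> str:
    """Return a WKT point; x is longitude and y is latitude."""
    return f"POINT ({longitude} {latitude})"


def rows_to_points(csv_text: str) -> list[tuple[str, str]]:
    """Parse CSV text and return (common_name, WKT point) pairs."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        (
            row["common_name"],
            to_wkt_point(
                float(row["anonymized_longitude"]),
                float(row["anonymized_latitude"]),
            ),
        )
        for row in reader
    ]


points = rows_to_points(SAMPLE_CSV)
for name, wkt in points:
    print(name, wkt)
```

    Swapping the two arguments is a common mistake and would place the points in the wrong part of the world, so it is worth checking the first few rows after the conversion.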

    We have just under 14 million bird observations in our DataFrame.

    bb_df.count()
    ------------
    13972003

    If we want to find all observations of Juncos in the data, we can write a SQL query to filter the results and visualize the observations on a map using SedonaKepler, the Sedona integration for Kepler.gl.

    junco_df = sedona.sql("SELECT * FROM bb WHERE common_name LIKE '%junco' ")
    junco_df.show(5)

    We used the SQL LIKE string comparison operator to find all observations relating to Juncos, then stored the results in a new DataFrame junco_df. The pattern '%junco' matches any common name that ends in "junco".

    +--------------------+--------------------+---------------+---------------+
    |            location|           timestamp|    common_name|scientific_name|
    +--------------------+--------------------+---------------+---------------+
    |POINT (-122.51253...|2023-09-01 00:00:...|dark eyed junco| junco hyemalis|
    |POINT (-94.5916 3...|2023-09-01 00:00:...|dark eyed junco| junco hyemalis|
    |POINT (-85.643 31...|2023-09-01 00:00:...|dark eyed junco| junco hyemalis|
    |POINT (-87.7645 3...|2023-09-01 00:00:...|dark eyed junco| junco hyemalis|
    |POINT (-122.16346...|2023-09-01 00:00:...|dark eyed junco| junco hyemalis|
    +--------------------+--------------------+---------------+---------------+
    only showing top 5 rows
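
    If the LIKE pattern is unfamiliar, the sketch below shows the same filter against a tiny in-memory SQLite table instead of WherobotsDB. The table contents are made up for illustration. One caveat: SQLite's LIKE is case-insensitive for ASCII characters by default, while Spark SQL's LIKE is case-sensitive, so this only demonstrates how the % wildcard behaves.

```python
import sqlite3

# Hypothetical sample names; only those ending in "junco" should match.
sample_names = [
    ("dark eyed junco",),
    ("northern cardinal",),
    ("junco hyemalis",),  # starts with "junco" but does not end with it
    ("yellow eyed junco",),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bb (common_name TEXT)")
conn.executemany("INSERT INTO bb VALUES (?)", sample_names)

# The leading % matches any prefix, so this keeps names that end in "junco".
query = "SELECT common_name FROM bb WHERE common_name LIKE '%junco' ORDER BY common_name"
matches = [row[0] for row in conn.execute(query)]
conn.close()

print(matches)
```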

    Now we’ll visualize the contents of our new junco_df DataFrame using SedonaKepler.

    SedonaKepler.create_map(df=junco_df, name='Juncos')

    Juncos observation map

    Based on the map above it looks like Juncos have quite a large range throughout North America.

    Next, we’ll filter the overall dataset to a few of my favorite bird species, then use the power of Spatial SQL with a GROUP BY operation to create convex hulls (polygon geometries) from the individual observations (point geometries) of each species.

    By creating a convex hull around all point observations grouped by species we will create a new geometry that represents the observed range of each species in our dataset.

    range_df = sedona.sql("""
        SELECT common_name, COUNT(*) AS num, ST_ConvexHull(ST_Union_aggr(location)) AS geometry 
        FROM bb 
        WHERE common_name IN ('california towhee', 'steller’s jay', 'mountain chickadee', 'eastern bluebird') 
        GROUP BY common_name 
        ORDER BY num DESC
    """)
    range_df.show()

    Note our use of the following Spatial SQL functions:

    • ST_ConvexHull: returns the smallest convex polygon that contains the input geometry, here the combined point observations of one species
    • ST_Union_aggr: an aggregate function that unions multiple geometries into one, used here alongside GROUP BY to combine the points of each species

    +------------------+-----+--------------------+
    |       common_name|  num|            geometry|
    +------------------+-----+--------------------+
    |  eastern bluebird|65971|POLYGON ((-80.345...|
    |     steller’s jay|37864|POLYGON ((-110.26...|
    | california towhee|22007|POLYGON ((-117.05...|
    |mountain chickadee| 4102|POLYGON ((-110.99...|
    +------------------+-----+--------------------+
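
    For intuition about what the grouped convex hull computes, here is a small plain-Python sketch using Andrew's monotone chain algorithm. This is not how Sedona implements ST_ConvexHull; the species labels, coordinates, and function names are hypothetical, and the coordinates are treated as planar x/y values.

```python
from collections import defaultdict

Point = tuple[float, float]


def cross(o: Point, a: Point, b: Point) -> float:
    """Z component of the cross product of vectors OA and OB."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points: list[Point]) -> list[Point]:
    """Return hull vertices in counter-clockwise order (monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower: list[Point] = []
    for p in pts:
        # Drop the last vertex while it would make a clockwise or straight turn.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: list[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last vertex of each chain is the first vertex of the other chain.
    return lower[:-1] + upper[:-1]


# Hypothetical observations: (species, (longitude, latitude)).
observations = [
    ("species a", (0.0, 0.0)),
    ("species a", (4.0, 0.0)),
    ("species a", (4.0, 3.0)),
    ("species a", (0.0, 3.0)),
    ("species a", (2.0, 1.0)),  # interior point, not part of the hull
    ("species b", (10.0, 10.0)),
    ("species b", (12.0, 10.0)),
    ("species b", (11.0, 12.0)),
]

# Equivalent of GROUP BY common_name: collect the points of each species.
grouped: dict[str, list[Point]] = defaultdict(list)
for species, location in observations:
    grouped[species].append(location)

ranges = {species: convex_hull(pts) for species, pts in grouped.items()}
for species, hull in ranges.items():
    print(species, hull)
```

    Note that a convex hull is sensitive to outliers: a single observation far from the rest stretches the polygon, so the result is best read as an observed extent rather than a precise range.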

    Now we have a new DataFrame range_df with 4 rows, one for each of the species we indicated in the query above. But now the geometry field is a polygon that represents the observed range of that species in our dataset. Pretty neat – let’s visualize these species ranges using Felt.

    The Felt API supports file uploads in a variety of formats, but we’ll use GeoJSON. We’ll convert our Spatial DataFrame into a GeoPandas GeoDataFrame and then export to a GeoJSON file so we can upload it to the Felt API.

    import geopandas

    range_gdf = geopandas.GeoDataFrame(range_df.toPandas(), geometry="geometry")
    range_gdf.to_file('birdbuddy_range.geojson', driver='GeoJSON')

    We’ve now created a GeoJSON file birdbuddy_range.geojson that looks a bit like this (we’ve omitted some lines):

    {
        "type": "FeatureCollection",
        "features": [
            {
                "type": "Feature",
                "properties": {
                    "common_name": "eastern bluebird",
                    "num": 65971
                },
                "geometry": {
                    "type": "Polygon",
                    "coordinates": [
                        [
                            [
                                -80.3452,
                                25.6062
                            ],
                            [
                                -98.2271,
                                26.2516
                            ],
                            ...
                            [
                                -80.3452,
                                25.6062
                            ]
                        ]
                    ]
                }
            },
            ...
        ]
    }
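
    The structure above can also be built by hand with the standard library, which helps when checking what GeoPandas writes. The sketch below uses a hypothetical triangle and property values rather than data from the dataset, and the helper name polygon_feature is an assumption. It shows one detail that is easy to miss: the first and last positions of a polygon ring must be identical.

```python
import json


def polygon_feature(ring, properties):
    """Build a GeoJSON Feature with a single-ring Polygon, closing the ring if needed."""
    coords = [list(pt) for pt in ring]
    if coords[0] != coords[-1]:
        # GeoJSON linear rings must end at the position where they start.
        coords.append(list(coords[0]))
    return {
        "type": "Feature",
        "properties": properties,
        "geometry": {"type": "Polygon", "coordinates": [coords]},
    }


# Hypothetical triangle standing in for a species range.
feature = polygon_feature(
    [(-80.0, 25.0), (-98.0, 26.0), (-90.0, 40.0)],
    {"common_name": "example species", "num": 3},
)
collection = {"type": "FeatureCollection", "features": [feature]}
geojson_text = json.dumps(collection, indent=4)
print(geojson_text)
```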

    Felt Maps API

    If you haven’t already, create a free Felt account and then in your account settings generate a new access token so you’ll be able to create maps and upload data via the Felt API.

    Creating a Felt API token

    FELT_TOKEN = '<YOUR_TOKEN_HERE>'

    To create a new map and upload data we’ll actually need to make a few network requests to the Felt API:

    1. /maps to create a new map. This endpoint will return the id and url of the new map.
    2. /maps/{map_id}/layers to create a new layer in our new map. Note we need to use the map_id from the previous request. This endpoint will return a presigned upload URL that will allow us to upload our GeoJSON file.
    3. /maps/{map_id}/layers/{layer_id}/finish_upload to indicate we have finished uploading our data using the presigned upload URL.

    The function below will create a new map in Felt, then create a new layer and upload our GeoJSON file to this layer. See the Felt API docs for more examples of what’s possible with the Felt API.

    import requests

    def create_felt_map(access_token, filename, map_title, layer_name):
        """Create a Felt map, upload a file to a new layer, and return the map URL."""
        headers = {
            "authorization": f"Bearer {access_token}",
            "content-type": "application/json",
        }

        # First create a new map using the /maps endpoint
        create_map_response = requests.post(
            "https://felt.com/api/v1/maps",
            headers=headers,
            json={"title": map_title},
        )
        create_map_data = create_map_response.json()
        map_id = create_map_data['data']['id']
        map_url = create_map_data['data']['attributes']['url']
        print(create_map_data)

        # Next, create a new layer and get a presigned upload URL for our GeoJSON file
        layer_response = requests.post(
            f"https://felt.com/api/v1/maps/{map_id}/layers",
            headers=headers,
            json={"file_names": [filename], "name": layer_name},
        )

        # This endpoint returns a presigned URL that we use to upload the file to Felt
        presigned_upload = layer_response.json()
        url = presigned_upload["data"]["attributes"]["url"]
        presigned_attributes = presigned_upload["data"]["attributes"]["presigned_attributes"]
        layer_id = presigned_upload['data']['attributes']['layer_id']
        print(presigned_upload)
        print(layer_id)

        # Upload the file; a 204 response indicates that the upload was successful
        with open(filename, "rb") as file_obj:
            output = requests.post(
                url,
                # Order is important, file should come at the end
                files={**presigned_attributes, "file": file_obj},
            )
        print(output)

        # Finally, call the /maps/{map_id}/layers/{layer_id}/finish_upload endpoint to complete the process
        finish_upload = requests.post(
            f"https://felt.com/api/v1/maps/{map_id}/layers/{layer_id}/finish_upload",
            headers=headers,
            json={"filename": filename, "name": layer_name},
        )
        print(finish_upload.json())

        return map_url
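
    The upload step in create_felt_map depends on the file being the last form field. The short sketch below shows why the dictionary expression used there achieves that: Python dictionaries preserve insertion order, so unpacking the presigned fields first and adding "file" afterwards puts the file at the end. The field names and values here are hypothetical stand-ins; the real ones come from the Felt API response.

```python
import io

# Hypothetical presigned form fields; the real names come from the Felt API response.
presigned_attributes = {
    "key": "uploads/example.geojson",
    "policy": "abc123",
    "signature": "def456",
}

# BytesIO stands in for the opened GeoJSON file.
file_obj = io.BytesIO(b'{"type": "FeatureCollection", "features": []}')

# Dict unpacking keeps insertion order (guaranteed since Python 3.7),
# so "file" is added after every presigned field.
form_fields = {**presigned_attributes, "file": file_obj}
field_order = list(form_fields)
print(field_order)
```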

    Now to create a new Felt map we can call this function, passing our API token, the name of our GeoJSON file as well as what we’d like to call our new map and the data layer.

    create_felt_map(FELT_TOKEN, "birdbuddy_range.geojson", "North American Bird Ranges", "My Favorite Birds")

    We’ve now created a new map in Felt and uploaded our GeoJSON data as a new layer. We can share the URL with anyone on the web to view or collaborate on our map!

    North American bird ranges

    We can also embed the map in our Jupyter notebook:

    from IPython.display import HTML
    HTML('<iframe width="1600" height="600" frameborder="0" title="My Favorite Bird Ranges" src="https://felt.com/embed/map/North-American-Bird-Ranges-a4c5cOCaRMiL64KK5N27TA"></iframe>')
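
    Hand-written iframe strings are easy to get wrong, so it can help to build them with a small function. The sketch below is one way to do that; the name felt_embed_iframe is hypothetical, and it assumes you already have the embed URL for your map, as in the example above.

```python
from html import escape


def felt_embed_iframe(embed_url: str, title: str, width: int = 1600, height: int = 600) -> str:
    """Return an iframe tag for a Felt embed URL, escaping attribute values."""
    return (
        f'<iframe width="{width}" height="{height}" frameborder="0" '
        f'title="{escape(title, quote=True)}" '
        f'src="{escape(embed_url, quote=True)}"></iframe>'
    )


iframe_html = felt_embed_iframe(
    "https://felt.com/embed/map/North-American-Bird-Ranges-a4c5cOCaRMiL64KK5N27TA",
    "My Favorite Bird Ranges",
)
print(iframe_html)
```

    In a notebook, the resulting string can be passed to IPython.display.HTML in the same way as the literal string above.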

    The 30 Day Map Challenge

    Every year the geospatial data and map community joins together to organize the "30 Day Map Challenge", a fun and informal challenge to create a new map and share it on social media each day for one month.

    The 30 Day Map Challenge

    This BirdBuddy map was my 30 Day Map Challenge map for Day 3: Polygons. You can find the full Jupyter Notebook with all code on GitHub here as well as some of my other 30 Day Map Challenge maps in this repository. If you’d like to follow along with my attempt at the rest of the 30 Day Map Challenge feel free to connect with me on Twitter or LinkedIn.
