    Contributors

    • James Willis

      James is a Senior Geospatial Software Engineer simplifying the experience of gaining insights and deriving value from spatial vector data.

    • Daniel Smith

      Daniel is a Sr. Solution Architect with 20 years in the geospatial industry working on public and private sector projects. At Wherobots, Daniel designs everything from demos and tutorials to customer solutions, and supports pre- and post-sales customer success.

    Generating map tiles can be challenging and expensive, especially when dealing with large datasets like those from the Overture Maps Foundation (Overture). Typical solutions are neither scalable nor performant, which forces you to build workarounds that waste time and money. Wherobots addresses these challenges using WherobotsDB, a purpose-built, planetary-scale compute engine with a native vector tile generator (VTiles) that produces tiles in an easy-to-use, cloud-native file format (PMTiles).

    In this post, we’ll show you how Wherobots makes generating vector tiles from billions of features a breeze! We’ll demonstrate this by generating a tileset for all 11 Overture layers in New York City, then repeating the process at a larger scale for all the transportation segments across the state of New York.

    What Are Vector Tiles and PMTiles?

    Vector tiles are small chunks of map data that allow for efficient rendering at varying zoom levels. Unlike raster tiles which are pre-rendered images, vector tiles contain attributes and geometric data that facilitate dynamic styling of map features on the fly, offering more flexibility and interactivity.
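
    To make the idea of tiles at varying zoom levels concrete, here is a minimal sketch of how a point maps to a tile index. It assumes the common Web Mercator XYZ tiling scheme; the post does not state which scheme VTiles uses, and the function names are illustrative only.

```python
import math


def lonlat_to_tile(lon_deg: float, lat_deg: float, zoom: int) -> tuple[int, int]:
    """Return the (x, y) index of the XYZ tile containing a point at a zoom level."""
    n = 2 ** zoom  # number of tiles along each axis at this zoom level
    lat_rad = math.radians(lat_deg)
    x = int((lon_deg + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so points on the far edge (for example lon = 180) stay inside the grid
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def tile_count(zoom: int) -> int:
    """Total number of tiles covering the world at a zoom level."""
    return 4 ** zoom


# A point in New York City (approximate coordinates) at a few zoom levels
for z in (0, 2, 8):
    print(z, lonlat_to_tile(-74.0, 40.7, z), tile_count(z))
```

    Each additional zoom level quadruples the number of tiles, which is why tile generation cost grows quickly as the maximum zoom increases.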

    PMTiles is a cloud-native file format that is designed for holding an entire collection of tiles, in this case vector tiles. The PMTiles format allows individual tiles to be queried directly from cloud object storage like Amazon S3. By querying directly from cloud storage, you no longer need to set up and manage dedicated infrastructure, reducing your costs and time-to-tile-generation.
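
    Reading tiles directly from object storage works because a client only needs small byte ranges of the archive. The sketch below shows the first step, under the assumption that the file follows the PMTiles v3 layout as we understand it (a 127-byte header that starts with the magic string "PMTiles", a version byte, then the root directory offset and length). The helper names are hypothetical and the header bytes are synthetic; check the PMTiles specification before relying on the field layout.

```python
import struct

PMTILES_HEADER_BYTES = 127  # fixed header size assumed from the PMTiles v3 spec


def header_range() -> dict:
    """HTTP header that asks the object store for only the archive header."""
    return {"Range": f"bytes=0-{PMTILES_HEADER_BYTES - 1}"}


def parse_header_prefix(data: bytes) -> dict:
    """Parse the magic string, version, and root directory location."""
    if len(data) < 24 or data[:7] != b"PMTiles":
        raise ValueError("not a PMTiles archive")
    version = data[7]
    # Two little-endian unsigned 64-bit integers follow the version byte
    root_offset, root_length = struct.unpack_from("<QQ", data, 8)
    return {"version": version, "root_offset": root_offset, "root_length": root_length}


# Synthetic header bytes for illustration only (not read from a real archive)
fake = b"PMTiles" + bytes([3]) + struct.pack("<QQ", 127, 2048)
fake = fake.ljust(PMTILES_HEADER_BYTES, b"\x00")
print(header_range(), parse_header_prefix(fake))
```

    A real client would follow up with further range requests for the directory and then for the individual tile bytes, so no tile server sits between the map and the file.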

    The Challenge of Creating Tiles in a Distributed Environment

    Generating tiles from worldwide map datasets has always been a challenge. You had to process billions of geometric features using solutions that are not purpose-built for this scale, resulting in:

    • High compute costs from development and production runs as well as fixed infrastructure.
    • Frequent crashes forcing developers to babysit and troubleshoot the system.
    • Tiles that are out-of-date because tile generation is time consuming and difficult.
    • Tile generation taking days or hours, vs minutes or seconds to complete.

    Benefits of Using Wherobots for Tile Generation

    • Efficiency: Wherobots optimizes the tile generation process, making it faster and more cost efficient.
    • Scalability: WherobotsDB and VTiles are part of a purpose-built system designed to handle large datasets without compromising performance, providing a resilient solution for your geospatial needs.
    • Flexibility: VTiles accelerates time-to-production by allowing you to iterate faster using a more responsive, lower cost development experience.

    Wherobots for Highly Scalable Vector Tile Generation

    Wherobots VTiles, our new native vector tile generator, incorporates innovative algorithms for distributed tile generation on WherobotsDB, a high-performance spatial compute engine. VTiles is designed to generate vector tiles from small to planetary-scale datasets quickly and cost-efficiently. Wherobots handles the heavy lifting and infrastructure management, ensuring the tile generation process is performant, scalable, and easy. We will demonstrate this in the following demos.

    Demo: From NYC to NY State

    We’ll use Wherobots VTiles to generate PMTiles for all Overture layers in New York City. Then we will scale the demo up by generating PMTiles for all transportation segments in New York State, using feature filter optimizations.

    Step 1: Preparing the Data

    First, we need to load the administrative, places, transportation, base, and buildings layers (yes, all of them!) from the Overture data for New York City. The latest Overture dataset is included in the Wherobots Spatial Catalog out-of-the-box, which makes it easy to load all of these layers in a single statement.

    _aoi = "POLYGON ((-73.957901 40.885486, …, -73.957901 40.885486))"
    # Keep only features that intersect the area of interest and clip them to it
    df_transportation_segment = sedona.sql(f"""
    SELECT
        ST_INTERSECTION(ST_GeomFromText('{_aoi}'), geometry) AS geometry,
        'transportation_segment' AS layer
    FROM
        wherobots_open_data.overture_2024_02_15.transportation_segment t1
    WHERE ST_INTERSECTS(ST_GeomFromText('{_aoi}'), geometry)
    """)

    Here we apply a spatial filter (WHERE ST_INTERSECTS()) and a “clipping” function (ST_INTERSECTION()) to reduce our data to our area of interest, New York City. We also add a layer attribute containing our layer name. You can add or bring in additional attributes if needed. This “load-intersect” statement is executed for each of the 11 Overture Maps Foundation layers in Wherobots and the resulting DataFrames are added to a list for the next step in data prep.
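
    Since the same statement runs once per layer, it can be generated from a template. The following is a minimal sketch of that idea, not the code used for the demo: the helper name `build_layer_query` is hypothetical, the layer list is partial (the second name is inferred from a DataFrame name used later in this post), and the polygon is a tiny placeholder rather than the New York City area of interest.

```python
CATALOG = "wherobots_open_data.overture_2024_02_15"  # database shown in the query above


def build_layer_query(layer: str, aoi_wkt: str) -> str:
    """Build the filter-and-clip statement for one Overture layer."""
    return f"""
    SELECT
        ST_Intersection(ST_GeomFromText('{aoi_wkt}'), geometry) AS geometry,
        '{layer}' AS layer
    FROM {CATALOG}.{layer}
    WHERE ST_Intersects(ST_GeomFromText('{aoi_wkt}'), geometry)
    """


# Hypothetical, partial layer list; the post uses all 11 Overture layers
layers = ["transportation_segment", "admins_administrativeBoundary"]
# Placeholder polygon for illustration only
aoi = "POLYGON ((0 0, 1 0, 1 1, 0 1, 0 0))"

queries = {layer: build_layer_query(layer, aoi) for layer in layers}
print(queries["transportation_segment"])

# In a Wherobots notebook, each statement would then be passed to
# sedona.sql(...) and the resulting DataFrames collected in a list.
```

    Keeping the layer name in a column means every DataFrame ends up with the same two-column shape, which is what makes the union in the next step straightforward.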

    We want to cut a single PMTiles file for all the layers in one go, and to accomplish this, all the layers need to be unioned together. Here we iterate through the list of layers and perform the union with the DataFrame API.

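    from pyspark.sql.functions import lit
    from sedona.sql.st_constructors import ST_GeomFromText
    from sedona.sql.st_functions import ST_Intersection
    from sedona.sql.st_predicates import ST_Intersects

    # union() matches columns by position: the first layer must expose (layer, geometry)
    nyc_unioned_data = df_admins_administrativeBoundary
    aoi_geom = ST_GeomFromText(lit(_aoi))
    for t in tables_4_tiles[1:]:
        clipped = t.where(ST_Intersects(t.geometry, aoi_geom)).select(
            "layer", ST_Intersection(aoi_geom, t.geometry).alias("geometry"))
        nyc_unioned_data = nyc_unioned_data.union(clipped)

    # Drop features whose clipped geometry is empty
    nyc_unioned_data = nyc_unioned_data.where("ST_IsEmpty(geometry) = False")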

    Step 2: Generate the Tiles

    With our data prepared we are ready to kick off the tile generation process. This process looks like this:

    import os

    # Generate the tiles
    nyc_tiles_df = vtiles.generate(nyc_unioned_data)

    # Define the storage endpoint
    nyc_full_tiles_path = os.getenv("USER_S3_PATH") + "nyc_tiles.pmtiles"

    # Define a PMTiles config builder, passing the data for automatic schema discovery
    builder = vtiles.PMTilesConfigBuilder().from_features_data_frame(nyc_unioned_data)

    # Define the order in which to layer the tiles (list abbreviated)
    nyc_ordered_layers = [buildings, places ..., base]
    builder.layers = nyc_ordered_layers

    # Build the PMTiles configuration
    ordered_config = builder.build()

    # Write the tiles to storage
    vtiles.write_pmtiles(nyc_tiles_df, nyc_full_tiles_path, ordered_config)

    This process took 79 seconds and generated a 74.7 MiB file. You can visualize the results in Wherobots by calling vtiles.show_pmtiles(nyc_full_tiles_path).

    Wherobots VTiles breaks the dataset down into manageable chunks, processing each layer efficiently. The result is a single PMTiles file that can be easily visualized, styled, and served directly from S3.

    Step 3: Expanding to NY State Transportation

    Next, we scale the process up to cover all transportation segments in New York State. This involves a larger dataset, but WherobotsDB handles it seamlessly. The process is the same as above, but this time we use zoom-based feature filtering (do we really need to see driveways when viewing at the state scale?) in vtiles.GenerationConfig().

    from pyspark.sql.functions import col, when

    gen_config = vtiles.GenerationConfig(
        # Minimum zoom level for generation
        min_zoom=4,
        # Maximum zoom level for generation
        max_zoom=16,
        feature_filter=(
            # Only add motorway and trunk features to tiles at zoom level 8 and above
            when((col("class").isin(["motorway", "trunk"])) & (col("tile.z") < 8), False)
            .when(…)
            …
            .otherwise(True)  # Default to rendering a feature
        ),
    )

    We pass our configuration into VTiles with df_transportation_segment_tiles = vtiles.generate(df_transportation_segment, gen_config) and use the same vtiles.write_pmtiles() function as above. PMTiles generation for the New York State transportation segments completed in about 2.5 minutes.
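
    The logic of a zoom-based feature filter can be checked outside Spark with a plain Python predicate. This is a minimal sketch under stated assumptions: `keep_feature` and `MIN_ZOOM_BY_CLASS` are hypothetical names, only the one rule shown in the configuration above is mirrored, and "residential" is used purely as an example of a class with no rule.

```python
# Classes that are hidden below a given zoom level. Only the rule shown in
# the post is mirrored here; its other rules are elided, so none are invented.
MIN_ZOOM_BY_CLASS = {
    "motorway": 8,
    "trunk": 8,
}


def keep_feature(road_class: str, zoom: int) -> bool:
    """Return True if a feature of this class should be written at this zoom."""
    min_zoom = MIN_ZOOM_BY_CLASS.get(road_class)
    if min_zoom is not None and zoom < min_zoom:
        return False
    return True  # default: keep the feature, like .otherwise(True)


for z in (4, 7, 8, 16):
    print(z, keep_feature("motorway", z), keep_feature("residential", z))
```

    Writing the rules as a small lookup table first makes it easier to review which classes appear at which zoom levels before translating them into a when/otherwise chain.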

    Wrapping Up

    Previously, generating vector tiles from large datasets, like those from the Overture Maps Foundation, was challenging. You now have a solution to these challenges. Wherobots makes tile generation at any scale easy, reliable, and cost-effective. We simplify the process by unlocking distributed tile generation, and we make it possible to visualize and share large-scale geospatial data efficiently. Whether you’re focusing on a single city like New York City, an entire state, or the entire planet, Wherobots provides tile generation capability that’s purpose-built for any scale.

    Ready to dive into vector tile generation with Wherobots? Start by signing up for a free Wherobots account, walk through this tutorial, connect Wherobots to your data in S3, and enjoy the benefits of efficient and flexible map visualization. Check out the WherobotsDB VTiles tutorial and reference documentation for more information.

    Want More Tiles News at Wherobots?

    1. The tiles we generated in this post are available for free via Amazon S3:
      1. New York City is here
      2. New York State is here
    2. We are working on making a variety of global Overture PMTiles available for free via Amazon S3. We will document the process for and performance of preparing these tiles in a future publication.
    3. We are working on launching a tile viewer to help folks visualize and inspect the contents of PMTiles files.

    Want to keep up with the latest developer news from the Wherobots and Apache Sedona community? Sign up for the This Month In Wherobots newsletter.
