29 May 2025 – YoungShip Liverpool had the pleasure of listening to a talk by Dr. Cattrell of Lloyd’s Register, hosted by the Royal Institution of Naval Architects (RINA), on the use of hexagonal spatial indexing (specifically Uber’s H3 library) and unsupervised machine learning techniques to analyse maritime location data, particularly AIS (Automatic Identification System) data. The presentation covered the technical rationale for using hexagons in spatial analysis, the implementation of clustering algorithms for detecting maritime features, and the downstream business and policy applications of these technologies.

Key Points Covered:

Why Hexagons?

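Hexagons are preferred for spatial tiling because they cover the plane without gaps or overlaps, have consistent neighbour relationships, and approximate circles better than squares or triangles. Every neighbour of a hexagon shares an edge with it and sits at the same centre-to-centre distance, whereas a square has both edge neighbours and more distant corner neighbours. This makes distance calculations more uniform and reduces spatial errors when covering irregular polygons.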
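
The neighbour-distance point can be checked with a few lines of Python. This is a minimal sketch on a flat plane, not something shown in the talk: the axial coordinate layout, the unit cell size and all names are assumptions made for illustration.

```python
from math import hypot, sqrt

# Offsets to the six neighbours of a hexagon in axial (q, r) coordinates.
HEX_NEIGHBOURS = [(1, 0), (1, -1), (0, -1), (-1, 0), (-1, 1), (0, 1)]

# Offsets to the eight neighbours of a square cell (edges and corners).
SQUARE_NEIGHBOURS = [
    (dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1) if (dx, dy) != (0, 0)
]


def hex_centre(q, r, size=1.0):
    """Planar centre of a pointy-top hexagon with the given axial coordinates."""
    return (size * sqrt(3) * (q + r / 2), size * 1.5 * r)


# Distinct centre-to-centre distances from a cell at the origin to its neighbours.
hex_distances = sorted({round(hypot(*hex_centre(q, r)), 9) for q, r in HEX_NEIGHBOURS})
square_distances = sorted({round(hypot(dx, dy), 9) for dx, dy in SQUARE_NEIGHBOURS})

print("hexagon neighbour distances:", hex_distances)
print("square neighbour distances: ", square_distances)
```

The hexagon grid yields a single neighbour distance, while the square grid yields two (edge and diagonal), which is the non-uniformity the hexagonal tiling avoids.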

H3 Library Overview:

Uber’s open-source H3 library is highlighted as a fast, flexible tool for spatial indexing, with bindings for various programming languages. H3 represents spatial tiles as 64-bit integers and provides functions for converting between geographic coordinates and H3 cells, finding neighbours, calculating distances, and covering polygons with hexagonal cells.
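
To make the 64-bit representation concrete, the sketch below unpacks an H3 cell index using plain bit operations rather than the H3 library itself. It follows the bit layout published in the H3 documentation (4 mode bits, 4 resolution bits, 7 base-cell bits, then fifteen 3-bit digits). The function name and the sample index are illustrative choices, not taken from the talk.

```python
def decode_h3_index(index_hex: str) -> dict:
    """Unpack the main fields of an H3 cell index given as a hex string."""
    h = int(index_hex, 16)
    resolution = (h >> 52) & 0xF
    # Fifteen 3-bit digits occupy the lowest 45 bits, coarsest digit first.
    digits = [(h >> (3 * (15 - i))) & 0x7 for i in range(1, 16)]
    return {
        "mode": (h >> 59) & 0xF,          # 1 means an ordinary cell index
        "resolution": resolution,         # 0 (coarsest) to 15 (finest)
        "base_cell": (h >> 45) & 0x7F,    # one of the 122 resolution-0 cells
        "digits": digits[:resolution],    # path from the base cell down to this cell
        "unused_digits": digits[resolution:],  # padded with 7 below the resolution
    }


# Sample index used purely as input data for the sketch.
cell = decode_h3_index("8928308280fffff")
print(cell)
```

Because the resolution and the path through the hierarchy are packed into the integer, operations such as finding a cell's parent amount to bit manipulation, which is part of why H3 indexing is fast.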

Clustering with DBSCAN:

The DBSCAN algorithm is used for unsupervised clustering of AIS points to detect features like berths and anchorages. DBSCAN is chosen because it does not require specifying the number of clusters in advance, can find arbitrarily shaped clusters, and is robust to noise, which matters for noisy maritime data. However, it struggles when data density varies across the study area, since a single distance threshold is applied everywhere, and it becomes expensive on large datasets.
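
For readers unfamiliar with DBSCAN, the following is a minimal pure-Python sketch of the textbook algorithm. It is not the implementation used in the talk: the coordinates are made-up planar positions in metres, the `eps` and `min_pts` values are arbitrary, and a production pipeline would more likely use a library implementation with a geographic distance metric.

```python
from math import hypot

NOISE = -1


def region_query(points, i, eps):
    """Indices of all points within eps of point i (including i itself)."""
    xi, yi = points[i]
    return [j for j, (xj, yj) in enumerate(points) if hypot(xi - xj, yi - yj) <= eps]


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id, or NOISE if it belongs to none."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbours if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but does not expand
                continue
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbours = region_query(points, j, eps)
            if len(j_neighbours) >= min_pts:  # core point: keep expanding
                queue.extend(j_neighbours)
    return labels


# Hypothetical positions: two tight groups (e.g. two berths) and one stray report.
berth_a = [(0, 0), (1, 0), (0, 1), (1, 1)]
berth_b = [(50, 50), (51, 50), (50, 51), (51, 51)]
stray = [(200, 200)]
labels = dbscan(berth_a + berth_b + stray, eps=2.0, min_pts=3)
print(labels)
```

The number of clusters is never supplied; it falls out of the density parameters, and the isolated point is labelled as noise. The `region_query` helper scans every point, which also illustrates why a naive DBSCAN scales poorly on large datasets.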

Combining H3 and Clustering:

To address DBSCAN’s limitations at scale, the workflow partitions the data using H3 cells, constructs a graph of connected cells, and applies connected components analysis. This breaks the global clustering problem into smaller, more manageable subproblems, which can be processed in parallel using distributed computing frameworks like Ray and Spark.
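
The partitioning step can be sketched as a connected components search over occupied cells. This is an illustration under stated assumptions, not the speaker's code: axial `(q, r)` tuples stand in for H3 cell indexes, neighbour lookup is done with fixed offsets rather than H3's own functions, and the cell size is assumed to be chosen so that points close enough to cluster together fall in the same or adjacent cells.

```python
from collections import deque

# Offsets to the six neighbours of a hexagonal cell in axial coordinates
# (a stand-in for asking H3 for a cell's neighbours).
AXIAL_NEIGHBOURS = [(1, 0), (1, -1), (0, -1), (-1, 0), (-1, 1), (0, 1)]


def connected_components(occupied):
    """Group occupied cells into components of mutually reachable neighbours."""
    occupied = set(occupied)
    seen = set()
    components = []
    for start in sorted(occupied):
        if start in seen:
            continue
        seen.add(start)
        component = []
        queue = deque([start])
        while queue:  # breadth-first search over adjacent occupied cells
            q, r = queue.popleft()
            component.append((q, r))
            for dq, dr in AXIAL_NEIGHBOURS:
                nxt = (q + dq, r + dr)
                if nxt in occupied and nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        components.append(sorted(component))
    return components


# Hypothetical cells that contain at least one AIS point.
occupied_cells = [(0, 0), (1, 0), (1, -1), (10, 10), (10, 11), (-7, 3)]
components = connected_components(occupied_cells)
for cells in components:
    print(len(cells), "cell(s):", cells)
```

Each component contains all the points that could possibly belong to the same cluster, so the components can be clustered independently, for example by handing each one to a separate worker in a framework such as Ray or Spark.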

Business and Policy Applications:

The resulting data enables detailed analytics on vessel movements, berth occupancy, waiting times, and anchorage usage worldwide. These insights support advisory work for ship owners, operators, charterers, and policymakers, including decarbonisation strategies and planning for alternative fuel infrastructure (e.g., ammonia, methanol bunkering).

Integration with AI and LLMs:

The clean, structured dataset allows for the deployment of large language models (LLMs) and AI agents, enabling natural language querying of maritime analytics without requiring technical skills. This democratises access to complex analytics across business roles.