Exploring the potential of a national bus GPS archive for transport planning applications in England
Abstract
Introduction & Background
Over the past decade, the United Kingdom (UK) has experienced a substantial decline in public transport services. Civil organisations and the media have reported significant issues with bus reliability, such as cancellations, delays, and long waiting times due to low service frequency. These problems are often quantified through traditional methods, such as surveys, which can be costly and provide data only at intervals. In contrast, automatically collected data, such as that generated by real-time bus GPS tracking systems, offers an opportunity to gain deeper insights thanks to its more frequent and widespread coverage.
Objectives & Approach
Given the recognised need to support and enhance public transport in the UK, maximising the use of such digital footprint data could provide evidence for informed decision-making. However, efforts to archive this data, to understand its quality for analytical purposes, and to develop methods that harness its full potential remain limited.
Relevance to Digital Footprints
This study explores a large-scale longitudinal record archive constructed from real-time bus location data for a representative day each week between October 2023 and April 2025 across all regions of England and Wales. The original data is sourced from the Bus Open Data Service, funded by the Department for Transport (DfT). The archive tracks over 25,000 buses, constituting the vast majority of the fleet in the area of study, and amounts to an estimated 1 billion records. The aim is to explore the spatial-temporal depth of the dataset and assess its quality.
The methods draw on geographical information system (GIS) and large-scale data processing tools and techniques to derive representative speed measures and kilometres operated per inhabitant, nationally, at various geographic scales and times of the day. The processing harnesses the ‘duckdb’ local database system in combination with ‘Uber H3’ spatial indexing, which streamlines the workflow. The method also compares the observed number of services at each hour of the day with the planned services, used as a baseline, for a small area in Manchester over the full period of study.
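As an illustration only, the two measures described above (speeds derived from consecutive GPS records, and observed versus planned services per hour) can be sketched in plain Python. This is not the study's pipeline, which relies on ‘duckdb’ and ‘Uber H3’; the function names, the `(timestamp, latitude, longitude)` record layout and the use of great-circle distance between consecutive points are all assumptions made for the example.

```python
from collections import Counter
from datetime import datetime
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two WGS84 points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def segment_speeds_kmh(pings):
    """Speed (km/h) between consecutive records of ONE vehicle.

    `pings` is a hypothetical layout: a time-sorted list of
    (iso_timestamp, lat, lon) tuples. Straight-line distance is used,
    so speeds are underestimated wherever the road bends between points.
    """
    speeds = []
    for (t0, la0, lo0), (t1, la1, lo1) in zip(pings, pings[1:]):
        elapsed = datetime.fromisoformat(t1) - datetime.fromisoformat(t0)
        hours = elapsed.total_seconds() / 3600
        if hours <= 0:  # skip duplicate or out-of-order timestamps
            continue
        speeds.append(haversine_km(la0, lo0, la1, lo1) / hours)
    return speeds


def observed_share_by_hour(observed_starts, planned_starts):
    """Ratio of observed to planned trips for each hour with planned trips.

    Both arguments are lists of ISO timestamps, one per trip start
    (an assumed simplification of timetable and GPS-derived trips).
    """
    observed = Counter(datetime.fromisoformat(t).hour for t in observed_starts)
    planned = Counter(datetime.fromisoformat(t).hour for t in planned_starts)
    return {hour: observed.get(hour, 0) / n for hour, n in sorted(planned.items())}
```

In a fuller version, each record would presumably also be assigned to a spatial cell (for instance an H3 index) so that speeds and kilometres operated can be aggregated by area and time of day; that step is omitted here to keep the sketch free of external dependencies.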
Conclusions & Implications
The results demonstrate the value of GPS transport data as an input for empirical research for the public good, such as addressing questions related to employment, health, and education outcomes from a quasi-experimental perspective. Challenges included processing and storing large data volumes and the need to align bus trajectories, reconstructed from GPS points, with the road network.
These findings are relevant to both transport planning research and policy. The study builds on prior transport planning research by refining and expanding methods that generate value from digital footprint data originally produced for operational purposes and apply it to challenges in transport studies. In terms of policy, the insights from this study could pave the way for organisations to adopt cost-effective tools, aiding in the evaluation of interventions under the proposed public-ownership model envisioned by the central UK government.
