The Bus Station That Didn't Exist, and Other Data Epiphanies

Summary

“Data is multidisciplinary” is my mantra. It’s 2025, and I’ve now worked for 20 years in every possible flavour of data: data visualization, open data advocacy, data pipelines in healthcare, data-driven national-scale services, AI innovation, and more. Whatever the application or project, my take on data literacy is that it is the fundamental ability to challenge your own assumptions about the data you have or don’t have, the appropriateness of using it, and the ethics of your application, and to ask yourself: is there a different way, perhaps? Here is a gallery of some of my most treasured eureka moments working with data.

You have a clear purpose but the data isn’t quite right for it

I regularly walk through Turnpike Lane Bus Station; there’s a pretty big sign pointing to it. It’s a major node for North London public transport and yet, a few years back, I found out that it did not exist… in the data, at least.

I used to run the official dataset of bus stops for the UK Government, a rather obscure dataset that made its way into powering a few popular journey planners like Google Maps and Citymapper. This was 2020, during COVID, and one of my colleagues wanted a list of all bus stations in the country in order to send them posters advertising social distancing. While the dataset contained over 500,000 points, it did not contain this bus station.

The problem was data definitions: the dataset listed bus stops, which were not the same thing as bus stations. While the words “bus station” have a common-sense meaning in our minds as a collection of bus stops, that meaning was not translated into the dataset. The individual bus stops making up the bus station were all in the dataset, but there was no way to group them together other than trying to infer that they were part of the same bus station because of their proximity.

I found other interesting issues in the dataset. Some were easy to spot, like bus stations in the middle of the North Sea.
Other stations were a few meters away from their real location,...
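
To make the "infer it from proximity" idea concrete, here is a minimal Python sketch of what such a grouping could look like. It is not the method used on the real dataset: the stop IDs and coordinates are made up for illustration, the 50-metre threshold is an arbitrary assumption, and the function names are hypothetical. Stops are linked whenever they sit within the threshold of each other, and linked stops are merged into one candidate "station".

```python
import math
from collections import defaultdict


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    r = 6_371_000  # mean Earth radius in metres
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def group_by_proximity(stops, max_gap_m=50.0):
    """Merge stops into groups when a chain of neighbours lies within max_gap_m.

    `stops` is a list of (stop_id, lat, lon) tuples. The threshold is an
    assumption for illustration, not a value taken from the real dataset.
    """
    parent = list(range(len(stops)))  # union-find forest, one node per stop

    def find(i):
        # Follow parent links to the group root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Compare every pair of stops and link the ones that are close enough.
    for i in range(len(stops)):
        for j in range(i + 1, len(stops)):
            _, lat_i, lon_i = stops[i]
            _, lat_j, lon_j = stops[j]
            if haversine_m(lat_i, lon_i, lat_j, lon_j) <= max_gap_m:
                parent[find(i)] = find(j)

    groups = defaultdict(list)
    for i, (stop_id, _, _) in enumerate(stops):
        groups[find(i)].append(stop_id)
    return sorted(sorted(g) for g in groups.values())


# Made-up example records: three stops clustered together, one far away.
stops = [
    ("stop_a", 51.5900, -0.1030),
    ("stop_b", 51.5902, -0.1031),
    ("stop_c", 51.5903, -0.1028),
    ("stop_d", 51.6000, -0.1200),
]

print(group_by_proximity(stops))
```

The pairwise comparison is quadratic, so it only suits small samples; at the scale of 500,000 points a spatial index or grid would be needed. Proximity alone also cannot tell a bus station from a busy junction with several stops, which is why a guess like this would still need checking against a definition of what counts as a bus station.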

First seen: 2025-08-07 22:25

Last seen: 2025-08-07 22:25