Your first steps to reliable data for Public Transport.

When you’re starting your journey to become data-driven, you naturally have some questions that need good answers. You have come to the right place. Here we will help you on your way to your first reports.

The first level of being data-driven is having available and reliable data, so you can make better-informed decisions. This is the foundation to build upon.

Stairway to sustainable business value. In this blog post, we are looking at the first level.

Top management wants overarching information. Route planners want information at the timetable, line, and stop level. Customer support needs to give up-to-date answers to incoming questions. Your customers might want to know something like why they had to wait fifteen minutes in the rain for a late bus, and then stand crammed like sardines with their face in another person’s armpit for the next half an hour. When customer feedback and other collected data show you that things like this happen a little too often, you can start analyzing the data to find out the reasons and solve the situation.

From an analytics and reporting perspective, these are the three top areas to dive into when getting started with a data-driven approach:

  • Punctuality
  • Travel numbers
  • Sales

You will need one analysis application for each area.

Dashboard for Travels and Punctuality for top management.

1. Punctuality

On-time departures and arrivals are the secret sauce that makes a public transport system trustworthy and attractive. Without them you might just as well walk, bike, or drive.

To make the best possible decisions for punctuality, you want unbiased facts from data. To make the best use of your collected travel data, you must make clear definitions. Some of these could be:

  • Time range – inside what time range is the transport considered to be punctual? Maybe somewhere between -1 minute and +3 minutes, but it will depend on the local traffic situation.
  • Length, beginning and end – is the trip ten stops, or twenty? The longer the trip, the harder it might get to stay on time. Maybe you need more air in your timetable.
  • Delay – how long is a normal delay? Five minutes? How long is an extreme delay? Fifteen minutes? Thirty? An hour? Again, it depends on the local situation – traffic network, road capacity, roadwork, special events, weather disturbance, and more.
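To make the definitions above concrete, here is a minimal sketch of how a departure could be classified once the limits are agreed on. The thresholds are simply the example values from the list (-1 to +3 minutes for punctual, fifteen minutes for an extreme delay), and the function name and category labels are made up for illustration. Your own definitions will differ.

```python
from datetime import datetime, timedelta

# Assumed limits, taken from the example values above. Tune them to local conditions.
EARLY_LIMIT = timedelta(minutes=-1)     # more than 1 minute early is not punctual
LATE_LIMIT = timedelta(minutes=3)       # up to 3 minutes late still counts as punctual
EXTREME_DELAY = timedelta(minutes=15)   # from here on we call it an extreme delay


def classify_departure(scheduled: datetime, actual: datetime) -> str:
    """Return a punctuality category for one departure (hypothetical labels)."""
    deviation = actual - scheduled
    if deviation < EARLY_LIMIT:
        return "early"
    if deviation <= LATE_LIMIT:
        return "on_time"
    if deviation < EXTREME_DELAY:
        return "delayed"
    return "extreme_delay"


scheduled = datetime(2024, 1, 8, 8, 0)
print(classify_departure(scheduled, scheduled + timedelta(minutes=2)))   # on_time
print(classify_departure(scheduled, scheduled + timedelta(minutes=20)))  # extreme_delay
```

Keeping the limits in one place makes it easy to change a definition later and re-run the same analysis.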

Once you have made your definitions, you can start to measure and analyze your punctuality. With modern digital systems, you can do this down to line level and even zoom in on an individual vehicle, from start to finish. You will be able to see where deviation occurs, and collect data to help you figure out why, and what to do about it. Your collected data will eventually help you to make important hands-on decisions like these:

  • At what point do we add extra capacity to a bus line?
  • At what point do we change the timetable?
  • At what point do we rebuild or move stops?

Decisions like these can have great positive impact in people’s everyday lives and make public transport even more attractive.

2. Travel numbers

In Region Skåne, the southernmost part of Sweden, people take 170 million trips by public transport every year. Based on the region’s population, that’s about 120 trips per person per year. If we zoom in on the most likely commuters, aged 15 to 65, the number of trips per person gets even higher.

These are good numbers but naturally, everybody in public transport wants the numbers to rise more, and fast. Not least for sustainability and the greater good. The clock is ticking.

Defining a trip

To measure success, you first need to define what a trip is.

Let’s say you walk to the bus stop in the morning, and then ride three stops. Then you leave the bus, change to the subway, and ride two more stops. Then you walk the last part to the office. In real life, that’s one trip from home to work.

But from a technical standpoint, it’s two public transport legs: Bus plus subway. So, it counts as two trips. And then there’s the return journey. The average commuter travels to work 22 days a month, so with return trips, it sums up to 44. With one change of vehicles in each direction, it doubles to 88. This is how the math works.
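The arithmetic above fits in a few lines of code. This is only a sketch of the counting rule described here, where every leg counts as one trip. The function name is made up for illustration.

```python
def monthly_trip_count(commute_days: int, legs_per_direction: int) -> int:
    """Count trips the way the ticketing system sees them: one per leg."""
    directions = 2  # to work and back home
    return commute_days * directions * legs_per_direction


print(monthly_trip_count(22, 1))  # 44: direct connection, no change of vehicle
print(monthly_trip_count(22, 2))  # 88: bus plus subway in each direction
```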

So how do you measure travel numbers? You can check the tickets digitally when boarding. You can have sensors in the back doors. You can put those two numbers together and detect fraud.
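As a rough sketch of putting those two numbers together, the function below compares ticket validations with a sensor count for one vehicle trip. It assumes the sensor count covers everyone who boarded, which may not hold in practice, and the function name is hypothetical. A high share is a hint to look closer, not proof of fraud.

```python
def unvalidated_share(ticket_validations: int, sensor_count: int) -> float:
    """Share of counted passengers without a matching ticket validation.

    Assumes sensor_count covers all boarding passengers on the trip.
    """
    if sensor_count <= 0:
        return 0.0
    # Sensors can undercount, so never report a negative share.
    missing = max(sensor_count - ticket_validations, 0)
    return missing / sensor_count


print(unvalidated_share(ticket_validations=45, sensor_count=50))  # 0.1
```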

What kind of data is needed?

  • Yesterday’s data: Updated enough to help your customer support give good answers to incoming questions.
  • Weekly data: Adequate to monitor sales.
  • Live data and predictive analysis: This comes later in your data-driven journey but is well worth mentioning:

For example, it can give commuters a heads-up on how crowded the next couple of buses are. Thus, it can help people keep their physical distance during a pandemic. Conversely, a girl traveling alone late at night might feel uncomfortable boarding an empty vehicle. It can also do good when optimizing passenger flow during rush hour.

3. Sales

When it comes to sales, there’s a myriad of questions to ask and just as many answers to get:

  • How many tickets did we sell during period X?
  • What types of tickets were they?
  • Were they bought by persons or municipalities (like tickets for commuting students)?
  • Where were they sold? At the station as a season ticket? At the tourist office as a weekend pass? In the mobile app? In an automat for paper tickets?
  • How were they paid for? Digitally? Physical card? Cash?
  • How many trips were made with each ticket?
  • Can we measure it against last year?
  • Do we see big fluctuations or stable trends?
  • Do we lose on vehicle cost, but win on customer numbers and customer satisfaction? When are we the most satisfied?
  • Will we reach our goals? Are we on the right track?
  • Can we draw wise conclusions from the data we collect and analyze?

Every question above is a reason to establish a data-driven foundation to build upon. Fact-based insights lead to good decisions and good business results.

Examples of common analytics and reporting needs.

Data collection and visualization

The questions you’ve asked yourself during this process, for example “what counts as one trip?”, usually get iterated a lot, especially when you start realizing what data you have, what data you can get, and how we can help you treat it to show the numbers to an end-user.

Collecting data is a big step in the process. You need a few basic data sources for the three focus areas.


Punctuality:

  1. A timetable, to know what the desired time was
  2. A data source from the vehicle to know where it was and when
  3. A way to connect the two sources above to know which line, and on what day, the vehicle was supposed to run
  4. Stops, GPS-points, and street addresses

The aggregation level of the first and second points will decide at what level you can analyze your data. For example, if the bus only logs data at the first and last stop of the trip, you will only get the punctuality for those two parts of the trip. The rest will stay unknown, and you will miss out on possible delays and catch-ups.

If you want to analyze your data on stop level, you will need data either on GPS-level or an identifier for the stop itself.
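Here is a small sketch of what connecting the timetable to the vehicle data can look like at stop level. The key made of line, trip, and stop, the sample times, and the function name are all invented for illustration. In this example the vehicle only reported at the first and last stop, so the stop in between stays unknown.

```python
from datetime import datetime

# Hypothetical timetable: (line, trip_id, stop_id) -> scheduled departure
timetable = {
    ("5", "trip-1", "A"): datetime(2024, 1, 8, 8, 0),
    ("5", "trip-1", "B"): datetime(2024, 1, 8, 8, 6),
    ("5", "trip-1", "C"): datetime(2024, 1, 8, 8, 14),
}

# Hypothetical vehicle log: this vehicle only reported at stops A and C
vehicle_log = {
    ("5", "trip-1", "A"): datetime(2024, 1, 8, 8, 1),
    ("5", "trip-1", "C"): datetime(2024, 1, 8, 8, 19),
}


def deviations_in_minutes(timetable, vehicle_log):
    """Match each scheduled stop with the logged time, if there is one."""
    result = {}
    for key, scheduled in timetable.items():
        actual = vehicle_log.get(key)
        if actual is None:
            result[key] = None  # no observation, punctuality unknown
        else:
            result[key] = (actual - scheduled).total_seconds() / 60
    return result


for key, minutes in deviations_in_minutes(timetable, vehicle_log).items():
    print(key, minutes)
# ('5', 'trip-1', 'A') 1.0
# ('5', 'trip-1', 'B') None
# ('5', 'trip-1', 'C') 5.0
```

The shared key is the important part. Without something that identifies the same line, trip, and stop in both sources, the two data sets cannot be compared.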

Travel numbers:

  1. A data source for the number of people who travelled with a vehicle
  2. A data set to connect what legs of travel the traveler took
  3. Ticket information for the traveler

Again, the aggregation level will decide at what level you can analyze your data. The second point is needed to be able to separate different lines and times of the day but is not necessary to see the total numbers over time. The third point is to know more about the travelers’ behavior and who travels when.

Travels per hour/day. Here one can clearly see that people tend to travel a bit earlier on Thursdays and Fridays during this period. The question one should ask oneself if this behavior continues is “should the timetable look the same for all weekdays?”
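A view like travels per hour and day can be built from nothing more than boarding timestamps. The sketch below counts boardings per weekday and hour. The sample timestamps and the function name are made up, and a real data set would come from your ticket validations or passenger counters.

```python
from collections import Counter
from datetime import datetime

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def boardings_per_weekday_hour(boarding_times):
    """Count boardings per (weekday name, hour of day)."""
    counts = Counter()
    for ts in boarding_times:
        counts[(WEEKDAYS[ts.weekday()], ts.hour)] += 1
    return counts


# Made-up sample data
sample = [
    datetime(2024, 1, 8, 7, 10),   # Monday
    datetime(2024, 1, 8, 7, 55),   # Monday
    datetime(2024, 1, 11, 6, 45),  # Thursday
]
counts = boardings_per_weekday_hour(sample)
print(counts[("Monday", 7)])    # 2
print(counts[("Thursday", 6)])  # 1
```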


Sales:

  1. An ERP system or other data source to track all sales
  2. Data sources from the different sales channels
  3. Ticket information

When you have all these data sources and keys to how they connect you can start building some reports to visualize the data for the end-user.
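As an illustration of such keys, the sketch below uses a ticket ID to connect sales records with ticket validations. All field names, channels, and prices are invented for the example. Your sources will have their own structure.

```python
from collections import Counter

# Hypothetical sales records, one per sold ticket
sales = [
    {"ticket_id": "T1", "channel": "mobile_app", "ticket_type": "single", "price": 30},
    {"ticket_id": "T2", "channel": "station", "ticket_type": "season", "price": 600},
    {"ticket_id": "T3", "channel": "mobile_app", "ticket_type": "single", "price": 30},
]

# Hypothetical validations: one ticket ID per boarding
validations = ["T1", "T2", "T2", "T2", "T3"]


def tickets_per_channel(sales):
    """Answer 'where were they sold?' with a count per sales channel."""
    return Counter(sale["channel"] for sale in sales)


def trips_per_ticket(sales, validations):
    """Answer 'how many trips were made with each ticket?' via the shared ticket ID."""
    used = Counter(validations)
    return {sale["ticket_id"]: used.get(sale["ticket_id"], 0) for sale in sales}


print(tickets_per_channel(sales)["mobile_app"])  # 2
print(trips_per_ticket(sales, validations))      # {'T1': 1, 'T2': 3, 'T3': 1}
```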

The first reports typically consist of simple tables and charts with different detail levels for different end-users. Later, you can advance and plot data on maps to see where people travel from and to, and where delays occur. In the long run, your possibilities are endless.

At a high aggregation level of data, you can now analyze how many trips were on time, how many people traveled, and how many tickets were sold. With a lower aggregation level, you can now also get the full picture down to a single journey. You can answer questions like: Where did the traveler get on the bus? What type of ticket did they have? Where did they buy the ticket and at what price?


Data-driven analysis provides you with real insights into when, where, and how people travel, how many vehicle changes they make, whether the vehicle was on time, whether the trip was properly paid for and how, and whether external circumstances (roadwork, trees fallen on the tracks, and so on) disturb the traffic rhythm and customer satisfaction.

Data-driven analysis also gives you the opportunity to better guide people to select the best route alternative within the local or regional public transport system.

It’s the kind of knowledge that puts you in the driver’s seat.

Want to know more?

At Stratiteq we have vast experience in creating data-driven solutions for public transport. Please don’t hesitate to reach out to me or my colleague Gustav.

Gustav Hallberg
070 145 8068