Because I regularly had to travel certain routes by bus or train, I had to endure a delay or two. With my background in statistics, I came up with the idea of recording the delays to find out how punctual these means of transport really are. In mid-January, after a long break, I started recording again, and now I want to give an impression of what can be done with such data.
As an example I took the bus stop “Kuckelkorn” in Aachen, which is in my neighbourhood. There I recorded the departure times of six buses. All of them run within the Aachen transport network, the AVV:
- Bus 35 to Breinig Entengasse in Stolberg (into the city). Departure: 6:26 pm.
- Bus 55 to Vaalserquartier (out of the city). Departure: 6:28 pm.
- Bus 3A to Uniklinik (into the city). Departure: 6:28 pm.
- Bus 3B to Uniklinik (out of the city). Departure: 6:28 pm.
- Bus 45 to Aachen-Brand (into the city). Departure: 6:36 pm.
- Bus 45 to Uniklinik (out of the city). Departure: 6:36 pm.
I started the measurements on the 14th of January and continued until the 24th of April, which gave 72 working days, not counting Carnival Monday and Easter. I could not be there on the 20th of March, so I have measurements for 71 days. In addition, one value is missing for bus 35 (14th of January) and one for bus 3A (28th of January).
First, here are some summary measures for a quick overview. Because large delays occurred occasionally, robust measures are appropriate:
- The median: When you sort the data, the value in the middle is the median. It is relatively insensitive to extreme outliers.
- The lower and upper quartiles: The quartiles divide the sorted data into four parts of equal size. The first quartile (Q1) cuts off the lowest 25% of the data; the third quartile (Q3) cuts off the highest 25%.
- The interquartile range (IQR): The IQR is the difference between the third and first quartiles (Q3 minus Q1) and acts as a measure of dispersion. It is also called the midspread.
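The three measures above can be sketched in a few lines of Python. This is only an illustration: the delay values are hypothetical (they are not the recorded data), and the `"inclusive"` quartile method is an assumption, since several quartile conventions exist and they can give slightly different results on small samples.

```python
import statistics

# Hypothetical delays in seconds (negative = early departure).
# These are illustrative values, not the measurements from the bus stop.
delays = [45, -30, 600, 0, 90, -10, 150, 20, 60]

def robust_summary(values):
    """Return the quartiles, median and IQR of a list of numbers."""
    # statistics.quantiles sorts the data itself; n=4 yields Q1, median, Q3.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {"q1": q1, "median": median, "q3": q3, "iqr": q3 - q1}

summary = robust_summary(delays)
print(summary)
# The single large delay of 600 s pulls the mean up far more than the median.
print("mean:", statistics.mean(delays))
```

In this made-up sample the one delay of ten minutes shifts the mean well above the median, which is the reason for preferring robust measures here.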
Delays at the bus stop “Kuckelkorn”

Bus | 35 | 55 | 3A | 3B | 45 (to Aachen-Brand) | 45 (to Uniklinik) |
---|---|---|---|---|---|---|
Departure | 6:26 pm | 6:28 pm | 6:28 pm | 6:28 pm | 6:36 pm | 6:36 pm |
1st quartile | -0:28.0 | 1:52.0 | 0:28.0 | 0:12.0 | -1:09.0 | 0:29.0 |
Median | -0:13.5 | 4:56.0 | 0:36.0 | 1:00.0 | 0:03.0 | 1:12.0 |
3rd quartile | 1:04.3 | 7:27.0 | 1:54.0 | 3:28.0 | 0:50.0 | 2:46.0 |
IQR | 1:32.3 | 5:35.0 | 1:26.0 | 3:16.0 | 1:59.0 | 2:17.0 |
(times in minutes and seconds; negative values mean a departure ahead of schedule)
(Example: The delays of bus 55 had a median of 4 minutes and 56 seconds.)
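As a small cross-check, the IQR row of the table can be recomputed from the two quartile rows. The sketch below copies the values from the table; the helper name `parse_delay` and the labels for the two bus 45 columns are my own, and the parser accepts either a decimal comma or a decimal point.

```python
def parse_delay(text):
    """Convert 'm:ss.d' or 'm:ss,d' (optional leading '-') to seconds."""
    text = text.strip()
    # Handle the sign separately so that '-0:28.0' stays negative
    # even though its minutes part is zero.
    sign = -1 if text.startswith("-") else 1
    minutes, seconds = text.lstrip("-").split(":")
    return sign * (int(minutes) * 60 + float(seconds.replace(",", ".")))

# (Q1, Q3, IQR) for each column, copied from the table.
table = {
    "35": ("-0:28.0", "1:04.3", "1:32.3"),
    "55": ("1:52.0", "7:27.0", "5:35.0"),
    "3A": ("0:28.0", "1:54.0", "1:26.0"),
    "3B": ("0:12.0", "3:28.0", "3:16.0"),
    "45 (first column)": ("-1:09.0", "0:50.0", "1:59.0"),
    "45 (second column)": ("0:29.0", "2:46.0", "2:17.0"),
}

mismatches = []
for bus, (q1, q3, iqr) in table.items():
    difference = parse_delay(q3) - parse_delay(q1)
    # Small tolerance because the table is rounded to tenths of a second.
    if abs(difference - parse_delay(iqr)) > 0.05:
        mismatches.append(bus)

print("inconsistent columns:", mismatches)
```

For bus 35, for example, 1:04.3 minus -0:28.0 gives 1:32.3, which matches the table.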
Among the buses examined, bus 55 stands out. It comes from Aachen-Lichtenbusch at the Belgian border via Kornelimünster and the inner city, and it departs later than the other buses (median delay 4:56). The dispersion of its departure times is also larger (IQR 5:35).
In addition, the buses that have to pass through the city seem to depart later, which might be because they are already further along their routes when they reach the stop. To confirm this, more buses would have to be examined.
A further examination of the delays requires observing them over time. The measures above assume that the circumstances remained the same, but weather, school holidays or similar factors may have an influence. Looking at the data over time also shows whether some buses were unusually late (or punctual!) on particular days. That would point to special causes or events, which would then have to be examined.
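One simple way to look for such unusual days is sketched below. It flags observations that lie more than 1.5 IQRs outside the quartiles (often called Tukey's fences). This is a common rule of thumb that I am substituting here for illustration, not a method taken from the measurements above, and the delay values are again hypothetical.

```python
import statistics

# Hypothetical delays in seconds, one per observation day (day 1, 2, ...).
# These are illustrative values, not the recorded data.
delays = [40, 55, 30, 70, 65, 480, 50, 45, 60, 35, -20, 75]

def flag_unusual_days(values, k=1.5):
    """Return (day, delay) pairs lying outside Q1 - k*IQR .. Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [
        (day, delay)
        for day, delay in enumerate(values, start=1)
        if delay < lower or delay > upper
    ]

unusual = flag_unusual_days(delays)
print(unusual)
```

In this made-up series, day 6 (eight minutes late) and day 11 (20 seconds early) are flagged. Such days would be the ones to compare against weather, holidays or other events.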
A possible advantage of recording delays is that a bus may turn out to be punctual. On the other hand, I have experienced far less punctual buses. When I was working in Baesweiler, bus 51 from Aachen, which I had to wait for at its terminus at the Reyplatz, was generally late.
With the trains of Deutsche Bahn, however, I have experienced even more delays. Of the weekend trains I took, departing from Bremen and due to arrive in Cologne at 8:48 pm, around autumn every fourth one arrived more than 45 minutes late, so that I missed my connecting train to Aachen.