Pivotal Data Science Transport Demo

This demo predicts the duration of unexpected road traffic incidents in London. Find out more on the Predictions, Analysis, and Technology pages.

Model Building

To predict the likely duration of unexpected incidents that are still ongoing, it is necessary to build a model based on incidents that have already ended. Attributes of these past incidents, called features, are used by the model algorithm to estimate the behaviour of the current incidents.

The features used in this model either come directly from the data in the traffic reports or are derived from it. To capture the weather conditions at the time an incident began, we use weather reports from London airports taken within an hour of the incident's start time. We only predict durations for "active" incidents, which we define as those with an associated disruption report in the last six hours. This avoids issues with old incidents for which a final end time was never received.
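The two time-window rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the layout of the weather reports as (timestamp, report) pairs are hypothetical, and picking the report closest to the start time is one possible reading of "within an hour", not necessarily what the demo does.

```python
from datetime import datetime, timedelta

ACTIVE_WINDOW = timedelta(hours=6)   # from the text: a report in the last six hours
WEATHER_WINDOW = timedelta(hours=1)  # from the text: weather within an hour of the start


def is_active(last_report_time, now):
    """Treat an incident as active if its latest disruption report is recent enough."""
    return now - last_report_time <= ACTIVE_WINDOW


def nearest_weather(start_time, weather_reports):
    """Return the report closest to start_time, or None if none is within an hour.

    weather_reports is a list of (timestamp, report) pairs (hypothetical layout).
    """
    candidates = [
        (abs(ts - start_time), report)
        for ts, report in weather_reports
        if abs(ts - start_time) <= WEATHER_WINDOW
    ]
    if not candidates:
        return None
    # Assumption: when several reports qualify, prefer the one nearest in time.
    return min(candidates, key=lambda pair: pair[0])[1]
```

Incidents that fail the `is_active` check would simply be left out of the prediction step, and an incident with no matching weather report would need a missing-value policy that the text does not specify.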

Some features that might be useful in a predictive model are


For any predictive analytics problem there is a wide range of prospective models available. When choosing between them, it is necessary to consider many factors, including the accuracy of predictions, speed of execution, and how easy the result is to understand.

For this problem of predicting incident durations we have chosen three different models to compare. These models are relatively simple and reflect an initial attempt to capture the behaviour of the system. A full project on this type of problem would involve numerous iterations on models like these, leading to a more complex but hopefully more accurate prediction model.

The three models are:

In all of these models we apply a final step that rules out any prediction which conflicts with another feature, the currently elapsed duration of the incident. For example, an incident that is already 3 hours old is given zero probability of lasting only 2 hours. Note that when scoring the models as below, we assume that each incident has just started and do not apply this final step.
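A minimal sketch of this final step, assuming the model output can be expressed as a mapping from candidate durations (in hours) to probabilities. The function name is hypothetical, and the renormalisation of the remaining probabilities is an assumption added for illustration; the text only states that conflicting predictions are set to zero.

```python
def rule_out_elapsed(duration_probs, elapsed_hours):
    """Zero the probability of any duration shorter than the elapsed time.

    duration_probs maps a candidate duration in hours to its probability
    (hypothetical representation of a model's output).
    """
    # Probability mass that is still consistent with the elapsed time.
    total = sum(p for d, p in duration_probs.items() if d >= elapsed_hours)
    if total == 0:
        # No candidate survives; the caller would need its own fallback.
        return {d: 0.0 for d in duration_probs}
    # Assumption: rescale the surviving probabilities so they sum to one.
    return {
        d: (p / total if d >= elapsed_hours else 0.0)
        for d, p in duration_probs.items()
    }
```

Calling `rule_out_elapsed({1: 0.2, 2: 0.3, 4: 0.5}, 3)` leaves only the 4-hour outcome with non-zero probability, which mirrors the 3-hour-old incident described above.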

Scoring Models

In building this demo we have considered the three very different models above, and evaluated these based on how many times they correctly predict the duration of a set of test incidents within certain bounds. In technical terms, we have used 10-fold cross-validation in scoring these models, and varied the acceptable bound on a prediction from 0 to 5 hours in increments of 0.1 hours.
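The scoring scheme can be sketched as follows. This is an illustration rather than the demo's actual code: the function names are hypothetical, counting a prediction that lies exactly on the bound as correct is an assumption, and the interleaved fold assignment is just one simple way to form 10 folds.

```python
def accuracy_within(predicted, actual, bound_hours):
    """Fraction of predictions within bound_hours of the true duration."""
    # Assumption: a prediction exactly on the bound counts as correct.
    hits = sum(1 for p, a in zip(predicted, actual) if abs(p - a) <= bound_hours)
    return hits / len(actual)


def accuracy_curve(predicted, actual, max_bound=5.0, step=0.1):
    """Score at every bound from 0 to max_bound in increments of step."""
    n_steps = int(round(max_bound / step))
    # Build bounds from integers to avoid accumulating floating-point error.
    bounds = [round(i * step, 10) for i in range(n_steps + 1)]
    return [(b, accuracy_within(predicted, actual, b)) for b in bounds]


def k_fold_indices(n_items, k=10):
    """Yield (train_indices, test_indices) for each of k interleaved folds."""
    folds = [list(range(i, n_items, k)) for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx
```

For each fold, a model would be fitted on the training indices and its predictions for the test indices passed to `accuracy_curve`; averaging the curves over the folds gives one accuracy-versus-bound line per model.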

The results of this scoring can be seen in the chart below. None of the models displays exceptional accuracy, but there are clear differences in the performance. The worst performing model is the simple linear model over a small number of features. Next come the two variants of Random Forests with little performance difference between them.

The best performing model is in some ways the simplest. The MAP model is consistently better than the other two models and delivers relatively good performance even with small error bounds of around 30 minutes (0.5 hours).

Created by Ian Huston

Thank you to the whole Data Science team for their help in producing this demo, especially Noelle and Vatsan.