Predict faulty facilities at German train stations with Amazon SageMaker Autopilot

Olalekan Elesin
4 min read · Oct 23, 2021

Predictive Maintenance System diagram, depicting the flow of data from open data sources, Deutsche Bahn and MetaWeather, to an email notification service.

Faulty elevators and escalators at train stations can be annoying, especially when you are exhausted from the day’s work and lack the strength to climb the stairs. I assume other commuters at train stations in Germany share this experience. It can be even more frustrating for expectant mothers and for families with a Kinderwagen (stroller). This pain point motivated me to do some digging and formulate the following hypothesis:

Would it be possible to predict when escalators and elevators at train stations are likely to break down, so that technicians can be assigned for maintenance checks and commuter satisfaction can improve?

With this hypothesis in mind, I searched around and found the Deutsche Bahn Developer Portal, which provides access to data from Deutsche Bahn via a collection of APIs. To my surprise, the developer portal is quite mature; I had wrongly assumed that DB was a legacy company.

Deutsche Bahn Developer Portal Available APIs

The Data

For my use case, I used three data sources:

  1. FaSta-Station_Facilities_Status - v2: A RESTful web service to retrieve data about the operational state of public elevators and escalators in German railway stations.
  2. StaDa-Station_Data - v2: An API providing master data for German railway stations by DB Station&Service AG.
  3. Weather Data from MetaWeather.com

With the data sources in place, I needed to figure out the technical approach to solve the problem.

Technical Implementation

The steps below provide a high-level overview of the technical implementation of the predictive maintenance service:

  1. An AWS Glue Job runs every hour to pull data from the FaSta-Station_Facilities_Status API.
  2. Once the facilities data import is successful, there is another function in the Glue Job that calls the MetaWeather API to get the weather condition of each train station.
  3. The weather information is then joined with the facilities data and the master data for all German railway stations by DB Station&Service AG, and the result is written to an Amazon S3 bucket.
  4. For downstream analysis, an AWS Glue crawler crawls the data location and updates the AWS Glue Data Catalog, which is then queried via Amazon Athena. Dashboards are also created in Amazon QuickSight. (Imagine the potential of Amazon QuickSight Q.)
  5. For predictive maintenance, we use Amazon SageMaker Autopilot to train up to 250 models and automatically select the best one based on the Mean Squared Error (MSE). We defined the problem as a regression task, i.e. predicting the number of hours before a piece of equipment becomes inactive.
  6. Once we have a best candidate model from Amazon SageMaker Autopilot, we run a SageMaker Batch Transform job on new data. The predictions are then saved to S3.
  7. The final goal is to notify train station operators, whose contact information is available in the station master data, via email with Amazon Simple Email Service.
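The regression target from step 5 has to be derived from the hourly status snapshots collected in step 1. Below is a minimal sketch of one way to do that, not the exact code from the Glue job. It assumes each snapshot is a (timestamp, state) pair and that the states are the strings "ACTIVE" and "INACTIVE"; the function name `hours_until_inactive` is made up for this illustration.

```python
from datetime import datetime, timedelta
from typing import Optional


def hours_until_inactive(
    snapshots: list[tuple[datetime, str]],
) -> list[Optional[float]]:
    """Label each snapshot with the hours until the next INACTIVE state.

    `snapshots` is a time-ordered history for one piece of equipment.
    Snapshots with no later failure get None, since no label is known yet.
    """
    labels: list[Optional[float]] = [None] * len(snapshots)
    next_failure: Optional[datetime] = None
    # Walk backwards so the nearest upcoming failure is always known.
    for i in range(len(snapshots) - 1, -1, -1):
        timestamp, state = snapshots[i]
        if state == "INACTIVE":  # assumed state name
            next_failure = timestamp
        if next_failure is not None:
            labels[i] = (next_failure - timestamp).total_seconds() / 3600
    return labels


# Made-up hourly history for a single elevator.
start = datetime(2021, 10, 1, 8)
states = ["ACTIVE", "ACTIVE", "ACTIVE", "INACTIVE", "ACTIVE"]
history = [(start + timedelta(hours=h), s) for h, s in enumerate(states)]
print(hours_until_inactive(history))  # [3.0, 2.0, 1.0, 0.0, None]
```

Rows labelled None would be left out of the training set, because the time of the next failure has not been observed for them.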

All the AWS services used in this architecture are serverless with automatic scaling. These AWS services also have pay-as-you-go pricing, so the cost scales predictably with the amount of data ingested, processed, and predicted.
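To make the last step of the pipeline more concrete, here is a rough sketch of how the batch predictions could be turned into email requests for Amazon SES. The request shape follows the boto3 `send_email` call. Everything else is an assumption for illustration: the keys in each prediction row, the 24-hour threshold, the addresses, and the helper name `build_alert_emails`.

```python
def build_alert_emails(predictions, sender, threshold_hours=24.0):
    """Build one SES send_email request per facility predicted to fail soon.

    `predictions` is a list of dicts with hypothetical keys:
    station_name, equipment_number, operator_email, predicted_hours.
    """
    requests = []
    for row in predictions:
        if row["predicted_hours"] > threshold_hours:
            continue  # not urgent enough to notify anyone yet
        subject = f"Maintenance check suggested at {row['station_name']}"
        body = (
            f"Equipment {row['equipment_number']} at {row['station_name']} "
            f"is predicted to become inactive in about "
            f"{row['predicted_hours']:.0f} hours."
        )
        requests.append({
            "Source": sender,
            "Destination": {"ToAddresses": [row["operator_email"]]},
            "Message": {
                "Subject": {"Data": subject},
                "Body": {"Text": {"Data": body}},
            },
        })
    return requests


# Made-up batch transform output, for illustration only.
sample_predictions = [
    {"station_name": "Example Hbf", "equipment_number": 1001,
     "operator_email": "operator@example.com", "predicted_hours": 6.4},
    {"station_name": "Example Hbf", "equipment_number": 1002,
     "operator_email": "operator@example.com", "predicted_hours": 310.0},
]
alerts = build_alert_emails(sample_predictions, sender="alerts@example.com")
print(len(alerts))  # only the first facility falls under the threshold

# Sending needs AWS credentials and a verified SES sender identity:
#   import boto3
#   ses = boto3.client("ses")
#   for request in alerts:
#       ses.send_email(**request)
```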

Conclusions

This simple architecture demonstrates how easy it is to build production-ready machine learning systems on AWS. Using Amazon SageMaker Autopilot, I did not have to worry about feature extraction or feature engineering and could focus on the problem I wanted to solve with machine learning.
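To give a sense of how little configuration is involved, here is a sketch of an Autopilot request matching the setup described earlier: a regression problem, MSE as the objective, and at most 250 candidates. The parameter names follow the boto3 `create_auto_ml_job` call. The job name, S3 locations, target column, and role ARN are placeholders, not the values from my project.

```python
def build_autopilot_request(job_name, train_s3_uri, output_s3_uri,
                            target_column, role_arn, max_candidates=250):
    """Assemble keyword arguments for SageMaker's create_auto_ml_job."""
    return {
        "AutoMLJobName": job_name,
        "InputDataConfig": [{
            "DataSource": {
                "S3DataSource": {
                    "S3DataType": "S3Prefix",
                    "S3Uri": train_s3_uri,
                }
            },
            # Column Autopilot should learn to predict.
            "TargetAttributeName": target_column,
        }],
        "OutputDataConfig": {"S3OutputPath": output_s3_uri},
        "ProblemType": "Regression",
        "AutoMLJobObjective": {"MetricName": "MSE"},
        "AutoMLJobConfig": {
            "CompletionCriteria": {"MaxCandidates": max_candidates}
        },
        "RoleArn": role_arn,
    }


# All values below are placeholders for illustration.
request = build_autopilot_request(
    job_name="facility-failure-regression",
    train_s3_uri="s3://example-bucket/training/",
    output_s3_uri="s3://example-bucket/autopilot-output/",
    target_column="hours_until_inactive",
    role_arn="arn:aws:iam::123456789012:role/ExampleAutopilotRole",
)

# Starting the job needs AWS credentials and a real role and bucket:
#   import boto3
#   boto3.client("sagemaker").create_auto_ml_job(**request)
```

Feature preprocessing, algorithm selection, and hyperparameter tuning do not appear in the request at all; Autopilot handles them while exploring candidates.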

With Amazon SageMaker Autopilot and Amazon SageMaker Studio, I can understand how the machine learning model arrives at its predictions by looking at the importance of its features in terms of SHAP values. This also allows me to present the decisions made by the ML model to a non-ML audience.
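For intuition about what a SHAP value measures, the toy example below computes exact Shapley values by averaging each feature's marginal contribution over every ordering of the features. This is only a sketch for building intuition: the model, the feature names, and the baseline are invented, and on real models these values are usually approximated rather than enumerated like this.

```python
from itertools import permutations


def shapley_values(predict, instance, baseline):
    """Exact Shapley values for one prediction, relative to a baseline row."""
    features = list(instance)
    totals = {name: 0.0 for name in features}
    orderings = list(permutations(features))
    for ordering in orderings:
        current = dict(baseline)
        previous = predict(current)
        # Switch features from baseline to actual values one at a time
        # and credit each feature with the change it causes.
        for name in ordering:
            current[name] = instance[name]
            value = predict(current)
            totals[name] += value - previous
            previous = value
    return {name: total / len(orderings) for name, total in totals.items()}


def toy_model(row):
    """Invented linear model: predicted hours until a facility fails."""
    return 200.0 - 5.0 * row["age_years"] + 2.0 * row["temperature_c"]


instance = {"age_years": 10.0, "temperature_c": 5.0}
baseline = {"age_years": 4.0, "temperature_c": 15.0}
print(shapley_values(toy_model, instance, baseline))
# {'age_years': -30.0, 'temperature_c': -20.0}
```

The two contributions add up to the difference between the prediction for this row (160 hours) and the prediction for the baseline (210 hours), which is what makes these values convenient for explaining a single prediction to a non-ML audience.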

Sample Explainability Report from Amazon SageMaker Studio

Can’t wait to hear what you’ll build with Amazon SageMaker Autopilot. You can reach me via email, follow me on Twitter or connect with me on LinkedIn.

Written by Olalekan Elesin

Enterprise technologist with experience across technical leadership, architecture, cloud, machine learning, big-data and other cool stuff.
