Guest Column | February 7, 2020

Amazon Forecast: How To Get Started

By Rob Whelan, 2nd Watch

Strategies for Efficient Clinical Supply Management and Forecasting

How many times have you been asked to predict revenue for next month or next quarter? Do you mostly rely on your gut? Have you ever been asked to support your numbers? Cue sweaty palms frantically churning out spreadsheets.

Maybe you’ve suffered from the supply chain “bullwhip” effect: you order too much inventory, which makes your suppliers hustle, only to deliver a glut of product that you won’t need to replace for a long time, which makes your suppliers sit idle.

Wouldn’t it be nice to plan for your supply chain as tightly as Amazon.com does? With Amazon Forecast, you can do exactly that. In part one of this two-part article, I’ll provide an overview of the Amazon Forecast service and how to get started. Part two of the article will focus on best practices for using Amazon Forecast.

The Backstory

Amazon knows a thing or two about inventory planning, given its intense focus on operations. Over the years, it has used multiple algorithms for accurate forecasting. It even fine-tuned them to run in an optimized way on its cloud compute instances. Forecasting demand is important, if nothing else to get a “confidence interval” - a range where it’s fairly certain reality will fall, say, 80 percent of the time.

In true Amazon Web Services fashion, Amazon decided to provide its forecasting service for sale in Amazon Forecast, a managed service that takes your time series data in CSV format and spits out a forecast into the future. It gives you a customizable confidence interval that you can set to 95 percent, 90 percent, 80 percent, or whatever percentage you need. And, you can re-use and re-train the model with actuals as they come in.

When you use the service, you can tell it to run up to five different state-of-the-art algorithms and pick a winner. This saves you the time of deliberating over which algorithm to use.

The best part is that you can make the forecast more robust by adding in “related” time series - any data that you think is correlated to your forecast. For example, you might be predicting electricity demand based on macro scales such as season, but also on a micro level such as whether or not it rained that day.

 How To Use

Amazon Forecast is considered a serverless service. You don’t have to manage any compute instances to use it. Since it is serverless, you can create multiple scenarios simultaneously - up to three at once. There is no reason to do this in series; you can come up with three scenarios and fire them off all at once. Additionally, it is low-cost, so it is worth trying and experimenting with often. As is generally the case with AWS, you end up paying mostly for the underlying compute and storage, rather than any major premium for using the service. Like any other machine learning task, you have a huge advantage if you have invested in keeping your data orderly and accessible.

Here is a general workflow for using Amazon Forecast:

  1. Create a Dataset Group. This is just a logical container for all the datasets you’re going to use to create your predictor.
  2. Import your source datasets. A nice thing here is that Amazon Forecast facilitates the use of different “versions” of your datasets. As you go about feature engineering, you are bound to create different models that will be based on different underlying datasets. This is absolutely crucial for the process of experimentation and iteration.
  3. Create a predictor. This is another way of saying “create a trained model on your source data.”
  4. Create a forecast using the predictor. This is where you actually generate a forecast looking into the future.

To get started, stage your time series data in a CSV file in S3. You have to follow AWS’s naming convention for the column names. You also can optionally use your domain knowledge to enrich the data with “related time series.” Meaning, if you think external factors drive the forecast, you should add those data series, too. You can add multiple complementary time series.

When your datasets are staged, you create a Predictor. A Predictor is just a trained machine learning model. If you choose the “AutoML” option, Amazon will make up to five algorithms compete. It will save the results of all of the models that trained successfully (sometimes an algorithm clashes with the underlying data).

Finally, when your Predictor is done, you can generate a forecast, which will be stored on S3, which can be easily shared with your organization or with any Business Intelligence tool. It’s always a good idea to visualize the results to give them a reality check.

In part two of this article, we’ll dig into best practices for using Amazon Forecast.

About The Author

Rob Whelan is Practice Director, Data & Analytics, 2nd Watch.