Tutorial: Smoothing time-series data using moving averages

SQUEAC requires you to analyse a variety of time-series, such as admissions and exits over time. This type of data usually requires smoothing before being plotted using line charts.

You can smooth data using the charting functions in spreadsheet applications. If you do this, make sure that you use a smoothing function that is suited to time-series data. These will usually be called something like ‘moving average’ or ‘running average’.

Do not worry if your spreadsheet package does not provide moving average functions. It is easy to program these functions yourself. Figure 1 shows how to program a spreadsheet for three different types of moving average and the effect that these have on a time-series.

Figure 1: Programming a spreadsheet for three different moving averages with a span of three successive data points

Running medians-of-three (M3)

Running averages-of-three (A3)

Running medians-of-three followed by running averages-of-three (M3A3)

Moving averages can be applied several times. This involves applying a smoothing method to
previously smoothed data, as is done with the M3A3 smoother shown in Figure 1. With the
M3A3 smoother, the data are smoothed by taking the medians of sets of three successive data points
(M3). The results are then smoothed by taking the arithmetic means of sets of three successive
smoothed data points (A3). The more times you apply a moving average, the more smoothing is
applied to the data.

A time-series can be thought of as a combination of random (irregular or ‘noise’), seasonal, and trend components. Judicious application of smoothing techniques, such as moving averages, hides some of these components and helps uncover other components of the time-series:

  • Smoothing using moving averages of short spans (i.e., of just a few successive data points) will tend to hide the random ‘noise’ component and help reveal the seasonal and trend components of the time-series.
  • Smoothing using moving averages of longer spans (i.e., of enough data points to cover an entire seasonal cycle) will tend to hide both the random ‘noise’ and seasonal components and help reveal the trend component of the time-series.