Time Series Analysis of Monthly Milk Production

A time series is a series of data points indexed (or listed or graphed) in time order. Most commonly, a time series is a sequence taken at successive equally spaced points in time. [Source: Wikipedia]

Time series analysis can be useful to identify and separate trends, cycles and seasonality of a dataset. In this blog I will illustrate how I analyse time series datasets in R and also show methods to forecast the data.

Let’s look at the Monthly Milk Production dataset from datamarket.com. The time series dataset measures pounds per cow as its unit per month from January 1962 to December 1975.

The dataset can be downloaded from this link. Once we download the CSV file and place it in the working directory, we can read the file using the following code.

> library(forecast)
> milk <- read.csv("monthly-milk-production-pounds-p.csv")

Now we can transform the input to a time series by giving the frequency and start month/year. We can also print out the time series as seen below.

> milk.ts <- ts(milk, frequency=12, start=c(1962,1))
> milk.ts
 Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
1962 561 640 656 727 697 640 599 568 577 553 582 600
1963 566 653 673 742 716 660 617 583 587 565 598 628
1964 618 688 705 770 736 678 639 604 611 594 634 658
1965 622 709 722 782 756 702 653 615 621 602 635 677
1966 635 736 755 811 798 735 697 661 667 645 688 713
1967 667 762 784 837 817 767 722 681 687 660 698 717
1968 696 775 796 858 826 783 740 701 706 677 711 734
1969 690 785 805 871 845 801 764 725 723 690 734 750
1970 707 807 824 886 859 819 783 740 747 711 751 804
1971 756 860 878 942 913 869 834 790 800 763 800 826
1972 799 890 900 961 935 894 855 809 810 766 805 821
1973 773 883 898 957 924 881 837 784 791 760 802 828
1974 778 889 902 969 947 908 867 815 812 773 813 834
1975 782 892 903 966 937 896 858 817 827 797 843

Let’s plot out the time series.

> plot.ts(milk.ts)

Screen Shot 2017-10-27 at 23.17.52

We can see that there is a seasonal variation in the time series within a year. Also it seems to be like an additive model as the seasonal fluctuations are roughly constant over time. If the seasonal fluctuations seem to increase in the level of time series then it can denote a multiplicative model (which is not our case here).

Let’s now decompose the series into its constituents which are a trend component, an irregular component and a seasonal component (if present). The irregular component is the remainder of the time series once decomposed.

> plot(decompose(milk.ts))

Screen Shot 2017-10-27 at 23.24.28.png

If we have a time series using an additive model with an increasing or decreasing trend and seasonality, we can use Holt-Winters exponential smoothing to make short-term forecasts. To fit a predictive model for the log of the monthly milk production we write the following code.

> logmilk.ts <- log(milk.ts)
> milk.ts.forecast <- HoltWinters(log(milk.ts))
> milk.ts.forecast

Holt-Winters exponential smoothing with trend and additive seasonal component.

Call:
HoltWinters(x = log(milk.ts))

Smoothing parameters:
 alpha: 0.587315
 beta : 0
 gamma: 1

Coefficients:
 [,1]
a 6.788338238
b 0.002087539
s1 -0.031338300
s2 -0.090489288
s3 0.043463485
s4 0.058024146
s5 0.127891508
s6 0.100852168
s7 0.057336548
s8 0.011032882
s9 -0.047972067
s10 -0.047584275
s11 -0.097105193
s12 -0.051371280

> plot(milk.ts.forecast)

As for simple exponential smoothing and Holt’s exponential smoothing, we can plot the original time series as a black line, with the forecasted values as a red line on top of that.

Screen Shot 2017-10-27 at 23.28.33.png

To make forecasts for future times not included in the original time series, we use the “forecast()” function in the “forecast” package.

> milk.ts.forecast2 <- forecast(milk.ts.forecast, h=48)
> plot(milk.ts.forecast2)

Screen Shot 2017-10-27 at 23.36.41

The forecasts are shown in the dark blue line and the grey areas are the confidence intervals.

References: http://a-little-book-of-r-for-time-series.readthedocs.io/en/latest/src/timeseries.html

https://www.statmethods.net/advstats/timeseries.html

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s