3.7 Exercises

  1. For the following series, find an appropriate Box-Cox transformation in order to stabilize the variance. (One way to get started is sketched after the list.)

    • usnetelec
    • usgdp
    • mcopper
    • enplanements
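
    One way to get started is sketched below, assuming the fpp2 package (which provides these data sets) is attached. BoxCox.lambda() suggests a value of the transformation parameter automatically; plotting the transformed series lets you judge whether the variance now looks stable.

    library(fpp2)
    lambda <- BoxCox.lambda(usnetelec)    # automatic choice of lambda
    autoplot(usnetelec)                   # original series
    autoplot(BoxCox(usnetelec, lambda))   # transformed series
    # Repeat for usgdp, mcopper and enplanements
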
  2. Why is a Box-Cox transformation unhelpful for the cangas data?
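
    A sketch of one way to investigate this, again assuming fpp2 is attached: compare cangas with its transformed version using the automatically chosen lambda, and ask whether the transformation has actually stabilized the variance.

    autoplot(cangas)
    lambda <- BoxCox.lambda(cangas)
    autoplot(BoxCox(cangas, lambda))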

  3. What Box-Cox transformation would you select for your retail data (from Exercise 3 in Section 2.10)?
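
    Assuming your series is stored as myts (as constructed in that exercise) and fpp2 is attached, the same approach as in Question 1 applies.

    lambda <- BoxCox.lambda(myts)
    autoplot(BoxCox(myts, lambda))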

  4. Calculate the residuals from a seasonal naïve forecast applied to the quarterly Australian beer production data from 1992. The following code will help.

    beer <- window(ausbeer, start=1992)   # quarterly beer production from 1992 Q1 onwards
    fc <- snaive(beer)                    # seasonal naive forecasts
    autoplot(fc)
    res <- residuals(fc)                  # residuals from the fitted method
    autoplot(res)

    Test if the residuals are white noise and normally distributed.

    checkresiduals(fc)

    What do you conclude?

  5. Repeat the previous exercise for the WWWusage and bricksq data. Use whichever of naive() or snaive() is more appropriate in each case.
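
    A sketch of the workflow for one of the series, assuming fpp2 is attached; swap naive() for snaive() (or vice versa) depending on whether you judge the series to be seasonal.

    fc <- naive(WWWusage)     # or snaive(), for a seasonal series
    autoplot(fc)
    res <- residuals(fc)
    autoplot(res)
    checkresiduals(fc)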

  6. Are the following statements true or false? Explain your answer.

    1. Good forecast methods should have normally distributed residuals.
    2. A model with small residuals will give good forecasts.
    3. The best measure of forecast accuracy is MAPE.
    4. If your model doesn’t forecast well, you should make it more complicated.
    5. Always choose the model with the best forecast accuracy as measured on the test set.
  7. For your retail time series (from Exercise 3 in Section 2.10):

    1. Split the data into two parts using

      x1 <- window(myts, end=c(2010,12))
      x2 <- window(myts, start=2011)
    2. Check that your data have been split appropriately by producing the following plot.

      autoplot(cbind(x1,x2))
    3. Calculate forecasts using snaive applied to x1.

    4. Compare the accuracy of your forecasts against the actual values stored in x2.

      f1 <- snaive(x1)
      accuracy(f1,x2)
    5. Check the residuals.

      checkresiduals(f1)

      Do the residuals appear to be uncorrelated and normally distributed?

    6. How sensitive are the accuracy measures to the training/test split?
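
      One way to explore this is to repeat the split with a different end point and recompute the accuracy measures; the date and object names below are only examples.

      x1_alt <- window(myts, end=c(2008,12))
      x2_alt <- window(myts, start=2009)
      f1_alt <- snaive(x1_alt, h=length(x2_alt))   # forecast over the whole test period
      accuracy(f1_alt, x2_alt)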

  8. vn contains quarterly visitor nights (in millions) from 1998 to 2015 for eight regions of Australia.

    1. Use window() to create three training sets for vn[,"Melbourne"], omitting the last 1, 2 and 3 years; call these train1, train2, and train3, respectively. For example, train1 <- window(vn[, "Melbourne"], end = c(2014, 4)).

    2. Compute one year of forecasts for each training set using the snaive() method. Call these fc1, fc2 and fc3, respectively.

    3. Use accuracy() to compare the MAPE over the three test sets. Comment on these.
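
    A sketch of parts 1–3 for the first training set, assuming the vn object described above is available and fpp2 is attached; train2 and train3 follow the same pattern with earlier end dates.

    train1 <- window(vn[, "Melbourne"], end = c(2014, 4))
    fc1 <- snaive(train1, h = 4)    # one year of quarterly forecasts
    accuracy(fc1, window(vn[, "Melbourne"], start = c(2015, 1)))   # check the MAPE column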

  9. Use the Dow Jones index (data set dowjones) to do the following:

    1. Produce a time plot of the series.
    2. Produce forecasts using the drift method and plot them.
    3. Show that the forecasts are identical to extending the line drawn between the first and last observations. (One way to check this graphically is sketched after this list.)
    4. Try using some of the other benchmark functions to forecast the same data set. Which do you think is best? Why?
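
    For parts 2 and 3, one possible approach is sketched below, assuming fpp2 and its companion data packages are loaded; the dashed line joins the first and last observations, so it can be compared directly with the drift forecasts.

    fc <- rwf(dowjones, drift = TRUE, h = 20)    # horizon is just an example
    # slope of the straight line through the first and last observations
    slope <- (dowjones[length(dowjones)] - dowjones[1]) / (length(dowjones) - 1)
    autoplot(fc) +
      geom_abline(intercept = dowjones[1] - slope * time(dowjones)[1],
                  slope = slope, linetype = "dashed")
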
  10. Consider the daily closing IBM stock prices (data set ibmclose).

    1. Produce some plots of the data in order to become familiar with it.
    2. Split the data into a training set of 300 observations and a test set of 69 observations.
    3. Fit various benchmark methods to the training set and compare the accuracy of their forecasts on the test set. Which method did best? (A sketch of one possible split and comparison follows the list.)
    4. Check the residuals of your preferred method. Do they resemble white noise?
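
    For parts 2 and 3, a sketch of one possible approach, assuming fpp2 and its companion data packages are loaded; ibmclose is indexed by observation number, so the split can be made directly with window().

    train <- window(ibmclose, end = 300)
    test <- window(ibmclose, start = 301)
    h <- length(test)
    accuracy(meanf(train, h = h), test)                # mean method
    accuracy(naive(train, h = h), test)                # naive method
    accuracy(rwf(train, drift = TRUE, h = h), test)    # drift method
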
  11. Consider the sales of new one-family houses in the USA, Jan 1973 – Nov 1995 (data set hsales).

    1. Produce some plots of the data in order to become familiar with it.
    2. Split the hsales data set into a training set and a test set, where the test set is the last two years of data.
    3. Fit various benchmark methods to the training set and compare the accuracy of their forecasts on the test set. Which method did best? (A sketch of one possible split and comparison follows the list.)
    4. Check the residuals of your preferred method. Do they resemble white noise?
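
    For parts 2 and 3, a sketch assuming fpp2 and its companion data packages are loaded; since hsales ends in November 1995, the last two years of data run from December 1993 to November 1995.

    train <- window(hsales, end = c(1993, 11))
    test <- window(hsales, start = c(1993, 12))
    h <- length(test)
    accuracy(snaive(train, h = h), test)
    accuracy(rwf(train, drift = TRUE, h = h), test)
    accuracy(meanf(train, h = h), test)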