3.7 Exercises
For the following series, find an appropriate Box-Cox transformation in order to stabilize the variance.
- `usnetelec`
- `usgdp`
- `mcopper`
- `enplanements`
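As a starting point for choosing a transformation, here is a minimal sketch. It assumes the forecast package is installed and uses its `BoxCox.lambda()` and `BoxCox()` functions. `AirPassengers` (shipped with base R) is only a stand-in series for illustration and is not one of the exercise data sets. The automatically selected value is a guide; a plot of the transformed series should still be inspected.

```r
library(forecast)

# Stand-in monthly series (illustration only, not an exercise data set)
y <- AirPassengers

# Automatic choice of the Box-Cox parameter (Guerrero's method by default)
lambda <- BoxCox.lambda(y)

# Apply the transformation with the selected parameter
y_transformed <- BoxCox(y, lambda)

print(lambda)
```

Plotting `y` and `y_transformed` side by side helps to judge whether the seasonal variation is roughly constant across the transformed series.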
Why is a Box-Cox transformation unhelpful for the `cangas` data?
What Box-Cox transformation would you select for your retail data (from Exercise 3 in Section 2.10)?
Calculate the residuals from a seasonal naïve forecast applied to the quarterly Australian beer production data from 1992. The following code will help.
beer <- window(ausbeer, start=1992)
fc <- snaive(beer)
autoplot(fc)
res <- residuals(fc)
autoplot(res)
Test if the residuals are white noise and normally distributed.
checkresiduals(fc)
What do you conclude?
Repeat the exercise for the `WWWusage` and `bricksq` data. Use whichever of `naive` or `snaive` is more appropriate in each case.
Are the following statements true or false? Explain your answer.
- Good forecast methods should have normally distributed residuals.
- A model with small residuals will give good forecasts.
- The best measure of forecast accuracy is MAPE.
- If your model doesn’t forecast well, you should make it more complicated.
- Always choose the model with the best forecast accuracy as measured on the test set.
For your retail time series (from Exercise 3 in Section 2.10):
Split the data into two parts using
x1 <- window(myts, end=c(2010,12))
x2 <- window(myts, start=2011)
Check that your data have been split appropriately by producing the following plot.
autoplot(cbind(x1,x2))
Calculate forecasts using `snaive` applied to `x1`.
Compare the accuracy of your forecasts against the actual values stored in `x2`.
f1 <- snaive(x1)
accuracy(f1,x2)
Check the residuals.
checkresiduals(f1)
Do the residuals appear to be uncorrelated and normally distributed?
How sensitive are the accuracy measures to the training/test split?
`vn` contains quarterly visitor nights (in millions) from 1998 to 2015 for eight regions of Australia.
Use `window()` to create three training sets for `vn[,"Melbourne"]`, omitting the last 1, 2 and 3 years; call these `train1`, `train2`, and `train3`, respectively. For example, `train1 <- window(vn[, "Melbourne"], end = c(2014, 4))`.
Compute one year of forecasts for each training set using the `snaive()` method. Call these `fc1`, `fc2` and `fc3`, respectively.
Use `accuracy()` to compare the MAPE over the three test sets. Comment on these.
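The training sets above depend on how `window()` interprets a `c(year, quarter)` pair. The following minimal sketch uses a synthetic quarterly series (made-up values, not the `vn` data) to show how a training set and the matching one-year test set can be cut out in base R.

```r
# Synthetic quarterly series, 2011 Q1 to 2015 Q4 (values are placeholders)
y <- ts(1:20, start = c(2011, 1), frequency = 4)

# Training set: everything up to and including 2014 Q4
train <- window(y, end = c(2014, 4))

# Test set: the final year, 2015 Q1 to 2015 Q4
test <- window(y, start = c(2015, 1))

print(end(train))
print(start(test))
```

Moving the `end` argument back by one year at a time gives progressively shorter training sets, each followed by a different one-year test period.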
Use the Dow Jones index (data set `dowjones`) to do the following:
- Produce a time plot of the series.
- Produce forecasts using the drift method and plot them.
- Show that the forecasts are identical to extending the line drawn between the first and last observations.
- Try using some of the other benchmark functions to forecast the same data set. Which do you think is best? Why?
Consider the daily closing IBM stock prices (data set `ibmclose`).
- Produce some plots of the data in order to become familiar with it.
- Split the data into a training set of 300 observations and a test set of 69 observations.
- Try using various benchmark methods to forecast the training set and compare the results on the test set. Which method did best?
- Check the residuals of your preferred method. Do they resemble white noise?
Consider the sales of new one-family houses in the USA, Jan 1973 – Nov 1995 (data set `hsales`).
a. Produce some plots of the data in order to become familiar with it.
b. Split the `hsales` data set into a training set and a test set, where the test set is the last two years of data.
c. Try using various benchmark methods to forecast the training set and compare the results on the test set. Which method did best?
d. Check the residuals of your preferred method. Do they resemble white noise?