Predicting U.K. Commodities Index

This assignment examines the Financial Times Commodity Index (FTCI) using data from Coen et al. (1969, Table 1). The FTCI data are quarterly, covering every quarter from the third quarter of 1952 to the end of 1967. The goal of this assignment is to come up with a method to predict the FTCI’s value in coming quarters.

A time series plot of FTCI, The Economist magazine is shown in Figure 1. The plot includes the 62 time points from the data.

The first model examined is the random walk hypothesis (Bachelier, 1900), which suggests that the optimal forecast is simply the last observed value: \[z_t = z_{t-1} + e_t, \quad (*)\]

In this model \(z_t\) is the FTCI at time \(t\) and \(e_t\) is the random error, which is assumed to be independently distributed with mean 0 and constant variance.

According to the random walk model the first differences should be white noise. This is supported by the autocorrelation plot in Figure 2.
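The white-noise check on the first differences can be sketched as follows. This is a minimal illustration on a simulated random walk, not the FTCI data; the function names, the seed, and the noise scale are assumptions made for the example. It compares each sample autocorrelation with the approximate 95% limits \(\pm 1.96/\sqrt{n}\) that would be drawn on an autocorrelation plot.

```python
import math
import random

def first_differences(z):
    """Return the first differences z[t] - z[t-1]."""
    return [b - a for a, b in zip(z, z[1:])]

def sample_acf(x, max_lag):
    """Sample autocorrelations r_1, ..., r_max_lag of the series x."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
            for k in range(1, max_lag + 1)]

# Simulated random walk with 62 points (an assumption; NOT the FTCI series).
random.seed(1)
z = [100.0]
for _ in range(61):
    z.append(z[-1] + random.gauss(0.0, 2.0))

d = first_differences(z)
acf = sample_acf(d, max_lag=8)
bound = 1.96 / math.sqrt(len(d))  # approximate 95% limits under white noise

for lag, r in enumerate(acf, start=1):
    status = "outside" if abs(r) > bound else "inside"
    print(f"lag {lag}: r = {r:+.3f} ({status} +/-{bound:.3f})")
```

If the differences behave like white noise, most of the autocorrelations should fall inside the limits; an occasional value outside is expected by chance.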

A prediction exercise is done using the FTCI value in the previous quarter to predict the FTCI in the next quarter.

The optimal forecasts for the random walk model are shown in Table 1 below.

Table 1. Random walk forecasts for 1967 (top), and a summary of RMSE for the entire data set (bottom)
Quarter	observed	forecast	RMSE
1967/1	93.74	96.21	2.5
1967/2	91.37	93.74	2.4
1967/3	86.31	91.37	5.1
1967/4	84.98	86.31	1.3

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    0.02    0.61    1.51    1.72    2.25    5.64
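The forecasts in Table 1 can be reproduced with a few lines of code. The sketch below uses only the four 1967 values from the table; it assumes that the first forecast (96.21) is the 1966/4 observation carried forward, which is what the random walk model implies. The RMSE over the four quarters is an extra summary computed here for illustration and does not appear in the table.

```python
import math

# Values copied from Table 1. The first forecast is assumed to be the
# 1966/4 observation carried forward under the random walk model.
quarters = ["1967/1", "1967/2", "1967/3", "1967/4"]
observed = [93.74, 91.37, 86.31, 84.98]
last_before_1967 = 96.21

# Random walk forecast: each quarter is predicted by the previous observation.
forecast = [last_before_1967] + observed[:-1]
abs_error = [abs(o - f) for o, f in zip(observed, forecast)]

# RMSE over the four 1967 quarters only (illustrative, not from Table 1).
rmse_1967 = math.sqrt(sum(e * e for e in abs_error) / len(abs_error))

for q, o, f, e in zip(quarters, observed, forecast, abs_error):
    print(f"{q}  observed={o:.2f}  forecast={f:.2f}  error={e:.2f}")
print(f"RMSE over 1967: {rmse_1967:.2f}")
```

For a single forecast the root mean squared error reduces to the absolute error, which is why the per-quarter values in the last column of Table 1 match `abs_error` after rounding.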

Next, it is of interest to see whether either U.K. car production (UKCAR) or the Financial Times Ordinary Share Index (FTI) is beneficial in improving the model. This is done using a regression with autocorrelated error rather than a lagged linear regression. A lagged linear regression with previous FTCI, FTI and UKCAR as inputs would have both lagged dependent and lagged independent variables. This is problematic because the assumption of independent errors is not satisfied.

The regression with autocorrelated error that is examined is

\[z_t = \beta_0 + \beta_1 x_{t-1}+\beta_2 y_{t-1} + n_t\],

where \(x_{t-1}\) is UKCAR in the previous quarter, \(y_{t-1}\) is the FTI in the previous quarter, and \(n_t\) is autocorrelated noise that satisfies the random walk equation

\[n_t = n_{t-1} + e_t\],

where \(e_t\) is strong white noise (i.e. IID with mean zero and constant variance). It is more convenient to rewrite this model as \(\nabla n_t = e_t\), where \(\nabla\) is the first difference operator. The regression model may now be written as

\[\nabla z_t = \beta_0+ \beta_1 \nabla x_{t-1}+ \beta_2 \nabla y_{t-1}+ e_t . \quad\quad (1)\]
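The differencing step can be written out explicitly. The lines below are a sketch of the algebra rather than something taken from the source: subtracting the level equation at time \(t-1\) from the level equation at time \(t\) cancels the constant and replaces \(n_t\) by \(\nabla n_t = e_t\).

\[
\begin{aligned}
z_t - z_{t-1} &= \beta_1 (x_{t-1} - x_{t-2}) + \beta_2 (y_{t-1} - y_{t-2}) + (n_t - n_{t-1}) \\
\nabla z_t &= \beta_1 \nabla x_{t-1} + \beta_2 \nabla y_{t-1} + e_t
\end{aligned}
\]

Because the level constant drops out, one reasonable reading (an interpretive assumption here) is that the \(\beta_0\) in (1) acts as a drift term, which corresponds to allowing a linear trend \(\beta_0 t\) in the level of \(z_t\).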

This model looks similar in some ways to the previous regression model with the lagged dependent variable, but it satisfies the Gauss-Markov assumptions, and if we assume the error term is normally distributed white noise, the usual least squares t and F tests apply. The fitted model is summarized in Table 2.


	Dependent variable:

	dftci

dukcarLag1	-0.853
	(0.966)

dukftiLag1	0.050^***
	(0.018)

Constant	-0.349
	(0.286)


Observations	57
R²	0.126
Adjusted R²	0.093
Residual Std. Error	2.118 (df = 54)
F Statistic	3.887^** (df = 2; 54)

Note:	*p<0.1; **p<0.05; ***p<0.01

Table 2. Regression with Autocorrelated Error
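A fit of this form can be sketched as an ordinary least squares regression of the differenced response on the lagged differenced predictors. The example below is an illustration under stated assumptions: the series are simulated stand-ins for UKCAR, FTI and FTCI (not the Coen et al. data), the "true" coefficients are hypothetical, and the helper names `diff`, `solve` and `ols` are invented for this sketch. It is not the code that produced Table 2.

```python
import random

def diff(series):
    """First differences of a series."""
    return [b - a for a, b in zip(series, series[1:])]

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def ols(rows, y):
    """Least squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

# Simulated levels (hypothetical): x ~ UKCAR, y ~ FTI, z ~ FTCI.
random.seed(7)
n = 62
b0, b1, b2 = -0.3, 0.5, 0.8  # hypothetical "true" coefficients
x, y, z = [0.0], [0.0], [50.0]
for _ in range(1, n):
    x.append(x[-1] + random.gauss(0.0, 1.0))
    y.append(y[-1] + random.gauss(0.0, 1.0))
for t in range(1, n):
    step = random.gauss(0.0, 0.5)
    if t >= 2:  # lagged differences exist only from t = 2 onwards
        step += b0 + b1 * (x[t - 1] - x[t - 2]) + b2 * (y[t - 1] - y[t - 2])
    z.append(z[-1] + step)

# Align the response (difference at t) with predictors (difference at t-1).
dx, dy, dz = diff(x), diff(y), diff(z)
rows = [[1.0, u, v] for u, v in zip(dx[:-1], dy[:-1])]
beta = ols(rows, dz[1:])
print("intercept, UKCAR lag, FTI lag:", [round(b, 3) for b in beta])
```

The key design point is the alignment: the response drops its first difference and the predictors drop their last, so each row pairs \(\nabla z_t\) with \(\nabla x_{t-1}\) and \(\nabla y_{t-1}\).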

From the summary statistics in Table 2, it appears that some of the variables are not significantly improving the model. A few additional models are therefore compared to this one.

\[\nabla z_t = \beta_0+ \beta_2 \nabla y_{t-1}+ e_t . \quad\quad (2)\]
\[\nabla z_t = \beta_2 \nabla y_{t-1}+ e_t . \quad\quad (3)\]
\[\nabla z_t = \beta_1 \nabla x_{t-1}+ \beta_2 \nabla y_{t-1}+ e_t . \quad\quad (4)\]
\[\nabla z_t = \beta_0+ \beta_1 \nabla x_{t-1}+ e_t . \quad\quad (5)\]
\[\nabla z_t = \beta_1\nabla x_{t-1}+ e_t . \quad\quad (6)\]

Table 3. Model selection
Model	Adjusted R²	AIC	BIC	PRESS
1	0.093	252	260	266
2	0.097	251	257	264
3	0.080	251	255	262
4	0.078	252	258	264
5	0.097	251	257	264
6	-0.018	256	261	286

While all of the results are quite similar, I would suggest that model 3 is the best fit for these data. It has the lowest BIC and PRESS statistic, and its AIC is tied for the lowest. Since the goal of this assignment is to predict future observations, a low PRESS statistic is important, as PRESS indicates the model’s predictive ability. Thus, it appears that UK car production is not necessary in a model of the Financial Times Commodity Index, but the Financial Times Ordinary Share Index can be beneficial.
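The selection criteria in Table 3 can be illustrated for a model of the same form as model 3 (one predictor, no intercept). The sketch below uses a small hypothetical data set, and the function names are invented for the example. It assumes the Gaussian log-likelihood form of AIC and BIC with the error variance counted as a parameter; other conventions shift the values by a constant without changing the ranking of models fitted to the same data. PRESS is computed from the leverages rather than by refitting the model once per observation.

```python
import math

def fit_through_origin(x, y):
    """Least squares fit of y = b * x + e (no intercept)."""
    sxx = sum(v * v for v in x)
    b = sum(u * v for u, v in zip(x, y)) / sxx
    resid = [v - b * u for u, v in zip(x, y)]
    leverage = [u * u / sxx for u in x]  # diagonal of the hat matrix
    return b, resid, leverage

def selection_stats(resid, leverage, n_coef):
    """Return (AIC, BIC, PRESS) for a fitted linear model."""
    n = len(resid)
    rss = sum(e * e for e in resid)
    # Gaussian log-likelihood evaluated at the MLE of the variance, rss / n.
    loglik = -0.5 * n * (math.log(2 * math.pi) + math.log(rss / n) + 1)
    k = n_coef + 1  # assumption: the error variance counts as a parameter
    aic = -2 * loglik + 2 * k
    bic = -2 * loglik + k * math.log(n)
    # PRESS: squared leave-one-out prediction errors, e_i / (1 - h_ii).
    press = sum((e / (1 - h)) ** 2 for e, h in zip(resid, leverage))
    return aic, bic, press

# Hypothetical differenced data (NOT the series used in the report).
d_fti_lag = [12.0, -8.0, 20.0, 5.0, -15.0, 9.0, -3.0, 14.0]
d_ftci = [0.9, -0.1, 1.2, -0.4, -1.1, 0.8, 0.3, 0.2]

slope, resid, leverage = fit_through_origin(d_fti_lag, d_ftci)
aic, bic, press = selection_stats(resid, leverage, n_coef=1)
print(f"slope={slope:.4f}  AIC={aic:.2f}  BIC={bic:.2f}  PRESS={press:.3f}")
```

Since PRESS is built from out-of-sample errors for each observation, it speaks more directly to forecasting than the in-sample fit measures do.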


	Dependent variable:

	dftci

dukftiLag1	0.040^**
	(0.016)


Observations	57
R²	0.096
Adjusted R²	0.080
Residual Std. Error	2.120 (df = 56)
F Statistic	5.950^** (df = 1; 56)

Note:	*p<0.1; **p<0.05; ***p<0.01

Table 4. Regression of a no-intercept model with FTI as a predictor and autocorrelated error

With time series regression, the most important diagnostic check is of the residuals. In practice, unless the time series is very short, perhaps fewer than 20 observations, it suffices to examine an autocorrelation plot of the residuals.

References

Box, G. E. P. and Newbold, P. (1971). Some Comments on a Paper of Coen, Gomme and Kendall. Journal of the Royal Statistical Society A, 134/2, 229-240. http://www.jstor.org/stable/2343873. doi:10.2307/2343873

Coen, P. J., Gomme, E. D. and Kendall, M. G. (1969). Lagged Relationships in Economic Forecasting. Journal of the Royal Statistical Society A, 132/2, 133-163. http://www.jstor.org/stable/2343782. doi:10.2307/2343782

O’Neil, C. and Schutt, R. (2014). Doing Data Science. O’Reilly. https://books.google.ca/books?id=puj_mAEACAAJ

Jessica Weiss

January 30, 2017