Purpose

We show how to use grpreg::grpreg() to fit a penalized least squares regression with the MCP penalty function to the prostate dataset. We assume there are no factors present.

The grpreg() function provides other penalty functions as well. Setting penalty="grLasso" or penalty="grSCAD" (keeping the default setting for the group argument, under which each predictor forms its own group) fits the corresponding penalized regression, and BIC can again be used to select the tuning parameter lambda.
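For example, a sketch of the same workflow with the group lasso penalty, using the training matrix XTr and response yTr constructed in the next section:

```r
## Sketch: group lasso instead of MCP; with the default `group`
## (each column its own group) this reduces to the ordinary lasso.
fitLasso <- grpreg(XTr, yTr, penalty="grLasso")
bestLasso <- select(fitLasso, "BIC") #lambda chosen by BIC
bestLasso$beta #coefficients at the BIC-selected lambda
```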

In our experience, predictions from MCP-penalized regression are often competitive with, and sometimes better than, those from other penalized regression methods.

Non-convex MCP Penalty

The package grpreg is loaded in the initialization.

Prostate data

The first step is to form XyList, the list containing the 6 components with the training/test data. We use the training/test split suggested by Hastie, Tibshirani and Friedman (HTF).

data(prostate, package="ElemStatLearn")
indTr <- prostate[,10] #indicator Train/Test
Xy <- prostate[,1:9]
n <- nrow(Xy)
p <- ncol(Xy)-1
XyTr <- Xy[indTr, ] #select training data
XTr <- as.matrix(XyTr[,1:p]) #grpreg requires a matrix, not a data frame
yTr <- XyTr[,p+1] #response as vector
XyTe <- Xy[!indTr, ] #select test data
XTe <- as.matrix(XyTe[,1:p]) #grpreg requires a matrix, not a data frame
yTe <- XyTe[,p+1] #as vector
XyList <- list(XyTr=XyTr, XTr=XTr, yTr=yTr, XyTe=XyTe, XTe=XTe, yTe=yTe)

MCP Model Fitting

The first code snippet fits the MCP-penalized regression and plots the regularization path.

ans <- grpreg(XTr, yTr, penalty="grMCP")
plot(ans)

The grpreg package also offers simple K-fold cross-validation via cv.grpreg(). It has seed as an argument, so if you set it you always get the same answer. That is convenient for reproducibility, but it does not really avoid the problem that the selected lambda still depends on the random assignment of observations to folds. The vertical line in each panel indicates the lambda with minimum CV error, and examining Figure 1 we see that the choice of lambda indeed varies with the seed. For this reason, I prefer using BIC or averaging the CV error over several random fold assignments.

Figure 1. 10-fold CV
someSeeds <- c(77665533, 881155, 1231312, 44455511)
layout(matrix(1:4, ncol=2))
for (i in 1:4) {
  plot(cv.grpreg(XTr, yTr, seed=someSeeds[i]))
}
layout(1)
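The CV-averaging alternative mentioned above can be sketched by averaging the cross-validation error curves over several seeds and picking the lambda that minimizes the average. This sketch assumes cv.grpreg() reports the error curve in its cve component and that supplying lambda fixes a common grid:

```r
## Average the CV error over the four seeds used in Figure 1
lam <- ans$lambda #lambda grid from the grpreg fit above
cveMat <- sapply(someSeeds, function(s)
  cv.grpreg(XTr, yTr, penalty="grMCP", lambda=lam, seed=s)$cve)
lamAvg <- lam[which.min(rowMeans(cveMat))] #lambda minimizing the averaged CV error
```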

Rather than using cross-validation, the grpreg package offers AIC/BIC selection of the tuning parameter lambda via select(). We choose the lambda with the smallest BIC. The resulting MCP regression coefficients are shown below.

best <- select(ans, "BIC")
best$beta
## (Intercept)      lcavol     lweight         age        lbph         svi 
##  -0.7500728   0.5690885   0.6561632   0.0000000   0.0417446   0.3243882 
##         lcp     gleason       pgg45 
##   0.0000000   0.0000000   0.0000000

Next the predictions are computed and a Bland-Altman plot is produced.

yh <- predict(ans, XTe, type="response", lambda=best$lambda)
x <- (yh+yTe)/2
y <- yh-yTe
plot(x, y, xlab="Mean of predictions and test values",
     ylab="Difference: prediction-test", pch=18, col="blue")
abline(h=0, col="red") #zero-difference reference line
ind <- order(x)
x <- x[ind]
y <- y[ind]
trend <- loess(y~x, span=0.9, degree=1) #heavy smoothing to detect a trend
lines(x, fitted(trend), col=rgb(1,0,1,0.7), lwd=2.5)
title(main="Figure 2. Bland-Altman/Tukey Plot\nForecasts vs Observed")

The RMSE on the test data is 0.6613.
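The quoted RMSE can be reproduced directly from the predictions:

```r
rmse <- sqrt(mean((yh - yTe)^2)) #root mean squared prediction error
round(rmse, 4) #about 0.6613
```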

You can use the function bestglm::grpregPredict() to compare the test-set RMSEs of the LASSO, SCAD and MCP fits, each with lambda selected by BIC.

bestglm::grpregPredict(XyList=XyList)
##     LASSO      SCAD       MCP 
## 0.6730472 0.6843042 0.6613436