Improved Spread-Location Visualization
McLeod (1999) suggested a further improvement to the standard diagnostic check of plotting residuals against fitted values. As pointed out by Cleveland (1979), a better method than plotting the residuals vs. the fits is to plot the absolute residuals vs. the fits along with a robust smooth. An apparent slope in the smooth then indicates monotone spread, that is, spread that changes steadily with the fitted value. Cleveland (1993) suggested a further improved version of this plot, in which the square root of the absolute residual is plotted vs. the fit. The square root transformation largely removes the skewness of the absolute residuals when the residuals are normally distributed.

McLeod (1999) showed that in some cases more general power transformations of the absolute residuals should be considered in order to remove skewness and improve the visualization. The idea is to choose the power transformation that removes skewness by examining two panels. The first panel shows, for a given power lambda, the plot of the transformed absolute residuals vs. the fits along with a robust locally linear loess smooth whose tricube smoothing window spans all the data (i.e., alpha = 1). The second panel shows a boxplot of the deviations from the smooth in the first panel. If the boxplot is visibly asymmetric, the second panel can be used to choose a better transformation.
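The two-panel procedure can be sketched in a few lines of Python. This is only an illustrative sketch, not the S-PLUS or Mathematica code distributed here: the data are simulated (uniform fits, standard normal residuals), the function names are hypothetical, and the smoother is a simplified robust locally linear fit with tricube weights and alpha = 1, using bisquare robustness weights as an assumed stand-in for the loess options. Plotting is omitted; the deviations computed for each lambda are what the boxplot in the second panel would display.

```python
import random
import statistics

def tricube(u):
    u = abs(u)
    return (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0

def bisquare(u):
    u = abs(u)
    return (1.0 - u ** 2) ** 2 if u < 1.0 else 0.0

def local_line(x, y, w, x0):
    """Weighted least-squares line through (x, y), evaluated at x0."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx if sxx > 0 else 0.0
    return my + slope * (x0 - mx)

def robust_loess(x, y, robust_iters=3):
    """Simplified robust locally linear smooth with alpha = 1 (assumed settings)."""
    rw = [1.0] * len(x)          # robustness weights, all equal at the start
    smooth = [0.0] * len(x)
    for _ in range(robust_iters + 1):
        for i, x0 in enumerate(x):
            # alpha = 1: the window reaches the farthest data point
            h = max(abs(xj - x0) for xj in x)
            w = [tricube((xj - x0) / h) * r for xj, r in zip(x, rw)]
            smooth[i] = local_line(x, y, w, x0)
        resid = [yi - si for yi, si in zip(y, smooth)]
        scale = statistics.median(abs(r) for r in resid)
        if scale == 0:
            break
        rw = [bisquare(r / (6.0 * scale)) for r in resid]
    return smooth

def sample_skewness(v):
    m = sum(v) / len(v)
    m2 = sum((vi - m) ** 2 for vi in v) / len(v)
    m3 = sum((vi - m) ** 3 for vi in v) / len(v)
    return m3 / m2 ** 1.5

# Simulated fits and residuals (illustration only, not the ethanol data).
random.seed(1)
fits = [random.uniform(0.0, 10.0) for _ in range(200)]
resids = [random.gauss(0.0, 1.0) for _ in range(200)]

dev_skew = {}
for lam in (1.0, 0.5):
    transformed = [abs(e) ** lam for e in resids]        # panel 1: y-values
    smooth = robust_loess(fits, transformed)             # panel 1: smooth
    deviations = [t - s for t, s in zip(transformed, smooth)]  # panel 2: boxplot data
    dev_skew[lam] = sample_skewness(deviations)
    print(f"lambda = {lam}: skewness of deviations = {dev_skew[lam]:.3f}")
```

For normal residuals one would expect the deviations at lambda = 0.5 to be noticeably less skewed than at lambda = 1; for other error distributions a different power may do better, which is the point of comparing several values of lambda.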
Splus 5 trellis function: source file and help file.
A Mathematica notebook, improved-slplot.nb, is also provided. It contains derivations of the skewness coefficient for p-th powers under various theoretical error distributions and illustrates the use of slplot.m, the Mathematica package for improved spread-location plots.
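For the normal case, the skewness coefficient of the p-th power of the absolute error can be evaluated directly from the absolute moments of a standard normal variable Z, E|Z|^r = 2^(r/2) Gamma((r+1)/2) / sqrt(pi). The following Python sketch is not taken from the notebook; it is a minimal illustration with hypothetical function names, and it covers only the normal distribution.

```python
import math

def abs_normal_moment(r):
    """E|Z|^r for standard normal Z: 2^(r/2) * Gamma((r+1)/2) / sqrt(pi)."""
    return 2.0 ** (r / 2.0) * math.gamma((r + 1.0) / 2.0) / math.sqrt(math.pi)

def skewness_of_power(p):
    """Skewness coefficient of |Z|^p, built from its first three raw moments."""
    m1 = abs_normal_moment(p)
    m2 = abs_normal_moment(2.0 * p)
    m3 = abs_normal_moment(3.0 * p)
    variance = m2 - m1 ** 2
    third_central = m3 - 3.0 * m1 * m2 + 2.0 * m1 ** 3
    return third_central / variance ** 1.5

for p in (1.0, 0.5, 0.25):
    print(f"p = {p}: skewness of |Z|^p = {skewness_of_power(p):.4f}")
```

Evaluating the function over a grid of p values shows how the skewness changes with the power, which is the kind of calculation that motivates looking beyond the square root for non-normal errors.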
The well-known ethanol data set, originally given by Brinkman (1981) and used in McLeod (1999), is available in ethanol.dat, with a description of the variables in ethanol.txt.
Brinkman, N.D. (1981), Ethanol Fuel -- A single-cylinder engine study of efficiency and exhaust emissions, Society of Automotive Engineers, 80, 1410-1424.
Cleveland, W.S. (1979), Robust locally weighted regression and smoothing scatterplots. Journal of the American Statistical Association 74, 829-836.
Cleveland, W.S. (1993), Visualizing Data. Summit, New Jersey: Hobart Press.
McLeod, A.I. (1999), Improved spread-location visualization. Journal of Computational and Graphical Statistics 8, 135-141.