(This module introduces The Good Book: Thirty Years of Comments, Conjectures and Conclusions, by I. J. Good, published by Rice University Press.)
It is a pleasure to review Jack Good's numerous contributions to the theory and practice of modern statistics. Here, we wish to remember his innovations in the field of nonparametric density estimation. Together with his student R. A. Gaskins, Jack invented penalized likelihood density estimation (Good and Gaskins, 1971). Given the computing resources available at that time, the implementation was truly revolutionary. A Fourier series approximation was introduced, not with just a few terms, but ofttimes thousands of terms. To address the issue of nonnegativity, the authors solved for the square root of the density. The penalty functions described were ${L}_{2}$ norms of the first and second derivatives of the density's square root.
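In schematic form, the Good–Gaskins approach maximizes the log-likelihood of $\gamma = \sqrt{f}$ minus a roughness penalty built from the ${L}_{2}$ norms of $\gamma'$ and $\gamma''$, subject to $\int \gamma^2 = 1$, so that $f = \gamma^2$ is automatically nonnegative. The following is a minimal numerical sketch of that idea, not Good and Gaskins' actual implementation: it assumes data rescaled to the unit interval, an orthonormal cosine basis, only the first-derivative penalty, and simple projected gradient ascent. All function names and tuning values below are illustrative.

```python
import numpy as np

def good_gaskins_fourier(x, n_terms=30, alpha=1e-4, n_iter=500, lr=0.05):
    """Sketch of penalized-likelihood density estimation via the square-root trick.

    The square root of the density is expanded in the orthonormal cosine
    basis on [0, 1]:  gamma(t) = c_0 + sum_k c_k * sqrt(2) * cos(k*pi*t).
    Then f = gamma^2 is nonnegative by construction, its integral equals
    ||c||^2 (kept at 1 by renormalizing), and the first-derivative
    penalty is diagonal in the coefficients:
        integral gamma'(t)^2 dt = sum_k c_k^2 * (k*pi)^2.
    """
    k = np.arange(n_terms)
    scale = np.where(k == 0, 1.0, np.sqrt(2.0))
    basis = scale * np.cos(np.outer(x, k) * np.pi)   # phi_k at each data point
    pen = (np.pi * k) ** 2                           # penalty weight per coefficient
    c = np.zeros(n_terms)
    c[0] = 1.0                                       # start from the uniform density
    for _ in range(n_iter):
        g = basis @ c                                # gamma at each data point
        # Gradient of  sum_i log gamma(x_i)^2  -  alpha * sum_k c_k^2 (k pi)^2
        grad = 2.0 * (basis.T @ (1.0 / g)) - 2.0 * alpha * pen * c
        c = c + lr * grad / len(x)
        c = c / np.linalg.norm(c)                    # keep integral of f equal to 1
    def density(t):
        b = scale * np.cos(np.outer(np.atleast_1d(t), k) * np.pi)
        return (b @ c) ** 2
    return density
```

The orthonormal basis is what keeps this sketch simple: both the unit-mass constraint and the derivative penalty become statements about the coefficient vector alone, which is the same structural simplification that made Good and Gaskins' thousand-term Fourier expansions computationally feasible.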
The first author had the pleasure of attending a lecture by Jack at one of the early Southern Research Conference on Statistics meetings and returned to Rice University with a number of questions. For example, is the square root “trick” valid? Could a closed-form solution be found? Considering such questions led to collaborations with numerical analyst Richard Tapia and theses by Gilbert de Montricher and the second author. Gilbert was able to show that the first derivative penalty could be solved in closed form. (Klonias [1982] later provided a wider set of solutions.) But Gilbert also showed that the square-root trick does not work in general in infinite-dimensional Hilbert spaces, such as those considered here. Scott (1976) examined a finite-dimensional approximation for which the square-root trick does apply. These research findings were collected in Tapia and Thompson (1978), one of the first surveys of nonparametric density estimation. In this and other venues, Jack's pioneering work led to a large body of research based on splines and other bases.
Jack's inspiration came at a very fortuitous time for statisticians at Rice. NASA funding had switched from an emphasis on space exploration to that of agricultural intelligence gathering via remote sensing. (Thompson well remembers Jack in his IDA days walking around Princeton in a trenchcoat, affecting the pose of George Smiley. So Jack might appreciate what follows below.) The idea was to identify and exploit shortages in Soviet grain production.
The NASA prototype solution in 1970 used a huge and clunky multi-spectral scanner that recorded ground reflectivity in twelve channels. This involved flyovers in Kansas from large aircraft. Misclassification rates were running around 25% under the assumption that the data were multivariate Gaussian. The solution (before we got into the problem) was to expand the hardware to an even larger twenty-four-channel device. NASA had not run into the heavy-tailed pathologies dealt with by the Princeton Robustness Project, but rather into the mixture-of-distributions problem, which the Princetonians did not address. Of course, for the mixture problem under the Gaussian assumption, things get worse as the number of channels increases. Thompson was somewhat amazed to find during a drive-around in the summer of 1971 that the LARS group at Purdue and the Willow Run group at Michigan were also treating the data as though they were Gaussian.