Padhraic Smyth, Michael Ghil, Kayo Ide, Joe Roden, Andrew Fraser
Low-frequency variability in geopotential height records of the Northern Hemisphere is a topic of significance in atmospheric science, with profound implications for climate modeling and prediction. The atmospheric science literature has debated whether "regimes," or clusters, exist in geopotential height data and, if so, how many such clusters there are. This paper tells the detective story of how cross-validated mixture model clustering, a methodology originally described at the 1996 KDD conference (Smyth, 1996), has recently provided clear and objective evidence that three clusters exist in Northern Hemisphere geopotential height data, and that each of the detected clusters has a direct physical interpretation. Cross-validated mixture modeling has thus answered an important open scientific question.