Mining Sequences in Distributed Sensors Data for Energy Production

John Gant, Mehmed Kantardzic

Predicting power generation at a given point in time is essential to power scheduling, energy trading, and availability modeling. This research applies sequence mining to power generation data with the aim of modeling power generation. The data streams analyzed are hourly average power generation values reported to the EPA. A global statistical model is shown to be impractical for these data streams, so local modeling via sequence mining is performed instead. The methodology presented, Uniform Sequence Discovery, combines uniform population coding, stream mining, and cross-stream mining. A total of 1671 streams from 2002 through 2004 were coded, mined for sequences, and cross-mined for matching sequences. From the learning and testing data, 486 and 270 frequent sequences were extracted, respectively. Association rules, together with their confidence and support values, are used to create local models for power generation prediction. In the testing phase, 159 local models were confirmed with a minimum confidence of 0.60. Power traders concerned with predicting available generation could then use the local models to predict natural gas-fired power generation.
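
The abstract does not spell out the details of Uniform Sequence Discovery, so the following is only a minimal sketch of the general idea of coding a stream into symbols and scoring a sequence rule by support and confidence. It assumes that "uniform population coding" can be approximated by equal-frequency binning, and that rules are scored over fixed-length sliding windows; the function names, the bin count, and the sample values are all hypothetical and are not taken from the paper.

```python
def equal_population_code(values, n_bins):
    """Map each value to a symbol 0..n_bins-1 so bins hold roughly equal counts.

    Assumption: equal-frequency binning stands in for the paper's coding step.
    Ties are broken by position, so equal values can land in adjacent bins.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    codes = [0] * len(values)
    for rank, i in enumerate(order):
        codes[i] = rank * n_bins // len(values)
    return codes


def rule_stats(coded, antecedent, consequent):
    """Return (support, confidence) of antecedent -> consequent.

    Both counts are taken over sliding windows whose length is
    len(antecedent) + len(consequent); this windowing is an assumption.
    """
    a, c = tuple(antecedent), tuple(consequent)
    n = len(a) + len(c)
    windows = [tuple(coded[i:i + n]) for i in range(len(coded) - n + 1)]
    if not windows:
        return 0.0, 0.0
    a_count = sum(1 for w in windows if w[:len(a)] == a)
    both = sum(1 for w in windows if w == a + c)
    support = both / len(windows)
    confidence = both / a_count if a_count else 0.0
    return support, confidence


# Made-up hourly averages (MW) for illustration only.
hourly_mw = [0, 0, 50, 120, 120, 50, 0, 0, 50, 120, 120, 50]
coded = equal_population_code(hourly_mw, 3)
support, confidence = rule_stats(coded, [0, 1], [2])
MIN_CONFIDENCE = 0.60  # threshold quoted in the abstract
keep_rule = confidence >= MIN_CONFIDENCE
```

In this sketch a rule is kept as a candidate local model only if its confidence meets the 0.60 threshold mentioned above; how the paper matches sequences across streams is not shown here.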

Subjects: 12. Machine Learning and Discovery; 1. Applications

Submitted: Feb 10, 2007


This page is copyrighted by AAAI. All rights reserved.