The development of automatic natural language understanding systems remains an elusive goal. Given the highly ambiguous nature of the syntax and semantics of natural language, it is not possible to develop rule-based approaches to understanding even very limited domains of text. The difficulty of specifying a complete set of rules and their exceptions has led to the rise of probabilistic approaches, where models of natural language are learned from large corpora of text. However, this has proven challenging, since natural language data is both sparse and skewed and the space of possible models is huge. In this paper we discuss several search techniques for learning the structure of probabilistic models of word sense disambiguation. We present an experimental comparison of backward and forward sequential searches, as well as a model averaging approach, on the problem of resolving the meanings of ambiguous words in text.
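To make the search techniques concrete, the following is a minimal sketch of a greedy forward sequential search over model components (for example, features or dependencies to include in the model structure). The score function, candidate names, and stopping rule here are illustrative assumptions, not the paper's actual evaluation criterion; a backward search would run analogously, starting from the full set and greedily removing components.

```python
from typing import Callable, FrozenSet, Iterable, Set


def forward_sequential_search(
    candidates: Iterable[str],
    score: Callable[[FrozenSet[str]], float],
) -> FrozenSet[str]:
    """Greedy forward search: start from the empty model and repeatedly
    add the single candidate that most improves the score, stopping
    when no addition yields an improvement.

    `score` is an assumed placeholder for whatever model-evaluation
    criterion is in use (e.g., a fit statistic on held-out data).
    """
    selected: FrozenSet[str] = frozenset()
    remaining: Set[str] = set(candidates)
    best = score(selected)
    while remaining:
        # Evaluate every one-step extension of the current model.
        gains = {c: score(selected | {c}) for c in remaining}
        top = max(gains, key=gains.get)
        if gains[top] <= best:
            break  # no candidate improves the current model
        best = gains[top]
        selected = selected | {top}
        remaining.remove(top)
    return selected
```

Because each step commits to the single best local addition, the search explores only a tiny fraction of the huge model space, which is precisely the trade-off the experimental comparison examines.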