Elizabeth T. Whitaker and Robert L. Simpson, Jr.
Open source intelligence analysts routinely use the web as a source of information related to their specific taskings. Despite the progress of conventional search engines, effective information gathering on the web remains a complex activity: finding information relevant to a major intelligence task or subtask requires planning, text processing, and interpretation of extracted data (Knoblock, 1995; Lesser, 1998; Nodine, Fowler et al., 2000). This paper describes the design, architecture, and some initial results of next-generation information gathering techniques intended to support the development of tools for intelligence analysts. We are integrating several areas of AI research, especially case-based reasoning, within the Novel Intelligence from Massive Data (NIMD) research program sponsored by the Advanced Research and Development Activity (ARDA). The goal of our research is to develop techniques that exploit the vast amounts of information available on the web so that it can become a valuable additional resource for the intelligence community. Our solution is a set of domain-specific information gathering techniques that produce multi-step plans for gathering information in support of the intelligence analytic process. When executed, these plans extract relevant information from both unstructured and structured documents and use the extracted information to refine subsequent search and processing activities.
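To make the idea of a plan that refines itself from extracted information more concrete, the following is a minimal illustrative sketch, not the system described in this paper. Everything in it is an assumption introduced for illustration: the in-memory `CORPUS` stands in for the web, `extract_entities` is a toy substring matcher standing in for real text processing, and the names `Step`, `search`, and `gather` are hypothetical. It shows only the control loop: each search step yields documents, entities extracted from those documents become new search steps, and the plan grows until nothing new is found.

```python
from dataclasses import dataclass

# Hypothetical stand-in for the web: query term -> documents returned.
CORPUS = {
    "acme": ["Acme Corp is led by Jane Roe.", "Acme Corp opened a plant in Lyon."],
    "jane roe": ["Jane Roe previously worked at Globex."],
    "lyon": ["Lyon is a city in France."],
    "globex": [],
}


@dataclass
class Step:
    action: str  # only "search" is used in this sketch
    arg: str     # the query term for the step


def search(query):
    """Return the documents the toy corpus holds for a query."""
    return CORPUS.get(query.lower(), [])


def extract_entities(text, known_entities):
    """Toy extractor: return the known entity names that occur in the text."""
    lowered = text.lower()
    return [e for e in known_entities if e in lowered]


def gather(seed, known_entities, max_steps=10):
    """Execute a plan that is extended by the information it extracts."""
    plan = [Step("search", seed)]
    seen = {seed.lower()}
    findings = []
    i = 0
    while i < len(plan) and i < max_steps:
        step = plan[i]
        i += 1
        for doc in search(step.arg):
            findings.append(doc)
            # Refinement: every newly extracted entity becomes a new search step.
            for entity in extract_entities(doc, known_entities):
                if entity not in seen:
                    seen.add(entity)
                    plan.append(Step("search", entity))
    return plan, findings


plan, findings = gather("Acme", ["jane roe", "lyon", "globex"])
```

Starting from the single seed query, the plan in this toy run grows to four search steps, because the documents retrieved for the seed mention entities that trigger follow-up searches. A case-based approach, as named in the abstract, would presumably choose and adapt such steps from prior cases rather than from a fixed entity list; that part is not modeled here.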