Information-gathering agents are required in many software agent applications to answer queries posed by other agents, using a variety of available information sources. We formally consider the problem of designing such agents and make two contributions. First, we examine the key issue of integrating knowledge from external sites into the agent's knowledge base, and present an expressive language for this purpose. A noteworthy feature of our language is its ability to express, using rich semantic constraints, that some external sites have complete information of a certain kind. Given a query on the knowledge base, the agent should first determine the set of external sites that contain information relevant to answering the query, and then access those sites. Our second contribution is to show that, given a query and the descriptions of the external sites in our language, it is possible to determine minimal subsets of sites that are needed to answer the query.