An active vision system capable of understanding and learning about a dynamic scene is presented. The system is active in that it makes purposive use of a monocular sensor to satisfy a set of visual tasks. The paper argues that learning is indispensable to the vision process for detecting expected and unexpected situations, especially when scene dynamics are monitored. In such cases, fast and efficient learning strategies are required to counterbalance the unsatisfactory performance of standard vision techniques. The paper presents two distinct ways of learning. First, the system can learn about the geometry and dynamics of the scene in the usual active vision sense, by purposively constructing and continuously updating a model of the scene. This results in an incremental improvement in the performance of the vision process. Second, the system can learn about new objects by constructing their models on the basis of recognised object features, and can use these models to predict unwanted situations. We suggest that the vision process benefits from the use of techniques for extracting scene characteristics and creating object models. Because of the large variety of existing object classes, pre-compiled modelling of complex-shaped objects is unrealistic. Moreover, it is difficult to predict the presence and dynamics of all objects that may appear in the scene. Even if the creation of pre-compiled models for complex objects were feasible, the required recognition mechanisms would be slow and presumably inefficient.