While the task of sensing and perceiving the visual environment as we go about our daily lives is trivial for most humans, attempts to emulate the principles underlying human vision in machine vision systems have been only marginally successful. Attention, mediated by eye movements, acts as the critical gateway to visual cognition: it searches for areas with relevant information and selects the stimuli that will be processed. These stimuli are subsequently recognized as parts of the visual world and stitched into spatial and temporal representations, eventually yielding an adaptive, composite impression of our surroundings in near real time. In this talk, we will outline recent efforts by our group to translate knowledge of the human visual system into computational models, towards realizing intelligent seeing machines. We will describe specific computational models for visual attention, object recognition, and spatio-temporal recognition, and present some preliminary results obtained with these models on public-domain computer vision datasets.
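To give a flavor of how attention can be modeled computationally, the sketch below implements a generic center-surround saliency map, a common building block in attention models. This is an illustrative assumption, not the specific model described in the talk: the function names (`box_blur`, `saliency_map`) and the choice of kernel sizes are hypothetical, and real systems typically operate over multiple feature channels and scales.

```python
import numpy as np

def box_blur(img, k):
    """Separable box blur with edge padding; k is the (odd) kernel width."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    kernel = np.ones(k) / k
    # blur along rows, then along columns
    blurred = np.apply_along_axis(
        lambda r: np.convolve(r, kernel, mode="valid"), 1, padded)
    blurred = np.apply_along_axis(
        lambda c: np.convolve(c, kernel, mode="valid"), 0, blurred)
    return blurred

def saliency_map(img, center_k=3, surround_k=9):
    """Center-surround difference: fine-scale blur minus coarse-scale blur,
    normalized to [0, 1]. Regions that differ from their surround stand out."""
    center = box_blur(img, center_k)
    surround = box_blur(img, surround_k)
    sal = np.abs(center - surround)
    rng = sal.max() - sal.min()
    return (sal - sal.min()) / rng if rng > 0 else sal

# A bright spot on a dark background is maximally salient at its location.
img = np.zeros((32, 32))
img[16, 16] = 1.0
sal = saliency_map(img)
```

An attention mechanism would then fixate the peak of `sal`, process that region, suppress it, and move on (a winner-take-all loop with inhibition of return).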