We present a method for monocular kinematic pose estimation and activity recognition from video for movement imitation. Learned vocabularies of kinematic motion primitives are used to emulate the function of hypothesized neuroscientific models of spinal fields and mirror neurons in the process of imitation. For imitation, we assume a demonstrator's movement is produced through a virtual trajectory specifying desired body poses over time. Each pose is decomposed into mirror neuron firing coefficients that specify the attractive dynamics toward this configuration through a linear combination of primitives. Each primitive is a nonlinear dynamical system that predicts expected motion with respect to an underlying activity. Our aim is to invert this process by estimating a demonstrator's virtual trajectory from monocular image observations in a bottom-up fashion. At the lowest level, pose estimates are inferred in a modular fashion by pairing a particle filter with each primitive. We hypothesize that the likelihoods of these pose estimates over time emulate the firing of mirror neurons during the formation of the virtual trajectory. We present preliminary results from applying our method to video containing multiple activities performed at various speeds and viewpoints.
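The pairing of a particle filter with each primitive, with the resulting observation likelihoods read off as mirror-neuron-like activations, can be sketched in miniature as follows. This is a hedged toy illustration, not the paper's implementation: the two scalar "primitives" (`walk`, `reach`), the 1D pose state, and all noise parameters are hypothetical stand-ins for the learned kinematic motion primitives and full body configurations described above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 1D "pose" primitives: each is a dynamical system x_{t+1} = f(x_t),
# standing in for a learned kinematic motion primitive.
primitives = {
    "walk":  lambda x: x + 0.10,   # steady forward drift
    "reach": lambda x: 0.9 * x,    # decay toward a rest pose at 0
}

def particle_filter_step(particles, primitive, observation, obs_noise=0.1):
    """One predict/weight/resample step of a per-primitive particle filter.

    Returns the resampled particles and the mean observation likelihood,
    which plays the role of a mirror-neuron-like firing coefficient here.
    """
    # Predict: propagate particles through the primitive's dynamics plus noise.
    particles = primitive(particles) + rng.normal(0.0, 0.02, size=particles.shape)
    # Weight: Gaussian likelihood of the (monocular) observation per particle.
    w = np.exp(-0.5 * ((particles - observation) / obs_noise) ** 2)
    likelihood = w.mean()
    w /= w.sum()
    # Resample in proportion to the weights.
    idx = rng.choice(len(particles), size=len(particles), p=w)
    return particles[idx], likelihood

# Simulated observations generated by the "walk" dynamics.
observations = 0.1 * np.arange(1, 21)

filters = {name: np.zeros(200) for name in primitives}
activations = {name: [] for name in primitives}
for z in observations:
    for name, f in primitives.items():
        filters[name], lik = particle_filter_step(filters[name], f, z)
        activations[name].append(lik)

# The filter whose primitive matches the observed motion should "fire" more.
print(np.mean(activations["walk"]) > np.mean(activations["reach"]))
```

Each primitive runs its own independent filter, so the per-primitive likelihood sequence over time forms the bottom-up activation signal from which a virtual trajectory could be assembled.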