A variety of encoding methods for bag of word (BoW) model have been proposed to encode the local features in image classification. However, most of them are unsupervised and just employ k-means to form the visual vocabulary, thus reducing the discriminative power of the features. In this paper, we propose a metric embedded discriminative vocabulary learning for high-level person representation with application to person re-identification. A new and effective term is introduced which aims at making the same persons closer while different ones farther in the metric space. With the learned vocabulary, we utilize a linear coding method to encode the image-level features (or holistic image features) for extracting high-level person representation. Different from traditional unsupervised approaches, our method can explore the relationship(same or not) among the persons. Since there is an analytic solution to the linear coding, it is easy to obtain the final high-level features. The experimental results on person re-identification demonstrate the effectiveness of our proposed algorithm.