A conversation is a sequence of utterances, where each utterance plays a specific discourse role while expressing a particular emotion. This paper proposes a novel method to exploit latent discourse role information of an utterance to determine the emotion it conveys in a conversation. Specifically, we use a variant of the Variational-Autoencoder (VAE) to model the context-aware latent discourse roles of each utterance in an unsupervised way. The latent discourse role representation further equips the utterance representation with a salient clue for more accurate emotion recognition. Our experiments show that our proposed method beats the best-reported performances on three public Emotion Recognition in Conversation datasets. This proves that the discourse role information of an utterance plays an important role in the emotion recognition task, which no previous work has studied.