Criteria for evaluating the performance of a classifier are an important part of its design. They make it possible to estimate the behavior of the generated classifier on unseen data and can also be used to compare its performance against that of classifiers generated by other classification algorithms. Several performance measures currently exist for binary and flat classification problems. For hierarchical classification problems, where multiple classes are hierarchically related, the evaluation step is more complex. This paper reviews the main evaluation metrics proposed in the literature for hierarchical classification models.