Different evaluation measures assess different characteristics of machine learning algorithms. The empirical evaluation of algorithms and classifiers is a matter of ongoing debate among researchers. Most measures in use today focus on a classifier's ability to identify classes correctly. We suggest that, in certain cases, other properties, such as failure avoidance or class discrimination, may also be useful, and we propose applying measures that evaluate such properties. These measures (Youden's index, likelihood, and discriminant power) are used in medical diagnosis. We show that these measures are interrelated, and we apply them to a case study from the field of electronic negotiations. We also list other learning problems that may benefit from the application of the proposed measures.
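
As an illustration of how the three measures relate, the following minimal sketch computes them from the four cells of a binary confusion matrix. It rests on assumptions not stated in the text above: that "likelihood" refers to the positive and negative likelihood ratios, that discriminant power is defined with the natural logarithm, and that all four counts are non-zero. The function name `diagnostic_measures` and the example counts are hypothetical.

```python
import math


def diagnostic_measures(tp, fn, fp, tn):
    """Compute Youden's index, likelihood ratios and discriminant power.

    Assumes a binary confusion matrix whose four counts are all non-zero;
    otherwise the ratios below divide by zero or take log(0).
    """
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate

    # Youden's index: how well the classifier avoids failure on both classes.
    youden = sensitivity + specificity - 1

    # Likelihood ratios (assumed reading of "likelihood" in the text).
    lr_pos = sensitivity / (1 - specificity)
    lr_neg = (1 - sensitivity) / specificity

    # Discriminant power, written in terms of the two likelihood ratios.
    # Natural logarithm is an assumption of this sketch.
    dp = (math.sqrt(3) / math.pi) * math.log(lr_pos / lr_neg)

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "youden": youden,
        "lr_pos": lr_pos,
        "lr_neg": lr_neg,
        "discriminant_power": dp,
    }


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    for name, value in diagnostic_measures(tp=80, fn=20, fp=10, tn=90).items():
        print(f"{name}: {value:.4f}")
```

All three measures are functions of sensitivity and specificity alone, which is one way to see that they are interrelated: under these definitions, discriminant power is a scaled logarithm of the ratio of the positive to the negative likelihood ratio.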