Few existing face recognition (FR) models take local representations into account. Although some works do so by extracting features from cropped regions around facial landmarks, landmark detection may be inaccurate or even fail in extreme cases. More recently, attention-based networks have learned to focus on useful parts automatically, without relying on landmarks. However, two issues remain: 1) these approaches tend to focus on only a few facial parts and miss other potentially discriminative regions, which can cause performance drops when the emphasized parts are invisible under heavy occlusion (e.g., face masks) or large pose variations; 2) different facial parts may vary in quality because of occlusion, blur, or illumination changes. In this paper, we propose contrastive quality-aware attentions, called CQA-Face, to address these two issues. First, a Contrastive Attention Learning (CAL) module is proposed, which pushes the model to explore a comprehensive set of facial parts. Consequently, other useful parts can still support identification when some facial parts are invisible. Second, a Quality-Aware Network (QAN) is developed to emphasize important regions and suppress noisy parts in a global scope. Our CQA-Face model integrates CAL with QAN and extracts diverse, quality-aware local representations. It outperforms state-of-the-art methods on several benchmarks, demonstrating its effectiveness.
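The two ideas above, diverse attention and quality-aware weighting, can be illustrated with a minimal sketch. This is not the CQA-Face implementation: the abstract does not specify the CAL objective or the QAN architecture, so the sketch assumes a mean pairwise cosine-similarity penalty as a stand-in for contrastive attention learning and a softmax over per-part quality scores as a stand-in for quality-aware weighting. The function names `diversity_penalty` and `quality_weighted_pool` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)


def diversity_penalty(attn_maps):
    """Mean pairwise cosine similarity between flattened attention maps.

    Assumed stand-in for a contrastive attention objective: a lower value
    means the maps attend to more distinct facial regions.
    """
    k = len(attn_maps)
    pairs = [(i, j) for i in range(k) for j in range(i + 1, k)]
    if not pairs:  # a single map has nothing to be contrasted with
        return 0.0
    total = sum(cosine(attn_maps[i], attn_maps[j]) for i, j in pairs)
    return total / len(pairs)


def quality_weighted_pool(local_feats, quality_scores):
    """Weight local part features by a softmax over their quality scores.

    Assumed stand-in for quality-aware weighting: the softmax is taken
    across all parts, so each part is weighted relative to the others.
    Returns the pooled feature vector and the per-part weights.
    """
    top = max(quality_scores)
    exps = [math.exp(s - top) for s in quality_scores]  # stable softmax
    z = sum(exps)
    weights = [e / z for e in exps]
    dim = len(local_feats[0])
    pooled = [
        sum(w * feat[d] for w, feat in zip(weights, local_feats))
        for d in range(dim)
    ]
    return pooled, weights
```

Under these assumptions, two identical attention maps give a penalty close to 1 and two disjoint maps give 0, so minimizing the penalty encourages each attention map to cover a different facial part. In the pooling step, a part with a low quality score (for example, a region hidden by a mask) receives a weight near 0 and contributes little to the final representation.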