While deep learning demonstrates its strong ability to handle independent and identically distributed (IID) data, it often suffers from out-of-distribution (OoD) generalization, where the test data come from another distribution (w.r.t. the training one). Designing a general OoD generalization framework for a wide range of applications is challenging, mainly due to different kinds of distribution shifts in the real world, such as the shift across domains or the extrapolation of correlation. Most of the previous approaches can only solve one specific distribution shift, leading to unsatisfactory performance when applied to various OoD benchmarks. In this work, we propose DecAug, a novel decomposed feature representation and semantic augmentation approach for OoD generalization. Specifically, DecAug disentangles the category-related and context-related features by orthogonalizing the two gradients (w.r.t. intermediate features) of losses for predicting category and context labels, where category-related features contain causal information of the target object, while context-related features cause distribution shifts between training and test data. Furthermore, we perform gradient-based augmentation on context-related features to improve the robustness of learned representations. Experimental results show that DecAug outperforms other state-of-the-art methods on various OoD datasets, which is among the very few methods that can deal with different types of OoD generalization challenges.