Data features usually can be organized in a hierarchical structure to reflect the relations among them. Most of previous studies that utilize the hierarchical structure to help improve the performance of supervised learning tasks can only handle the structure of a limited height such as 2. In this paper, we propose a Deep Hierarchical Structure (DHS) method to handle the hierarchical structure of an arbitrary height with a convex objective function. The DHS method relies on the exponents of the edge weights in the hierarchical structure but the exponents need to be given by users or set to be identical by default, which may be suboptimal. Based on the DHS method, we propose a variant to learn the exponents from data. Moreover, we consider a case where even the hierarchical structure is not available. Based on the DHS method, we propose a Learning Deep Hierarchical Structure (LDHS) method which can learn the hierarchical structure via a generalized fused-Lasso regularizer and a proposed sequential constraint. All the optimization problems are solved by proximal methods where each subproblem has an efficient solution. Experiments on synthetic and real-world datasets show the effectiveness of the proposed methods.