Modeling complex hierarchical and grouped feature interactions in multivariate time series data is indispensable for understanding the data dynamics and predicting future conditions. Implicit feature interactions and high dimensionality make multivariate forecasting very challenging. Many existing works place little emphasis on exploring explicit correlations among multiple time series, and instead design complicated models that capture long- and short-range patterns with the aid of attention mechanisms. In this work, we argue that pre-defined graphs and general learning methods struggle with such correlations because of their irregular structure. Hence, we present CATN, an end-to-end Cross Attentive Tree-aware Network that jointly captures inter-series correlations and intra-series temporal patterns. We first construct a tree structure to learn hierarchical and grouped correlations, and design an embedding approach that passes dynamic messages to generalize implicit but interpretable cross features among multiple time series. Next, in the temporal aspect, we propose a multi-level dependency learning mechanism, comprising global and local learning and a cross attention mechanism, which combines long-range dependencies, short-range dependencies, and cross dependencies at different time steps. Extensive experiments on several real-world datasets show the effectiveness and robustness of the proposed method compared with existing state-of-the-art methods.