Good problem solving knowledge for real life domains is hard to define in a single representation. In some situations, a direct policy is a better choice while in others, value function is better. Typically, direct policy representation is better suited to strategic level plans, while value function representation is better suited to tactical level plans. We propose a hybrid hierarchical representation machine (HHRM) where direct policy representation and value function based representation can co-exist in a level-wise fashion. We provide simple learning and planning algorithms with our new representation and discuss their application to Airspace Deconfliction domain. In our experiments, we provided our system LSP with two level HHRM for the domain. LSP could successfully learn from limited number of experts’ solution traces and show superior performance compared to average of human novice learners.