The pancake puzzle is a standard benchmark domain used to test search algorithms, and the gap heuristic is the state-of-the-art heuristic function most often used in such tests. In this work, we analyze the accuracy of this heuristic and identify ways to enhance it. We begin by showing that in the worst-case, the amount that the gap heuristic underestimates the optimal cost of a pancake puzzle state can be linear in the number of pancakes in the stack. However, empirical analysis suggests that it is extremely rare that the gap heuristic underestimates the optimal cost by more than two. We then identify several simple methods that can be used to generate large sets of problems on which the gap heuristic underestimates the optimal cost by a larger amount than it typically does on random permutations. In doing so, we provide new pancake puzzle test sets that can be used to evaluate how search algorithms behave when the heuristic is inaccurate. We also formally characterize states according to the size of the heuristic plateaus around them. This characterization allows us to efficiently compute a two-step look ahead of the gap heuristic on any state, which we can use alongside a state's dual to further improve heuristic accuracy. These enhancements substantially improve the performance of an IDA*-based pancake problem solver on both the existing benchmarks and the new ones proposed in this paper.