We consider an application of scheduling to hardware-accelerated functional verification, a massively parallel computational paradigm used in the simulation of complex integrated circuits. Our domain requires the compilation of logical primitives into a set of instruction memories that optimize the concurrency and communication between tightly synchronized processing units. The scheduling process is burdened by a complex model in which all logical dependencies must be resolved by a dynamic network of routes that compete for sparsely distributed resources. We describe a series of optimization steps that cooperate to minimize simulation depth while scaling to problem sizes on the order of a billion gates. Our approach targets an industrial acceleration architecture containing 262,144 parallel processors.