Graphical processing units (GPUs) have become ubiquitous because they offer the ability to perform cost and energy efficient massively parallel computation. We investigate forward search classical planning on GPUs based on Monte Carlo Random Walk (MRW). We first show experimentally that straightforward parallelizations of MRW perform poorly. Next, we propose Batch MRW (BMRW), a generalization of MRW which performs random walks starting with many seed states, in contrast to traditional MRW which used a single seed state. We evaluate a sequential implementation of BMRW on a single CPU core, and show that a sequential, satisficing planner based on BMRW performs comparably with previous state-of-the-art MRW-based planners. Then, we propose BMRWG, which uses a GPU to perform random walks. We show that BMRWG achieves significant speedup compared to BMRW and achieves competitive performance on a number of IPC benchmark domains.