Modern multi-agent robotic systems increasingly require scalable, robust and persistent Multi-Agent Path Finding (MAPF) with performance guarantees. While many MAPF solvers that provide some of these properties exist, none provides them all. To fill this need, we propose a new MAPF framework, the shard system. A shard system partitions the workspace into geographic regions, called shards, linked by a novel system of buffers. Agents are routed optimally within a shard by a local controller to local goals set by a global controller. The buffer system novelly allows shards to plan with perfect parallelism, providing scalability. A novel global controller algorithm can rapidly generate an inter-shard routing plan for thousands of agents while minimizing the traffic routed through any shard. A novel workspace partitioning algorithm produces shards small enough to replan rapidly. These innovations allow a shard system to adjust its routing plan in real time if an agent is delayed or assigned a new goal, enabling robust, persistent MAPF. A shard system's local optimality and optimized inter-shard routing bring the sum-of-costs of its solutions to single-shot MAPF problems to < 20-60% of optimal on a diversity of workspaces. Its scalability allows it to plan paths for 1000s of agents in seconds. If any of their goals change or move actions fails, a shard system can replan in under a second.