Communication efficiency is crucial for federated learning (FL). A common way to improve it is to run multiple local training steps on each client, which reduces the communication frequency between the clients and the server. However, because data distributions are non-i.i.d. across clients, local training causes client drift, which severely degrades performance. In this work, we propose a new method that improves training performance in cross-silo FL by maintaining double momentum buffers. One momentum buffer tracks the server model's update direction, and the other tracks the local model's update direction. Moreover, we introduce a novel momentum fusion technique to coordinate the server and local momentum buffers. We also provide the first theoretical convergence analysis involving both server and local standard momentum SGD. Extensive experiments on deep FL show better training performance than FedAvg and existing standard momentum SGD variants.
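
To make the idea of two momentum buffers concrete, the following is a minimal toy sketch, not the method proposed here. It assumes each client k minimizes a hypothetical quadratic objective 0.5 * ||x - c_k||^2 with a different center c_k (standing in for non-i.i.d. data), runs a few local momentum SGD steps, and sends its accumulated update to a server that keeps its own momentum buffer. The names `local_train`, `train`, and all hyperparameters are illustrative assumptions. In particular, the `fuse` coefficient is only a placeholder for some coupling between the two buffers; it is not the momentum fusion rule of this work.

```python
import numpy as np

def local_train(x_server, center, server_mom, steps=5, lr=0.1,
                mu_local=0.9, fuse=0.0):
    """Local momentum SGD on the toy objective 0.5 * ||x - center||^2."""
    x = x_server.copy()
    local_mom = np.zeros_like(x)           # local momentum buffer
    for _ in range(steps):
        grad = x - center                  # exact gradient of the toy objective
        local_mom = mu_local * local_mom + grad
        # Assumed placeholder coupling: add a fraction of the server
        # momentum to the local step (NOT the paper's fusion rule).
        x = x - lr * (local_mom + fuse * server_mom)
    return x_server - x                    # accumulated local update

def train(centers, rounds=100, mu_server=0.5, server_lr=1.0, **local_kwargs):
    """Server loop: average client updates, then apply server momentum."""
    x = np.zeros(centers.shape[1])
    server_mom = np.zeros_like(x)          # server momentum buffer
    for _ in range(rounds):
        deltas = [local_train(x, c, server_mom, **local_kwargs)
                  for c in centers]
        avg_delta = np.mean(deltas, axis=0)
        server_mom = mu_server * server_mom + avg_delta
        x = x - server_lr * server_mom
    return x

if __name__ == "__main__":
    # Three clients with different optima, mimicking heterogeneous data.
    centers = np.array([[0.0, 0.0], [2.0, 4.0], [4.0, -1.0]])
    print(train(centers))
    print(train(centers, fuse=0.5))
```

On this toy problem all clients share the same curvature, so the server iterate should approach the average of the client centers with or without the placeholder coupling. Real deep FL objectives do not have this property, which is where client drift and the choice of fusion rule start to matter.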