Partial differential equations (PDEs) play a prominent role in many disciplines as descriptions of the systems of interest. Traditionally, PDEs are derived from first principles. In the era of big data, the need to uncover PDEs from massive datasets is emerging and becoming essential. One of the latest advances in PDE discovery models is PDE-Net, which has shown promising predictive power with its moment-constrained convolutional filters but may suffer from noisy data and from the numerical instability intrinsic to numerical differentiation. We propose a novel and robust regularization method tailored to moment-constrained convolutional filters, namely Differential Spectral Normalization (DSN), to allow accurate estimation of coefficient functions and stable prediction of dynamics over a long time horizon. We compare DSN against batch normalization, dropout, spectral normalization, weight decay, weight normalization, Jacobian regularization, and orthonormal regularization, and provide empirical evidence that DSN is the most effective of these methods because it learns the convolutional filters in a robust manner. Numerical experiments further reveal that DSN has substantial potential to uncover hidden PDEs in a scarce-data setting and to predict dynamical behavior over a long time horizon, even in a noisy environment where all data samples are contaminated with noise.