Conventional reservoir computing (RC) is a shallow recurrent neural network (RNN) with fixed, high-dimensional hidden dynamics and a single trainable output layer. It requires only limited training, which is critical in applications where training data is scarce and costly to obtain. In this paper, we consider two ways to extend the shallow architecture to deep RC to improve performance without sacrificing this benefit: (1) extending the output layer to a three-layer structure that promotes joint time-frequency processing of the neuron states; and (2) sequentially stacking RCs to form a deep neural network. Using the new deep RC structure, we redesign the physical-layer receiver for multiple-input multiple-output orthogonal frequency division multiplexing (MIMO-OFDM) signals, since MIMO-OFDM is a key enabling technology in fifth-generation (5G) cellular networks. The combination of RNN dynamics and the time-frequency structure of MIMO-OFDM signals allows deep RC to handle miscellaneous interference in nonlinear MIMO-OFDM channels and to achieve improved performance compared with existing techniques. Moreover, unlike deep feedforward neural networks, which rely on a massive amount of training, the introduced deep RC framework provides decent generalization performance using the same number of pilots as conventional model-based methods in 5G systems. Numerical experiments show that the deep RC-based receiver offers faster learning convergence and effectively mitigates unknown nonlinear radio frequency (RF) distortion, yielding a 20% gain in bit error rate (BER) over the shallow RC structure.
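
To make the shallow-versus-stacked distinction concrete, the following is a minimal NumPy sketch, not the receiver proposed in this paper. It assumes a real-valued echo-state-style reservoir with a tanh nonlinearity, a single linear readout fitted by ridge regression (rather than the three-layer time-frequency output structure), and a stacking rule in which each layer is trained on the same targets and passes its output to the next layer. The function names, reservoir size, spectral radius, and ridge parameter are all hypothetical choices made for illustration.

```python
import numpy as np


def make_reservoir(n_in, n_res, spectral_radius=0.9, seed=0):
    """Draw fixed (untrained) input and recurrent weights."""
    rng = np.random.default_rng(seed)
    w_in = rng.uniform(-0.5, 0.5, size=(n_res, n_in))
    w = rng.uniform(-0.5, 0.5, size=(n_res, n_res))
    # Rescale so the largest eigenvalue magnitude equals spectral_radius.
    w *= spectral_radius / np.max(np.abs(np.linalg.eigvals(w)))
    return w_in, w


def run_reservoir(w_in, w, inputs):
    """Collect reservoir states for an input sequence of shape (T, n_in)."""
    state = np.zeros(w.shape[0])
    states = np.empty((inputs.shape[0], w.shape[0]))
    for t, u in enumerate(inputs):
        state = np.tanh(w_in @ u + w @ state)
        states[t] = state
    return states


def train_readout(states, targets, ridge=1e-6):
    """Closed-form ridge regression; the only trained part of a shallow RC."""
    gram = states.T @ states + ridge * np.eye(states.shape[1])
    return np.linalg.solve(gram, states.T @ targets)


def fit_stacked_rc(inputs, targets, n_layers=2, n_res=50):
    """Stack RCs: each layer is trained on the same targets, and its output
    becomes the next layer's input (an assumed stacking rule)."""
    layers, x = [], inputs
    for k in range(n_layers):
        w_in, w = make_reservoir(x.shape[1], n_res, seed=k)
        states = run_reservoir(w_in, w, x)
        w_out = train_readout(states, targets)
        x = states @ w_out  # this layer's estimate of the targets
        layers.append((w_in, w, w_out))
    return layers, x
```

Setting `n_layers=1` recovers the shallow RC. In both cases only the readout weights are fitted, each by a single linear solve, which is why the amount of training data needed stays small as layers are added.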