The main contribution of this paper is a novel technique for the automatic creation of accurate ensembles. The proposed technique, named GEMS, first trains a large number of neural networks (here either 20 or 50) and then uses genetic programming to build the ensemble by combining the available networks. Genetic programming allows GEMS not only to consider ensembles of very different sizes, but also to use ensembles as intermediate building blocks that can be further combined into larger ensembles. To evaluate its performance, GEMS is compared to several ensembles whose networks are selected based on individual test set accuracy. The experiments use four publicly available data sets, and the results are very promising for GEMS: on two data sets, GEMS obtains significantly higher accuracy than the competing ensembles, while there is no significant difference on the other two.
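To make the idea of ensembles as building blocks concrete, the following is a minimal sketch and not the GEMS implementation. It assumes that an ensemble can be written as a nested tuple whose leaves are indices into the pool of trained networks, and that each node combines its children by averaging their class probabilities. The averaging rule, the names `evaluate`, `accuracy`, and `pool`, and the toy outputs are all illustrative assumptions. The genetic programming search that would evolve such trees is omitted.

```python
from typing import List, Sequence, Union

# Assumed representation: a leaf is an index into the pool of trained
# networks; an internal node is a tuple of sub-ensembles.
Tree = Union[int, tuple]
Probs = List[List[float]]  # one row of class probabilities per sample


def evaluate(tree: Tree, pool: Sequence[Probs]) -> Probs:
    """Return per-sample class probabilities for a (possibly nested) ensemble."""
    if isinstance(tree, int):
        return pool[tree]
    # Assumed combination rule: average the outputs of the child ensembles.
    outputs = [evaluate(child, pool) for child in tree]
    n_samples = len(outputs[0])
    n_classes = len(outputs[0][0])
    return [
        [sum(out[s][c] for out in outputs) / len(outputs) for c in range(n_classes)]
        for s in range(n_samples)
    ]


def accuracy(probs: Probs, labels: Sequence[int]) -> float:
    """Fraction of samples whose most probable class matches the label."""
    predicted = [row.index(max(row)) for row in probs]
    return sum(p == y for p, y in zip(predicted, labels)) / len(labels)


# Hypothetical outputs of three networks on three samples (two classes).
pool = [
    [[0.9, 0.1], [0.6, 0.4], [0.2, 0.8]],    # network 0
    [[0.4, 0.6], [0.3, 0.7], [0.1, 0.9]],    # network 1
    [[0.8, 0.2], [0.45, 0.55], [0.6, 0.4]],  # network 2
]
labels = [0, 1, 1]

# The ensemble (0, 1) is reused as a building block of a larger ensemble.
nested = ((0, 1), 2)

single_scores = [accuracy(evaluate(i, pool), labels) for i in range(len(pool))]
nested_score = accuracy(evaluate(nested, pool), labels)
```

In this toy setting each network classifies two of the three samples correctly, while the nested ensemble classifies all three correctly. Note that nesting also acts as an implicit weighting under the averaging assumption: in `((0, 1), 2)`, network 2 contributes half of the final output and networks 0 and 1 a quarter each.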