Complete co-occurrence data are unavailable in many applications, such as purchase records and medical histories, because they are costly to collect or restricted for privacy protection. Even in such applications, aggregated data are often available, such as the number of purchasers of each item or the number of patients with each disease. We propose a method for estimating the co-occurrence of items from aggregated data with auxiliary information. As auxiliary information, we use item features that describe the characteristics of each item. We also use the records of a small number of users. Although many methods have been proposed for estimating co-occurrence from aggregated data, no existing method can use auxiliary information. In our proposed method, we introduce latent co-occurrence variables that represent the amount of co-occurrence for each pair of items. We model the generative process of the latent co-occurrence variables with a multinomial distribution with Dirichlet priors. The parameters of the Dirichlet priors are given by neural networks that take the auxiliary information as input, and these neural networks are shared across item pairs. The shared neural networks enable us to learn unknown relationships between auxiliary information and co-occurrence using the data of multiple items. The latent co-occurrence variables and the neural network parameters are estimated by maximizing the sum of the likelihood of the latent co-occurrence variables and the likelihood of the small number of user records. We demonstrate the effectiveness of our proposed method using user-item rating datasets.
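
The modeling idea above can be illustrated with a minimal sketch. It is not the authors' implementation, and several details are assumptions made only for illustration: each item pair is summarized by a 2x2 table with four cells (both items, only the first, only the second, neither), the aggregated counts fix the table's margins, and a small hypothetical MLP (`shared_network`) maps the two items' features to four Dirichlet parameters. The likelihood of the small user records and the training of the network are omitted.

```python
import math
import numpy as np

def softplus(z):
    # Smooth map to positive values, used to keep Dirichlet parameters > 0.
    return np.logaddexp(0.0, z)

def shared_network(x_i, x_j, params):
    """Hypothetical MLP shared by all item pairs (assumed architecture).

    Takes the feature vectors of two items and returns four positive
    Dirichlet parameters, one per cell of the assumed 2x2 table.
    """
    W1, b1, W2, b2 = params
    h = np.tanh(W1 @ np.concatenate([x_i, x_j]) + b1)
    return softplus(W2 @ h + b2) + 1e-6

def table_from_cooccurrence(c, n_i, n_j, n_users):
    # Cells: both items, only item i, only item j, neither item.
    # The aggregated counts n_i, n_j, n_users determine all cells given c.
    return np.array([c, n_i - c, n_j - c, n_users - n_i - n_j + c])

def dirichlet_multinomial_logpmf(counts, alpha):
    # Multinomial likelihood with the Dirichlet prior integrated out.
    n, a = counts.sum(), alpha.sum()
    log_coef = math.lgamma(n + 1) - sum(math.lgamma(k + 1) for k in counts)
    log_ratio = math.lgamma(a) - math.lgamma(n + a)
    log_cells = sum(math.lgamma(k + al) - math.lgamma(al)
                    for k, al in zip(counts, alpha))
    return log_coef + log_ratio + log_cells

def estimate_cooccurrence(n_i, n_j, n_users, alpha):
    # Search every co-occurrence count consistent with the aggregated data
    # and keep the one with the highest log-probability.
    lo = max(0, n_i + n_j - n_users)
    hi = min(n_i, n_j)
    scores = {
        c: dirichlet_multinomial_logpmf(
            table_from_cooccurrence(c, n_i, n_j, n_users), alpha)
        for c in range(lo, hi + 1)
    }
    return max(scores, key=scores.get)

# Usage with untrained, randomly initialized parameters (illustrative only).
rng = np.random.default_rng(0)
n_features, n_hidden = 3, 8
params = (rng.normal(size=(n_hidden, 2 * n_features)), np.zeros(n_hidden),
          rng.normal(size=(4, n_hidden)), np.zeros(4))
x_i, x_j = rng.normal(size=n_features), rng.normal(size=n_features)
alpha = shared_network(x_i, x_j, params)
print(estimate_cooccurrence(n_i=30, n_j=20, n_users=100, alpha=alpha))
```

Because the same `params` are reused for every pair, whatever the network learns about the relation between features and co-occurrence for one pair carries over to the others, which is the role the shared networks play in the description above.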