Causal knowledge is vital for effective reasoning in science, as causal relations, unlike correlations, allow one to reason about the outcomes of interventions. Algorithms that can discover causal relations from observational data are based on the assumption that all variables have been jointly measured in a single dataset. In many cases this assumption fails. Previous approaches to overcoming this shortcoming devised algorithms that returned all joint causal structures consistent with the conditional independence information contained in each individual dataset. But, as conditional independence tests only determine causal structure up to Markov equivalence, the number of consistent joint structures returned by these approaches can be quite large. The last decade has seen the development of elegant algorithms for discovering causal relations beyond conditional independence, which can distinguish among Markov equivalent structures. In this work we adapt and extend these so-called bivariate causal discovery algorithms to the problem of learning consistent causal structures from multiple datasets with overlapping variables belonging to the same generating process, providing a sound and complete algorithm that outperforms previous approaches on synthetic and real data.