In this paper, we present LODsyndesis, a suite of services over the datasets of the entire Linked Open Data Cloud that offers fast, content-based dataset discovery and object co-reference. Emphasis is placed on supporting scalable cross-dataset reasoning for finding all information about any entity, together with its provenance. Other tasks that can benefit from these services are those related to the quality and veracity of data: collecting all information about an entity, combined with the cross-dataset inference that becomes feasible, allows spotting existing contradictions and provides information for data cleaning or for estimating and suggesting which data are probably correct or more accurate. In addition, we show how these services can assist the enrichment of existing datasets with more features for obtaining better predictions in machine learning tasks. Finally, we report measurements that reveal the sparsity of the current datasets with regard to their connectivity, which in turn justifies the need for advancing the current methods for data integration. Measurements focusing on the cultural domain are also included, specifically measurements over datasets using CIDOC CRM (Conceptual Reference Model) and connectivity measurements of British Museum data. The services of LODsyndesis are based on special indexes and algorithms and allow the indexing of 2 billion triples in around 80 minutes using a cluster of 96 computers.