ABSTRACT

Since its emergence, cloud computing has been widely adopted because of the scalability, fault tolerance, and elasticity it offers. Cloud-based platforms free application developers from the burden of administering hardware, provide resilience to failures, and scale resources up and down elastically according to demand. The recent development of such environments has had a significant impact on the data management research community, for which the cloud provides a distributed, shared-nothing infrastructure for scalable data storage and processing. Many recent works have focused on the performance and cost analysis of cloud platforms, and on extending the services they provide. For instance, [109] focuses on extending public cloud services with basic database primitives, while [22] proposes extensions to the MapReduce paradigm [168] for efficient parallel processing of queries in cloud infrastructures.