15319/15619 | Class Profile

Description

This project-based online course focuses on skill building across various aspects of cloud computing. We cover conceptual topics and provide hands-on experience through projects that use public cloud infrastructures: Amazon Web Services (AWS), Microsoft Azure and Google Cloud Platform (GCP).
Students will utilize MapReduce, interactive programming using Jupyter Notebooks, and data science libraries to clean, prepare and analyze a large data set. Students will orchestrate the deployment of auto-scaled, load-balanced and fault-tolerant applications using virtual machines (VMs), Docker containers and Kubernetes, as well as serverless computing through Functions as a Service. Students will explore and experiment with different distributed cloud-storage abstractions (distributed file systems and databases) and compare their features, capabilities, applicability and consistency models. In addition, students will develop different analytics applications using batch, iterative and stream processing frameworks. The 15-619 [graduate] students will participate in a team project, which entails designing and implementing a complete web-service solution for querying big data. For the team project, the student teams are evaluated based on the cost and performance of their web service.
Conceptually, the course will introduce this domain and cover the topics of cloud infrastructures, virtualization, software-defined networks and storage, cloud storage, and programming models (analytics frameworks). As an introduction, we will discuss the motivating factors, benefits and challenges of the cloud, as well as service models, service level agreements (SLAs), security, example Cloud Service Providers and use cases. Modern data centers enable many of the economic and technological benefits of the cloud paradigm; hence, we will describe several concepts behind data center design and management and software deployment. Next, we will focus on virtualization as a key cloud technique for offering software, computation and storage services. Within the same theme of virtualization, students will also be introduced to Software Defined Networks (SDN) and Software Defined Storage (SDS). Subsequently, students will learn about different cloud storage concepts including data distribution, durability, consistency and redundancy. We will discuss distributed file systems, NoSQL databases and object storage. Finally, students will learn the details of the MapReduce programming model and gain a broad overview of the Spark and GraphLab programming models, as well as message queues (Kafka) and stream processing (Samza).

General Information

Course Web Page

https://www.cs.cmu.edu/~msakr/15619-s19/

TheProject.Zone

https://theproject.zone

Canvas Course

https://canvas.cmu.edu/courses/8869

Calendar of TA Office Hours

https://calendar.google.com/calendar/embed?src=es85648jqofmrnnlnrup3nbuus%40group.calendar.google.com&ctz=America/New_York

Announcements

Announcements are not public for this course.

Staff Office Hours

Majd F. Sakr

Yan Shen

Eric Song

Marshall An

Cameron

Peiyao Zhou

Apoorv Gupta

Sriharsha Bandaru

Zexian Wang

Zizhe Liu

Zhiyang Liu

Ye Li

Joey Pinto

Shaleen Kumar Gupta

Dachi Chen

Chang Xu

Siang Gao

Sai Kiriti Badam

Sahil Hasan

Xuannan Su

Siddharth Kandimalla
