Description
This project-based online course focuses on building skills across various aspects of cloud computing. We cover conceptual topics and provide hands-on experience through projects that use three public cloud infrastructures: Amazon Web Services (AWS), Microsoft Azure and Google Cloud Platform (GCP).
Students will use MapReduce, interactive programming with Jupyter Notebooks, and data science libraries to clean, prepare and analyze a large data set. Students will orchestrate the deployment of auto-scaled, load-balanced and fault-tolerant applications using virtual machines (VMs), Docker containers and Kubernetes, as well as serverless computing through Functions as a Service. Students will explore and experiment with different distributed cloud-storage abstractions (distributed file systems and databases) and compare their features, capabilities, applicability and consistency models. In addition, students will develop different analytics applications using batch, iterative and stream processing frameworks. The 15-619 (graduate) students will participate in a team project, which entails designing and implementing a complete web-service solution for querying big data. Teams are evaluated on the cost and performance of their web service.
Conceptually, the course will introduce this domain and cover the topics of cloud infrastructures, virtualization, software-defined networks and storage, cloud storage, and programming models (analytics frameworks). As an introduction, we will discuss the motivating factors, benefits and challenges of the cloud, as well as service models, service level agreements (SLAs), security, example Cloud Service Providers and use cases. Modern data centers enable many of the economic and technological benefits of the cloud paradigm; hence, we will describe several concepts behind data center design, data center management and software deployment. Next, we will focus on virtualization as a key cloud technique for offering software, computation and storage services. Within the same theme of virtualization, students will also be introduced to Software Defined Networks (SDN) and Software Defined Storage (SDS). Subsequently, students will learn about different cloud storage concepts including data distribution, durability, consistency and redundancy. We will discuss distributed file systems, NoSQL databases and object storage. Finally, students will learn the details of the MapReduce programming model and gain a broad overview of the Spark and GraphLab programming models, as well as message queues (Kafka) and stream processing (Samza).
General Information
Course Web Page
TheProject.Zone
Canvas Course
Name | Office Hours | |
---|---|---|
Majd F. Sakr | When? Where? | |
Yan Shen | When? Where? | |
Eric Song | When? Where? | |
Marshall An | When? Where? | |
Cameron | When? Where? | |
Peiyao Zhou | When? Where? | |
Apoorv Gupta | When? Where? | |
Sriharsha Bandaru | When? Where? | |
Zexian Wang | When? Where? | |
Zizhe Liu | When? Where? | |
Zhiyang Liu | When? Where? | |
Ye Li | When? Where? | |
Joey Pinto | When? Where? | |
Shaleen Kumar Gupta | When? Where? | |
Dachi Chen | When? Where? | |
Chang Xu | When? Where? | |
Siang Gao | When? Where? | |
Sai Kiriti Badam | When? Where? | |
Sahil Hasan | When? Where? | |
Xuannan Su | When? Where? | |
Siddharth Kandimalla | When? Where? | |
The Project Zone | When? Where? |