Description

The ubiquity of big data along with modern distributed and cloud computing infrastructures makes it possible to leverage data assets from a variety of sources in order to compute large-scale analytics that serve decision-makers and the public good. A major hurdle to unleashing this potential is the need to trust the entity that gathers the data. Secure Multi-Party Computation (MPC) relies on cryptographic constructs to overcome this hurdle by allowing computation to be performed in a way that reveals only the output result, and reveals nothing about the input or intermediate values used in the computation other than what can be derived from the output. Secure MPC has been an active area of research for over 30 years, with many theoretical results and software artifacts. While fast enough to use today on small-scale data, MPC faces three key challenges that inhibit adoption at scale: the high learning curve of developing private analytics, the challenge of connecting private analytics to existing data stacks, and the inability to balance the privacy and performance provided by the analytic.

This team-taught course aims to prepare/recruit students interested in pursuing research that tackle these challenges by focusing on the deployment of state-of-the-art MPC technologies toward applications with important economic and social justice benefits, such as pay equity, economic stability of the banking system, market diversity to detect monopolies, and distributed network anomaly detection. The course leverages active research and real systems developed at BU by the team of instructors and their students.

We will first review in a lecture/reading group format the mathematical and algorithmic foundations of different MPC frameworks and systems. Next, we will also review some real-world instances of MPC deployment and the associated challenges. Finally, we will break up into project-oriented teams who will set up and deploy MPC systems by applying them to real-world or real-world-inspired data analysis problems. In particular, students will be expected to develop expertise in the use of one or more of the existing MPC libraries/APIs/systems that have been developed over the last few years, leading each student or small groups of students to a course project that will either use these existing MPC capabilities to tackle a large-scale big-data analytics problem, or else integrate these capabilities into the backend of popular big-data analytics platforms. Projects ideas from students that complement and/or augment those proposed by the instructors are welcome!

General Information

Lecture time and location
Mondays 2:30-5:15 PM in MCS 180.

Attendance after 4:30 PM is optional. In particular, this class does not conflict with Adam Smith's CS 591 S1 course on Adaptive Data Analysis.
Participation and grading
Students will be evaluated in this class on the following metrics.
* Attendance at lectures.
* Scribe notes for at least two lectures.
* Participation in the Piazza forums about assigned material to read outside of class.
* The majority of the grade shall come from the student's final project. Project deliverables shall include a report, presentation to the class, and source code.
Collaboration policy
You are welcome and encouraged to discuss your work on Piazza with your classmates, read publications or other resources on the web, and share code with others. But, always remember the following three rules:

1. Your submitted report must be your own writing.
2. Your work must cite all of the sources you consult, including personal conversations with people outside of your project! Be verbose: if you are uncertain whether an interaction is citation-worthy, please include it.
3. You will be graded on your novel contributions. So, while it's encouraged to build upon the work of others, make sure that your contribution is a substantial one!

Violations of this policy will result in a grade reduction. They may also be treated as plagiarism and referred to BU’s Academic Conduct Committee.
Prerequisites
Required: basics of number theory, abstract algebra, and probability theory for CS applications (covered in CS 235 and CS 237); experience with building non-trivial multi-module applications in a modern programming language (such as Python, C, Java, and so on).

Not required, but good to have: familiarity with or interest in distributed/cloud systems (covered in CS 350 or CS 451) and basics of cryptography and network security (covered in CS 538 or CS 558).

Announcements

Course schedule
9/11/17 10:55 AM

DateLecture topicVideo viewingPost-lecture reading assignment
9/11Intro to secure computing: definitions, (threshold) secret sharing, and BGWLindell: MPC definitions (watch 9:09-36:53)
  1. Shen et al: MPC future vision
  2. Thompson: Reflections on Trusting Trust
  3. Lindell, Pinkas: Privacy-preserving data mining
9/18MPC in the 1980s: Yao & GMW, oblivious transferRosulek: GC optimizations
  1. Bestavros et al: Privacy-preserving analytics
  2. Lindell, Pinkas: proof of Yao's Garbled Circuits
  3. Cramer et al: Chapters 1 and 3 (Optional)
9/25Outsourced & distributed computingFuller: Cryptographically protected database searchScalability! But at what COST?
10/2Distributed computing frameworks: Hadoop, Spark, NaiadDerek Murray: Naiad, A Timely Dataflow System
  1. MapReduce: Simplified Data Processing on Large Clusters
  2. DryadNaiad (Optional)
10/10*Malicious security & Frameworks part 1



* Note: class is on a Tuesday

Ivan Damgard: The SPDZ Protocol, Part 1 and Part 2

10/16Frameworks part 2 & ORAMTal Rabin: Oblivious RAM for MPCChung and Pass: A Simple ORAM
10/23Primer on malicious security and usability

Andrei Lapets




https://youtu.be/Q3SP3-YKoc8







Sean Smith
https://www.youtube.com/watch?v=vPXK-Gt0H8A




(Optional/interesting usability talk from same event)

  1. Why Johnny Can't Encrypt
  2. Why Johnny Still Can't Encrypt (Optional)
  3. Why Johnny Still, Still Can't Encrypt (Optional)
  4. Why Doesn't Jane Protect Her Privacy? (Optional)
10/30MPC optimizationsMichael Zohner: ABY- A Framework for Efficient Mixed Protocol Secure Two-Party Computation

Read 1 of 3:


  1. Araki et al's high-throughput MPC
  2. Asharov et al's fast OT extensions
  3. Demmler et al's mixed protocol framework
11/6Conclave: Secure Multi-Party Computation on Big Data & Project discussionCormac Herley: Science, Security, and the Elusive Goal of Security as a Scientific PursuitDraft of paper on conclave
11/13Differential privacy (part 1) & Database search

Hugo Krawczyk: Highly-Scalable Searchable Symmetric Encryption with Support for Boolean Queries

Optional: Hugo's longer set of videos at the 6th Bar-Ilan Winter School, parts 0 (the intro), 1, 2, and 3

Fuller et al: Cryptographically protected database search
11/20Differential privacy (part 2) & Set intersection(None)Pinkas: Set intersection
11/27Mixed mode 2PC(None)(None)
12/4MPC versus blockchains(None)(None)
12/11Project presentations(None)(None)

Note: some of the paper links are behind paywalls. As a result, you will only be able to read the paper while on the BU network. If you want to get the papers while away from BU, follow the instructions on the IS&T page about VPNs.

#pin

Staff Office Hours
NameOffice Hours
Mayank Varia
When?
Where?
Andrei Lapets
When?
Where?
Azer Bestavros
When?
Where?
Ran Canetti
When?
Where?
Malte Schwarzkopf
When?
Where?