15-855 Introduction to Computational Complexity Theory
Fall 2015, Lecture 1 (Sep 8, 2015)

Course administrivia:
- Course staff introduction
- Sign up for Piazza: https://piazza.com/cmu/fall2015/15855/home
  We will use Piazza for posting announcements, notes, and problem sets. You can
  use Piazza to ask questions of the instructors and fellow students.

EVALUATION:
- 5-6 biweekly homeworks, one take-home midterm, class attendance and
  participation. LaTeX-typeset solutions are strongly preferred and encouraged.
  The collaboration policy will be spelled out in the first problem set.

Tentative timings (subject to change):
  PS1: Sep 10-24
  PS2: Sep 24-Oct 8
  PS3: Oct 8-22
  Midterm: Oct 22-27 (no collaboration allowed)
  PS4: Oct 27-Nov 10
  PS5: Nov 10-24 (note: no lecture on Nov 24)
  PS6: Dec 1-10

LECTURE 1:

This course is a graduate-level introduction to computational complexity theory. The subject aims to understand the power and limitations of computation with finite/limited resources. (Computability theory asks: what can be computed with unbounded resources? Complexity theory asks: what is computationally feasible with limited resources?) We classify problems according to the computational resources they need (time, space, communication, randomness, parallelism, circuit size, etc.).

Broadly, the topics in complexity theory fall into two rough kinds. One is the "standard setting": realize a given mapping from inputs to outputs efficiently, with bounded resources (time/space). We have models of computation that represent various capabilities of digital computing devices, including randomness, non-uniformity, parallelism, and quantum effects. We also have models based on nondeterminism, counting, and alternation, precisely capturing the power needed to efficiently compute important types of mappings. The study of this part revolves around intricate relationships between these models, and (in very few cases) separation results.
The second is the "modern setting", where one studies computational processes that arise in diverse areas/applications, each with its own relevant efficiency measure, and with many interconnections among themselves and to "standard" topics. This includes interactive and probabilistically checkable proofs (verification); pseudorandomness (cryptography and combinatorics, among other applications); learning theory (AI); communication complexity (distributed computing); streaming complexity (big data); property testing (combinatorics, sublinear computation); query complexity (databases, quantum computing); etc. These are all substantial research areas in their own right. The course will cover a subset of topics from each of these directions of study within complexity theory, and try to highlight the unity of the techniques used.

THE CENTRAL QUESTIONS:

1. Is finding a solution as easy as recognizing one? Sudoku solving vs. checking. (P = NP?)

2. Is it easy to certify (i.e., give a short proof) whenever a Sudoku puzzle has no solution? (NP = coNP?)

3. Is time indeed more precious than space? (We can't reuse time.) Are games like chess easy? (P = PSPACE?)

4. Can every efficient algorithm be converted into one that uses a tiny amount of memory? (P = L: the power of a constant number of pointers?)
   Example: graph connectivity has simple linear-time algorithms (DFS), but these use linear space. Are there small-space algorithms? Directed vs. undirected graphs makes a difference (at least as per our current understanding).

5. Time-space trade-offs:
   * Must linear-time graph exploration algorithms necessarily use nearly linear space? (NL \neq SC?)
   * Given that time and space lower bounds by themselves are hard, can we prove, say, that SAT can't be solved simultaneously in linear time and with a constant number of pointers? Answer: YES!

6. Can every sequential algorithm be parallelized to obtain a substantial speed-up? (P = NC?)

7. Does spot-checking proofs, allowing for a small probability of error, increase the power of what we can efficiently verify?
   (YES: PCP(poly,poly) = NEXP)

8. Does interacting with the prover increase the power of proof verification? (YES, if we also allow randomization: IP = PSPACE)

9. Could polynomial-size circuits be the panacea and solve even problems that *require* exponential time? (EXP \subset P/poly?)

10. Can every randomized algorithm be converted into a deterministic one with small slowdown? (P = BPP?)

-----------------------------

We believe we know the answers to all of the above questions, but proofs, or even promising approaches, are a far cry. (This is why complexity is an exciting subject to go into: one can make a real difference if one is creative enough. Example: Ryan Williams, a CMU PhD alum, separated NEXP from ACC^0 a few years back; we will discuss some aspects of this breakthrough later in the semester.) Our fundamental world view would alter dramatically if we are wrong about any of these hunches.

A surprising fact: we believe that P = NP, P = PSPACE, P = L, EXP \subset P/poly, and P = NC are all false, yet the prevailing belief is that P = BPP is true. (Not that we can even *prove* BPP is strictly weaker than EXP, or even NEXP!)

A beautiful connection: if SAT or other problems we think are hard are indeed hard (i.e., require exponential-size circuits), then the belief that there are problems with a randomized polynomial-time algorithm but no deterministic one is FALSE. Thus, the "common" perception/opinion is WRONG on at least one of these questions. The course will try to give a taste of some such surprising connections and interrelationships between various complexity-theoretic phenomena.

-----------------------------

With that big picture in mind, let us start with some nitty-gritty. This is mostly review material.

We need a formal notion of a computational problem (such as graph connectivity, shortest path, factoring, coloring maps, satisfiability). We express inputs as strings over an alphabet Sigma via some encoding.

A general search problem is given by a relation R \subset Sigma^* x Sigma^*. Given x, the goal is to find y such that R(x,y) holds.
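To make this definition concrete, here is a minimal Python sketch of a search problem given by a relation. The particular relation (proper 3-coloring of a graph) and the encoding as an edge list are illustrative choices, not prescribed by the lecture:

```python
# A search problem is given by a relation R over pairs of strings: given x,
# find y such that R(x, y) holds.  Hypothetical concrete instance:
# x encodes a graph as an edge list, y encodes a candidate 3-coloring.

def R(edges, coloring):
    """The relation for graph 3-coloring: does `coloring` assign each vertex
    one of 3 colors so that no edge has both endpoints the same color?"""
    return (all(c in (0, 1, 2) for c in coloring.values()) and
            all(coloring[u] != coloring[v] for u, v in edges))

# Checking a proposed solution y is easy (one pass over x and y)...
triangle = [(0, 1), (1, 2), (0, 2)]
print(R(triangle, {0: 0, 1: 1, 2: 2}))  # proper coloring -> True
print(R(triangle, {0: 0, 1: 0, 2: 2}))  # vertices 0 and 1 clash -> False
# ...but *finding* such a y from x alone is precisely question 1 (P = NP?).
```

Note the asymmetry the sketch exposes: verifying R takes time linear in the input, while no efficient procedure for producing y is known.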
Example: the shortest path problem.

A special case is when R is a function: for each x there is at most one y such that (x,y) \in R. Factoring is an example.

A further special case, which will mostly be our focus, is a decision problem, where the function is Boolean: f : Sigma^* -> {0,1} (Accept/Reject).

This simplification doesn't give up much - one can typically solve a search problem efficiently using a related decision problem.
- Caveat: sometimes the decision problem is vacuously trivial while the search problem is not, so there are some subtleties. Examples: finding a Nash equilibrium; finding a second Hamiltonian cycle in a 3-regular graph (given one Hamiltonian cycle, a second always exists, so the decision question is trivial).
- Compositeness vs. factoring: deciding compositeness seems much easier than finding factors. (But if we instead ask the decision question "does there exist a factor < k?", the answers let us recover the factors.)
- We will mostly focus on decision problems.

Languages: a language is a subset of strings, L \subset Sigma^*.
The decision problem associated with L: given x, is x in L?

-----

A complexity class is a set/class of languages. It is easy to define classes, but harder to define "meaningful" classes - ones that capture some genuine computational phenomenon, are robust under minor variations of the computing model, and contain natural and relevant problems, perhaps "complete" problems. We define classes based on a model of computation so that they capture important aspects of computation.

Our model: multitape Turing machines.

Church-Turing thesis: anything computable on a physically realizable computing device can also be computed by a Turing machine.

Extended Church-Turing thesis: anything efficiently computable on a physically realizable computing device can be computed by a Turing machine with at most polynomial slowdown. This thesis is challenged by quantum computing, though it is not yet clear whether general-purpose quantum computation is physically realizable.
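Returning to the claim above that search reduces to decision: for SAT this is the classic self-reducibility argument - given a decision oracle for "is this CNF formula satisfiable?", one can find a satisfying assignment by fixing variables one at a time, with one oracle call per variable. A sketch in Python; the brute-force oracle and all names here are illustrative stand-ins, not part of the lecture:

```python
from itertools import product

# CNF encoding (an illustrative convention): a formula is a list of clauses;
# a clause is a list of nonzero ints, where literal +v means variable v
# and -v means NOT v.

def decide_sat(cnf, nvars):
    """Decision oracle: is the formula satisfiable?  (Brute-force stand-in
    for whatever decision procedure we are handed.)"""
    for bits in product([False, True], repeat=nvars):
        if all(any(bits[abs(l) - 1] == (l > 0) for l in clause)
               for clause in cnf):
            return True
    return False

def simplify(cnf, var, val):
    """Substitute var := val: drop satisfied clauses, shrink the rest."""
    out = []
    for clause in cnf:
        if (var if val else -var) in clause:
            continue  # clause already satisfied by this setting
        out.append([l for l in clause if abs(l) != var])
    return out

def find_assignment(cnf, nvars):
    """Search via decision (self-reducibility): fix variables one at a
    time, asking the oracle whether the rest stays satisfiable."""
    if not decide_sat(cnf, nvars):
        return None
    assignment = {}
    for v in range(1, nvars + 1):
        trial = simplify(cnf, v, True)
        if decide_sat(trial, nvars):      # still satisfiable with v = True?
            assignment[v], cnf = True, trial
        else:                             # then v = False must work
            assignment[v], cnf = False, simplify(cnf, v, False)
    return assignment

# (x1 or x2) and (not x1 or x3) and (not x2 or not x3)
cnf = [[1, 2], [-1, 3], [-2, -3]]
print(find_assignment(cnf, 3))  # -> {1: True, 2: False, 3: True}
```

The reduction makes only n + 1 oracle calls plus polynomial extra work, so if the decision problem were in P, so would the search problem be - illustrating why focusing on decision problems "doesn't give up much".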