Bay Area Theory Symposium: BATS 2006

IBM Almaden Research Center
December 1, 2006

The Bay Area Theory Symposium (BATS) is an annual one-day event organized by the theory research groups at UC Berkeley, Stanford, IBM Almaden, and Microsoft Research Silicon Valley. All interested parties are invited!

Location and Date

Location: IBM Almaden Research Center. PLEASE consult the site's driving directions (or get to 300 Bernal Road, San Jose, CA and then go straight (up) through the park to arrive at the IBM gate). Parking is free.

Date: Friday, December 1, 2006, 10:30 am -- 6:00 pm.

Registration

If you plan to come, please email us at the following address (write your name and affiliation and put "BATS registration" as the subject):

bats@almaden.ibm.com

Please register by Nov. 23 (Thanksgiving day), if possible, so that we can notify reception and plan for food and beverages. (Those registering afterwards should still attend.)


Program


10:30 - 11:00
Coffee and pastries
11:00 - 12:00
Ramesh Johari, Stanford University
Interdomain Routing: Algorithmic Objectives and Economic Constraints
12:00 - 12:30
Samantha Riesenfeld, UC Berkeley
Sorting algorithms for partial orders
12:30 - 2:00
Lunch (provided)
2:00 - 2:30
Aranyak Mehta, IBM Almaden
Design is as Easy as Optimization
2:30 - 3:30
Dimitris Achlioptas, UC Santa Cruz
Random Constraint Satisfaction Problems: from Physics to Algorithms
3:30 - 4:00
Sergei Vassilvitskii, Stanford University
k-means++: The advantages of careful seeding
4:00 - 4:30
Coffee break
4:30 - 5:00
Kunal Talwar, Microsoft
The Price of Privacy and the Limits of LP Decoding
5:00 - 6:00
Elchanan Mossel, UC Berkeley
Computational Hardness via Gaussian Isoperimetric Inequalities


Previous meetings

Please see the BATS permanent webpage for meetings since the 2001 revival of BATS.

Organizers

Dan Boneh (Stanford), Steve Chien (Microsoft), Robi Krauthgamer (IBM), Tim Roughgarden (Stanford), Alistair Sinclair (Berkeley)
For local arrangements questions, contact bats@almaden.ibm.com.

Talk Abstracts

Dimitris Achlioptas, UC Santa Cruz

Random Constraint Satisfaction Problems: from Physics to Algorithms

For a number of random constraint satisfaction problems, such as random k-SAT and random graph coloring, we now have excellent estimates of the largest constraints-to-variables ratio for which they have solutions. These estimates imply that all known polynomial-time algorithms stop finding solutions much before solutions cease to  exist. To understand the origin of this phenomenon we study the evolution of the structure of the space of solutions in random CSPs as constraints are added. In particular, we will survey both what physicists hypothesize happens in this evolution, and what has been  rigorously proven so far. We will see that the rigorous results are  both in agreement with the physics predictions and help demystify Survey Propagation, an extremely successful heuristic proposed by physicists for random CSPs.


Ramesh Johari, Stanford University

Interdomain Routing: Algorithmic Objectives and Economic Constraints

This talk surveys the interaction between economic constraints and algorithmic objectives in the context of Internet interdomain routing. Interdomain routing refers to the routing of packets *between* independently owned and operated networks.  Each such network encodes various economic, business, and performance decisions in a *routing policy*.  As we discuss, the resulting protocol differs in significant ways from traditional shortest path routing; most importantly, it may fail to converge.

After describing the interdomain routing problem, we will discuss the constraints placed on any suggestion for a replacement.  Because the Internet is composed of many independent, competing networks, the interdomain routing system should provide *autonomy*; i.e., it should allow network operators to set their policies independently, rather than requiring joint coordination of routing policies to ensure convergence.    Under this *economic* reality, the protocol design problem can be reformulated axiomatically: do there exist stable protocols that respect autonomy?  We discuss one mathematical formulation of this problem and its resolution, and suggest avenues for future work.

The talk will be entirely self-contained; no prior knowledge of interdomain routing will be assumed.

This is joint work with Nick Feamster (Georgia Tech) and Hari Balakrishnan (MIT).

Aranyak Mehta, IBM Almaden Research Center

Design is as Easy as Optimization

Over the last four decades, theoreticians have identified and studied several fundamental genres of algorithmic problems. These include decision, search, optimization, counting, enumeration, random generation, and approximate counting problems.

We identify a new genre of algorithmic problems -- design problems. Every optimization problem leads to a natural design problem. For instance, the minimum spanning tree problem leads to: given an undirected graph $G = (V,E)$ and a bound $B$, find a way of distributing weight $B$ on the edges of $G$ so that the weight of the minimum spanning tree is maximized.

Practitioners have always been faced with design problems and such problems have been studied individually by theoreticians as well. However, this genre has not been subjected to a systematic complexity-theoretic study before. Using techniques of Freund-Schapire and its follow-up works in learning theory, we show that for a large class of problems, the design version is as easy as the optimization version in the following sense: given an oracle (exact or approximation) for the optimization version, the design version can be solved (exactly or with the same approximation factor) in polynomial time.

Joint work with Deeparnab Chakrabarty and Vijay V. Vazirani.
Paper available at: http://www-static.cc.gatech.edu/~aranyak/design-fullversion.pdf

Elchanan Mossel, UC Berkeley

Computational Hardness via Gaussian Isoperimetric Inequalities

In this talk I will review some applications of a new probabilistic invariance principle derived jointly with R. O'Donnell and K. Oleszkiewicz for functions that are stable against noise acting independently on each coordinate in a product space.

This invariance principle in conjunction with a strong "hardness of approximation" paradigm developed by Khot allows to obtain several new/tight hardness of approximation results for graph problems such are MAX-CUT and graph coloring.

Samantha Riesenfeld, UC Berkeley

Sorting algorithms for partial orders

Partial orders arise naturally in a variety of contexts:  for example, in rankings of candidates (job applicants, web pages), one candidate precedes another if the former is preferable, and in chemical reaction networks, one substance precedes another if a change in the concentration of the former directly or indirectly affects the concentration of the latter.  While fundamental problems, such as sorting, in the domain of total orders are well understood, it seems that corresponding problems in the domain of partial orders have generally not been considered before.

Our work focuses on various algorithmic problems -- sorting, insertion, merging, and selection -- for partially ordered sets of bounded width.  We give algorithms that achieve, or come close to, the information theoretic lower bounds for the numbers of comparisons.  We also consider computationally efficient implementations of these algorithms.  Finally, we give an algorithm for the problem of finding a minimum-width decomposition into chains that, unlike traditional algorithms, does not require a priori the complete structure of the partial order.
 
Joint work with Constantinos Daskalakis, Richard Karp, Elchanan Mossel, and Elad Verbin.

Kunal Talwar, Microsoft

The Price of Privacy and the Limits of LP Decoding

Suppose one encodes an n-dimensional real vector x as y=Ax, for a suitably chosen A, and an adversary arbitrarily corrupts some of the entries of y to get y'. The surprising fact, proved by Donoho, is that by taking the entries of A as i.i.d. Gaussians, the vector x can be *exactly* recovered by minimizing the L_1 norm |y' - Ax'| over all x', provided only a tiny constant fraction of the entries of y were corrupted. Our principal result is the discovery of a sharp threshhold rho* ~ 0.239, such that this L_1 minimization succeeds upto any error rate less than rho*, but fails for rates > rho*.  This resolves an open question of Candes, Rudelson, Tao, and Vershyin.

Our interest in this problem arose while investigating the price, in accuracy, of protecting privacy in a statistical database. Our results say that any privacy mechanism, interactive or non-interactive, providing  reasonably accurate answers to a 0.761 fraction of randomly generated weighted subset sum queries, and arbitrary answers on the remaining 0.239 fraction, is blatantly non-private.

Joint work with Cynthia Dwork and Frank McSherry.

Sergei Vassilvitskii, Stanford University

k-means++: The advantages of careful seeding

The k-means method is a widely used clustering technique that seeks to minimize the average squared distance between points in the same cluster. Although it offers no accuracy guarantees, its simplicity and speed are very appealing in practice. By augmenting k-means with a very simple, randomized seeding technique, we obtain an algorithm that is \Theta(log k)-competitive with the optimal clustering. Our experiments show that this augmentation improves both the speed and the accuracy of k-means, often quite dramatically.

This is joint work with David Arthur.