Bay Area
Theory Symposium: BATS
2006
IBM Almaden
Research Center
December
1, 2006
The Bay Area Theory
Symposium (BATS) is an annual one-day event
organized by the theory research groups at UC Berkeley, Stanford, IBM
Almaden, and Microsoft Research Silicon Valley. All interested parties
are invited!
Location and Date
Location: IBM Almaden
Research Center.
PLEASE consult the site's
driving directions (or get to 300
Bernal Road, San Jose, CA and then go straight (up) through the park to
arrive at the IBM gate). Parking is free.
Date: Friday, December 1, 2006,
10:30 am -- 6:00 pm.
Registration
If you plan to come, please email us at the following address (write
your name and affiliation and put "BATS registration" as the subject):
bats@almaden.ibm.com
Please
register by Nov. 23 (Thanksgiving
day), if possible, so
that we
can notify reception and plan for food and beverages. (Those
registering afterwards should still attend.)
Program
Previous
meetings
Please see the BATS permanent webpage for
meetings since the 2001 revival of BATS.
Organizers
Dan Boneh (Stanford), Steve Chien (Microsoft), Robi Krauthgamer (IBM),
Tim Roughgarden (Stanford), Alistair Sinclair (Berkeley)
For local arrangements questions, contact bats@almaden.ibm.com.
Talk Abstracts
Dimitris Achlioptas, UC Santa Cruz
Random Constraint Satisfaction
Problems: from Physics to Algorithms
For a number of random constraint satisfaction problems, such as random
k-SAT and random graph coloring, we now have excellent estimates of the
largest constraints-to-variables ratio for which they have solutions.
These estimates imply that all known polynomial-time algorithms stop
finding solutions much before solutions cease to exist. To
understand
the origin of this phenomenon we study the evolution of the structure
of the space of solutions in random CSPs as constraints are added. In
particular, we will survey both what physicists hypothesize happens in
this evolution, and what has been rigorously proven so far. We
will
see that the rigorous results are both in agreement with the
physics
predictions and help demystify Survey Propagation, an extremely
successful heuristic proposed by physicists for random CSPs.
Ramesh Johari, Stanford University
Interdomain Routing: Algorithmic
Objectives and Economic Constraints
This talk surveys the interaction between economic constraints and
algorithmic objectives in the context of Internet interdomain routing.
Interdomain routing refers to the routing of packets *between*
independently owned and operated networks. Each such network
encodes various economic, business, and performance decisions in a
*routing policy*. As we discuss, the resulting protocol differs
in significant ways from traditional shortest path routing; most
importantly, it may fail to converge.
After describing the interdomain routing problem, we will discuss the
constraints placed on any suggestion for a replacement. Because
the Internet is composed of many independent, competing networks, the
interdomain routing system should provide *autonomy*; i.e., it should
allow network operators to set their policies independently, rather
than requiring joint coordination of routing policies to ensure
convergence. Under this *economic* reality, the
protocol design problem can be reformulated axiomatically: do there
exist stable protocols that respect autonomy? We discuss one
mathematical formulation of this problem and its resolution, and
suggest avenues for future work.
The talk will be entirely self-contained; no prior knowledge of
interdomain routing will be assumed.
This is joint work with Nick Feamster (Georgia Tech) and Hari
Balakrishnan (MIT).
Aranyak Mehta, IBM Almaden Research Center
Design is as Easy as Optimization
Over the last four decades, theoreticians have identified and studied
several fundamental genres of algorithmic problems. These include
decision, search, optimization, counting, enumeration, random
generation, and approximate counting problems.
We identify a new genre of algorithmic problems -- design problems.
Every optimization problem leads to a natural design problem. For
instance, the minimum spanning tree problem leads to: given an
undirected graph $G = (V,E)$ and a bound $B$, find a way of
distributing weight $B$ on the edges of $G$ so that the weight of the
minimum spanning tree is maximized.
Practitioners have always been faced with design problems and such
problems have been studied individually by theoreticians as well.
However, this genre has not been subjected to a systematic
complexity-theoretic study before. Using techniques of Freund-Schapire
and its follow-up works in learning theory, we show that for a large
class of problems, the design version is as easy as the optimization
version in the following sense: given an oracle (exact or
approximation) for the optimization version, the design version can be
solved (exactly or with the same approximation factor) in polynomial
time.
Joint work with Deeparnab Chakrabarty and Vijay V. Vazirani.
Paper available at: http://www-static.cc.gatech.edu/~aranyak/design-fullversion.pdf
Elchanan Mossel, UC Berkeley
Computational Hardness via Gaussian
Isoperimetric Inequalities
In this talk I will review some applications of a new probabilistic
invariance principle derived jointly with R. O'Donnell and K.
Oleszkiewicz for functions that are stable against noise acting
independently on each coordinate in a product space.
This invariance principle in conjunction with a strong "hardness of
approximation" paradigm developed by Khot allows to obtain several
new/tight hardness of approximation results for graph problems such are
MAX-CUT and graph coloring.
Samantha Riesenfeld, UC Berkeley
Sorting algorithms for partial orders
Partial orders arise naturally in a variety of contexts: for
example, in rankings of candidates (job applicants, web pages), one
candidate precedes another if the former is preferable, and in chemical
reaction networks, one substance precedes another if a change in the
concentration of the former directly or indirectly affects the
concentration of the latter. While fundamental problems, such as
sorting, in the domain of total orders are well understood, it seems
that corresponding problems in the domain of partial orders have
generally not been considered before.
Our work focuses on various algorithmic problems -- sorting, insertion,
merging, and selection -- for partially ordered sets of bounded
width. We give algorithms that achieve, or come close to, the
information theoretic lower bounds for the numbers of
comparisons. We also consider computationally efficient
implementations of these algorithms. Finally, we give an
algorithm for the problem of finding a minimum-width decomposition into
chains that, unlike traditional algorithms, does not require a priori
the complete structure of the partial order.
Joint work with Constantinos Daskalakis, Richard Karp, Elchanan Mossel,
and Elad Verbin.
Kunal Talwar, Microsoft
The Price of Privacy and the Limits of
LP Decoding
Suppose one encodes an n-dimensional real vector x as y=Ax, for a
suitably chosen A, and an adversary arbitrarily corrupts some of the
entries of y to get y'. The surprising fact, proved by Donoho, is that
by taking the entries of A as i.i.d. Gaussians, the vector x can be
*exactly* recovered by minimizing the L_1 norm |y' - Ax'| over all x',
provided only a tiny constant fraction of the entries of y were
corrupted. Our principal result is the discovery of a sharp threshhold
rho* ~ 0.239, such that this L_1 minimization succeeds upto any error
rate less than rho*, but fails for rates > rho*. This resolves
an open question of Candes, Rudelson, Tao, and Vershyin.
Our interest in this problem arose while investigating the price, in
accuracy, of protecting privacy in a statistical database. Our results
say that any privacy mechanism, interactive or non-interactive,
providing reasonably accurate answers to a 0.761 fraction of
randomly generated weighted subset sum queries, and arbitrary answers
on the remaining 0.239 fraction, is blatantly non-private.
Joint work with Cynthia Dwork and Frank McSherry.
Sergei Vassilvitskii, Stanford
University
k-means++: The advantages of careful
seeding
The k-means method is a widely used clustering technique that seeks to
minimize the average squared distance between points in the same
cluster. Although it offers no accuracy guarantees, its simplicity and
speed are very appealing in practice. By augmenting k-means with a very
simple, randomized seeding technique, we obtain an algorithm that is
\Theta(log k)-competitive with the optimal clustering. Our experiments
show that this augmentation improves both the speed and the accuracy of
k-means, often quite dramatically.
This is joint work with David Arthur.