Overview
Our recent work on algorithms is in the following areas: data stream algorithms, machine learning and optimization, geometric algorithms, ranking and aggregation algorithms, and algorithms on graphs.
Data stream algorithms
Estimating the sortedness of a data stream. The distance to monotonicity of a sequence is the minimum number of edit operations required to transform the sequence into increasing order; this measure is complementary to the length of the longest increasing subsequence (LIS). Parikshit Gopalan, T. S. Jayram, Robert Krauthgamer and Ravi Kumar address the question of estimating these quantities in the one-pass data stream model and present the first sub-linear space algorithms for both problems. Their main result is a randomized algorithm that uses only polylogarithmic space and approximates the distance to monotonicity to within a factor that is arbitrarily close to 4.
- P. Gopalan, T. S. Jayram, R. Krauthgamer and R. Kumar. Estimating the Sortedness of a Data Stream. Symposium on Discrete Algorithms (SODA), 2007.
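As a point of reference, the quantity itself is easy to compute offline: for a sequence of length n, the distance to monotonicity equals n minus the length of the LIS, which patience sorting finds in O(n log n) time. The Python sketch below is this exact offline baseline, not their polylogarithmic-space streaming algorithm.

    import bisect

    def distance_to_monotonicity(seq):
        """Exact offline baseline: n minus the length of the longest strictly
        increasing subsequence, computed by patience sorting in O(n log n)."""
        tails = []  # tails[i] = smallest tail of an increasing subsequence of length i + 1
        for x in seq:
            i = bisect.bisect_left(tails, x)
            if i == len(tails):
                tails.append(x)
            else:
                tails[i] = x
        return len(seq) - len(tails)

    print(distance_to_monotonicity([1, 5, 2, 3, 4]))  # 1: deleting the 5 sorts it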
Estimating statistical aggregates on probabilistic data streams.
The probabilistic-stream model was introduced by Jayram et al. in 2007.
It is a generalization of the data stream model that is suited to
handling probabilistic data. Designing efficient aggregation algorithms
for probabilistic data is crucial for handling uncertainty in
data-centric applications such as OLAP, and other settings such as
analyzing search engine traffic and aggregation in sensor networks.
T.S. Jayram, Andrew McGregor (UCSD), S. Muthukrishnan (Google Research),
and Erik Vee (previously a postdoc at IBM, and currently at Yahoo!
Research) present new algorithms for computing commonly used aggregates
on a probabilistic stream. They present the first one-pass streaming algorithms for estimating the expected mean of a probabilistic stream, exponentially improving the space bounds of the previous multi-pass algorithm. Their work also considers the problem of
estimating frequency moments for probabilistic streams. They propose a
general approach to obtain unbiased estimators working over
probabilistic data by utilizing unbiased estimators designed for
standard streams. Applying this approach, they extend a classical data
stream algorithm to obtain a one-pass algorithm for estimating F2, the
second frequency moment. They also present streaming algorithms for
estimating F0, the number of distinct items, and for estimating the
median on probabilistic streams.
- T. S. Jayram, A. McGregor, S. Muthukrishnan, and E. Vee. Estimating Statistical Aggregates on Probabilistic Data Streams. Principles of Database Systems (PODS), 2007.
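The "classical data stream algorithm" extended for F2 is, in the standard-stream setting, presumably the Alon-Matias-Szegedy (AMS) sign sketch; the Python sketch below is that classical baseline, not the probabilistic-stream extension itself. The lazy per-item sign table stands in for the 4-wise independent hash functions a truly small-space implementation would use.

    import random

    def ams_f2(stream, counters=100, seed=0):
        """One-pass AMS-style sketch for F2 = sum_i f_i^2 on a standard stream.
        Each counter keeps Z = sum_i s(i) * f_i for random signs s(i) in {-1, +1};
        E[Z^2] = F2, and averaging over counters reduces the variance."""
        rng = random.Random(seed)
        signs = [dict() for _ in range(counters)]  # lazy +-1 sign per item, per counter
        z = [0] * counters
        for item in stream:
            for j in range(counters):
                s = signs[j].setdefault(item, rng.choice((-1, 1)))
                z[j] += s
        return sum(v * v for v in z) / counters

    print(ams_f2(["a", "b", "a", "c", "a", "b"]))  # true F2 = 9 + 4 + 1 = 14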
Streaming numerical linear algebra problems
Ken
Clarkson and David Woodruff found one-pass algorithms for some
approximate matrix operations, including the computation of matrix
products and the estimation of best low-rank projections. These
algorithms maintain small "sketches" of the input matrices; with high
probability, the product of the sketches is close to the true matrix
product. Specifically, the Frobenius norm of the error is proportional
to the product of the Frobenius norms of the input matrices. This upper
bound result sharpens and simplifies prior work on this problem, and their matching lower bounds show that little improvement is possible.
- K. Clarkson and D. Woodruff. Numerical Linear Algebra in the Streaming Model. STOC, 2009.
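To make the sketching idea concrete, here is a minimal Python illustration under simplifying assumptions (a dense i.i.d. sign sketch; the paper's constructions and bounds are sharper): compressing both matrices with the same random sign matrix S satisfying E[SᵀS] = I gives an unbiased estimate (SA)ᵀ(SB) of AᵀB, with Frobenius error on the order of ||A||_F ||B||_F / sqrt(k).

    import numpy as np

    def sketched_product(A, B, k, seed=0):
        """Estimate A.T @ B from k-row sketches. S has i.i.d. +-1/sqrt(k)
        entries, so E[S.T @ S] = I and (S @ A).T @ (S @ B) is an unbiased
        estimate with Frobenius error ~ ||A||_F * ||B||_F / sqrt(k)."""
        rng = np.random.default_rng(seed)
        S = rng.choice((-1.0, 1.0), size=(k, A.shape[0])) / np.sqrt(k)
        return (S @ A).T @ (S @ B)

    rng = np.random.default_rng(1)
    A, B = rng.standard_normal((1000, 20)), rng.standard_normal((1000, 30))
    err = np.linalg.norm(sketched_product(A, B, k=200) - A.T @ B)
    print(err / (np.linalg.norm(A) * np.linalg.norm(B)))  # roughly 1/sqrt(200)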
Cascaded aggregates on data streams
The problem of computing cascaded (or multi-level) aggregates is quite basic and
is extremely useful in several applications where grouping by some attribute is
involved. For example, to compute the average stock market volatility, we have
to group the stock market data by the stock symbol, compute the variance of the
"logarithmic return on investment" values recorded for that stock, and then
compute the average of these values over all stocks. Other examples from business intelligence and science include analyzing IP traffic, detecting credit card fraud, and computing low-rank approximations. David Woodruff and T.S. Jayram have
obtained algorithms for computing specific types of cascaded aggregates, and in
particular, their algorithm estimates combinations of frequency moments.
- T.S. Jayram and D. Woodruff. The Data Stream Space Complexity of Cascaded Norms. FOCS, 2009 (to appear).
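For intuition, the stock example is the two-level aggregate AVG∘VAR over (symbol, value) pairs. The hedged Python sketch below computes it exactly with full storage; this is the quantity the small-space streaming algorithms approximate when the pairs arrive as an unordered stream.

    from collections import defaultdict
    from statistics import mean, pvariance

    def avg_of_variances(pairs):
        """Exact two-level (cascaded) aggregate: group (symbol, log-return)
        pairs by symbol, take the variance within each group, then average
        the per-group variances over all groups."""
        groups = defaultdict(list)
        for symbol, log_return in pairs:
            groups[symbol].append(log_return)
        return mean(pvariance(vals) for vals in groups.values())

    data = [("IBM", 0.01), ("IBM", -0.02), ("AAPL", 0.03), ("AAPL", 0.00)]
    print(avg_of_variances(data))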
The average case complexity of counting distinct elements
The problem of probabilistically estimating the number of distinct elements in a data stream is one of the central
problems in data stream computing. To estimate this quantity to within a relative error ε, it is known
that if the stream is adversarially ordered, or even randomly ordered, any such algorithm must use
Ω(1/ε^{2}) bits of space. This can be prohibitive in practice. In an attempt to bypass this lower bound,
David Woodruff studied the average-case complexity of the problem when the streaming data comes from
certain distributions. For some natural distributions corresponding to random uncorrelated data, he gives
an algorithm with space complexity that beats the lower bound for adversarial and randomly-ordered streams.
Nevertheless, he shows that for other natural distributions, the Ω(1/ε^{2}) lower bound still holds.
- D. Woodruff. The average-case complexity of counting distinct elements. ICDT, 2009.
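As a baseline for what these space bounds refer to, here is a standard k-minimum-values estimator in Python (a textbook algorithm, not the average-case algorithm of the paper): hashing items to (0,1) and keeping the k smallest distinct hash values estimates the distinct count as (k-1)/h_k, with relative error roughly 1/sqrt(k), i.e., about 1/ε^2 stored values for error ε.

    import hashlib

    def kmv_distinct(stream, k=256):
        """k-minimum-values baseline for counting distinct elements: hash each
        item to (0, 1) and keep the k smallest distinct hash values; if the
        k-th smallest is h_k, then (k - 1) / h_k estimates the distinct count."""
        def h(x):
            digest = hashlib.sha1(str(x).encode()).digest()
            return (int.from_bytes(digest[:8], "big") + 1) / 2.0**64
        smallest = set()
        for item in stream:
            smallest.add(h(item))
            if len(smallest) > k:
                smallest.remove(max(smallest))  # O(k) per step; a heap is faster
        if len(smallest) < k:
            return len(smallest)                # fewer than k distinct: exact
        return int((k - 1) / max(smallest))

    print(kmv_distinct((i % 1000 for i in range(10**5)), k=128))  # about 1000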
Efficient sketches for Earth-mover distance, with applications
Computing the earth-mover distance is a fundamental problem in
geometric optimization. Together with nearest neighbor search methods,
algorithms for computing it have been applied to fast image search in
large collections of images. In joint work with Alexandr Andoni, Khanh Do Ba, and Piotr Indyk (all from MIT), David Woodruff provided the first sub-linear sketching algorithm for estimating the planar earth-mover distance with a constant approximation. Namely, for points living in the two-dimensional grid [Δ] x [Δ], they achieve an algorithm with space Δ^{ε} and approximation factor O(1/ε), for any 0 < ε < 1. The sketch has immediate applications to the streaming and
nearest neighbor search problems.
- A. Andoni, K. Do Ba, P. Indyk and D. Woodruff. Efficient sketches for Earth-Mover Distance, with applications. FOCS, 2009 (to appear).
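For equal-size point sets, the earth-mover distance is simply a minimum-cost perfect matching, so an exact (cubic-time) baseline fits in a few lines of Python with SciPy's Hungarian-method solver. Their contribution is a Δ^{ε}-space sketch approximating this quantity, which is not reproduced here.

    import numpy as np
    from scipy.optimize import linear_sum_assignment

    def emd_exact(P, Q):
        """Exact earth-mover distance between equal-size planar point sets,
        computed as a minimum-cost perfect matching (Hungarian method)."""
        P, Q = np.asarray(P, float), np.asarray(Q, float)
        cost = np.linalg.norm(P[:, None, :] - Q[None, :, :], axis=2)
        rows, cols = linear_sum_assignment(cost)
        return cost[rows, cols].sum()

    print(emd_exact([(0, 0), (1, 0)], [(0, 1), (1, 1)]))  # 2.0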
Machine learning and optimization
Online learning with prior information
The standard so-called experts algorithms are methods for utilizing a given set of "experts" to make good choices in a sequential
decision-making problem. In the standard setting of experts algorithms,
the decision maker chooses repeatedly in the same "state" based on
information about how the different experts would have performed, if
chosen to be followed. Elad Hazan and Nimrod Megiddo extended this
framework by introducing state information. More precisely, the experts
algorithm may rely on partial information regarding the cost function, which is revealed to it before it chooses an action. This extension is very natural in prediction problems: for illustration, an experts algorithm that predicts whether the next day will be rainy can be extended to make the same prediction given the current temperature. This work introduces new algorithms that attain optimal performance in the new framework and that apply to more general settings than the variants of regression considered in the statistics literature.
- E. Hazan and N. Megiddo. Online Learning with prior information. Conference on Learning Theory (COLT), 2007.
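For readers unfamiliar with the standard experts setting being extended, here is a minimal sketch of the classical baseline (multiplicative weights / Hedge, with an assumed learning rate eta); the paper's algorithms additionally condition on the revealed state information.

    import math, random

    def hedge(loss_rounds, eta=0.5, seed=0):
        """Multiplicative-weights (Hedge) baseline for prediction with expert
        advice: sample an expert with probability proportional to its weight,
        then multiply each weight by exp(-eta * that expert's loss).
        loss_rounds: iterable of per-round lists of expert losses in [0, 1]."""
        rng = random.Random(seed)
        w = None
        total_loss = 0.0
        for losses in loss_rounds:
            if w is None:
                w = [1.0] * len(losses)
            i = rng.choices(range(len(w)), weights=w)[0]
            total_loss += losses[i]
            w = [wi * math.exp(-eta * li) for wi, li in zip(w, losses)]
        return total_loss

    # two experts: the second is correct 90% of the time
    rounds = [[1.0, 0.0] if i % 10 else [0.0, 1.0] for i in range(1000)]
    print(hedge(rounds))  # close to the best expert's total loss of 100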
Adaptive online gradient descent
Peter Bartlett (UC Berkeley), Elad Hazan, and Alexander Rakhlin (UC
Berkeley) propose a new online learning algorithm, which is more
efficient than previous algorithms for a general optimization framework
called online convex optimization. The new algorithm, Adaptive Online
Gradient Descent, interpolates between the previous results of
Zinkevich for linear functions and those of Hazan et al. for strongly
convex functions, achieving optimal performance for both extremes and
intermediate cases. Finally, they provide an extension of these results
to general norms.
- P. Bartlett, E. Hazan, and A. Rakhlin. Adaptive Online Gradient Descent. NIPS, 2007.
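A toy sketch of the two regimes being interpolated (this is textbook online gradient descent, not the adaptive algorithm of the paper): step sizes of order 1/sqrt(t) give O(sqrt(T)) regret for general convex losses (Zinkevich), while 1/(H t) gives logarithmic regret when every loss is H-strongly convex. Adaptive Online Gradient Descent instead chooses its step sizes from the curvature observed so far.

    import numpy as np

    def ogd(grads, project, x0, strong_convexity=0.0):
        """Online gradient descent sketch. With strong_convexity H > 0, step
        1/(H*t) gives O(log T) regret; with H = 0, step 1/sqrt(t) gives
        O(sqrt(T)) regret. grads: list of functions x -> round-t gradient."""
        x = np.asarray(x0, float)
        for t in range(1, len(grads) + 1):
            g = grads[t - 1](x)
            eta = 1.0 / (strong_convexity * t) if strong_convexity > 0 else 1.0 / np.sqrt(t)
            x = project(x - eta * g)
        return x

    # e.g. losses f_t(x) = (x - c_t)^2 on [0, 1], which are 2-strongly convex:
    cs = [0.2, 0.8, 0.5]
    grads = [lambda x, c=c: 2 * (x - c) for c in cs]
    print(ogd(grads, lambda x: np.clip(x, 0, 1), x0=0.0, strong_convexity=2.0))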
Computational equivalence of fixed points and no regret algorithms, and convergence to equilibria
Elad Hazan and Satyen Kale (Princeton) studied the relation between
notions of game-theoretic equilibria which are based on stability under
a set of deviations, and empirical equilibria which are reached by
rational players. Rational players are modelled by players using no
regret algorithms, which guarantee that their payoff in the long run is
almost as large as they could have hoped to achieve by consistently
deviating from the algorithm's suggested action. They show that, for a
given set of deviations over the strategy set of a player, it is
possible to efficiently approximate fixed points of a given deviation
if and only if there exist efficient no regret algorithms resistant to
the deviations. Further, they show that if all players use a no regret
algorithm, then the empirical distribution of their plays converges to
an equilibrium.
- E. Hazan and S. Kale. Computational Equivalence of Fixed Points and No-Regret Algorithms, and Convergence to Equilibria. NIPS, 2007.
Coresets, sparse greedy approximation, and the Frank-Wolfe algorithm
Ken Clarkson has previously shown a relationship between coresets, a
concept arising in geometric algorithms, and sparse greedy
approximation, arising in machine learning. These notions are dual, in
the technical sense of convex optimization duality, and can be
understood as instances of the Frank-Wolfe algorithm, a venerable
technique for convex optimization. In this work, he showed that
coresets exist for general optimization problems, with special cases
including the machine learning techniques of boosting, and of support
vector machines. These results in turn allow some sample-complexity
results for these techniques to be sharpened or simplified, and the
practical "core vector machine" algorithm to be simplified and
generalized.
- K. Clarkson. Coresets, Sparse Greedy Approximation, and the Frank-Wolfe Algorithm. Symposium on Discrete Algorithms (SODA), 2008.
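A minimal Frank-Wolfe sketch over the probability simplex illustrates the sparsity connection: each iteration solves a linear subproblem whose optimum is a single vertex, so the iterate after t steps is a mixture of at most t + 1 vertices. That t-sparse iterate is the coreset/sparse-greedy object in question. (The objective and step rule here are illustrative; the paper treats general optimization problems.)

    import numpy as np

    def frank_wolfe_simplex(grad, n, steps=200):
        """Frank-Wolfe over the probability simplex. The linear subproblem
        argmin_{s in simplex} <g, s> is attained at a single vertex, so after
        t steps the iterate mixes at most t + 1 vertices: a sparse solution."""
        x = np.zeros(n)
        x[0] = 1.0                          # start at a vertex
        for t in range(steps):
            s = np.zeros(n)
            s[np.argmin(grad(x))] = 1.0     # best vertex for the linearization
            gamma = 2.0 / (t + 2)           # standard step size
            x = (1 - gamma) * x + gamma * s
        return x

    c = np.array([0.7, 0.2, 0.1, 0.0])      # minimize ||x - c||^2 over the simplex
    print(frank_wolfe_simplex(lambda x: 2 * (x - c), n=4))  # approaches c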
An efficient algorithm for bandit linear optimization
Jacob Abernethy, Elad Hazan, and Alexander Rakhlin introduced an
efficient algorithm for the problem of online linear optimization in
the bandit setting which achieves the optimal O*(T^{1/2}) regret. The setting is a natural
generalization of the non-stochastic multi-armed bandit problem, and
the existence of an efficient optimal algorithm has been posed as an
open problem in a number of recent papers. They showed how the
difficulties encountered by previous approaches are overcome by the use
of a self-concordant potential function.
- J. Abernethy, E. Hazan, and A. Rakhlin. An Efficient Algorithm for Bandit Linear Optimization. Conference on Learning Theory (COLT), 2008.
Extracting certainty from uncertainty: regret bounded by variation in costs
The problem of prediction from expert advice is fundamental in machine
learning. A major pillar of the field is the existence of learning
algorithms whose average loss approaches that of the best expert in
hindsight (in other words, whose average regret approaches zero).
Traditionally the regret of online algorithms was bounded in terms of
the number of prediction rounds. Cesa-Bianchi, Mansour and Stoltz posed the question of whether it is possible to bound the regret of an online algorithm by the variation of the observed costs. In this paper, Elad Hazan and Satyen Kale resolve this question and prove such bounds in
the fully adversarial setting, in two important online learning
scenarios: prediction from expert advice, and online linear
optimization.
- E. Hazan and S. Kale. Extracting Certainty from Uncertainty: Regret Bounded by Variation in Costs. Conference on Learning Theory (COLT), 2008.
Sparse approximate solutions to semidefinite programs
Elad Hazan analyzed a generalization of the Frank-Wolfe algorithm for
approximately maximizing a concave function over the bounded
semi-definite cone which produces sparse solutions (sparsity for SDP
corresponds to low rank matrices).
- E. Hazan. Sparse Approximate Solutions to Semidefinite Programs. LATIN, 2008.
Better algorithms for benign bandits
The online multi-armed bandit problem and its generalizations are
repeated decision making problems, where the goal is to select one of
several possible decisions in every round, and incur a cost associated
with the decision, in such a way that the total cost incurred over all
iterations is close to the cost of the best fixed decision in
hindsight. The difference in these costs is known as the regret of the
algorithm. The term bandit refers to the setting where one only obtains
the cost of the decision used in a given iteration and no other
information. Perhaps the most general form of this problem is the
non-stochastic bandit linear optimization problem, where the set of
decisions is a convex set in some Euclidean space, and the cost
functions are linear. Only recently was an efficient algorithm attaining O(T^{1/2}) regret discovered in this setting. Elad Hazan and Satyen Kale (Yahoo! Research) propose a new algorithm for the bandit linear optimization problem which obtains a regret bound of O(Q^{1/2}), where Q is the total variation in the cost functions. This regret bound, previously
conjectured to hold in the full information case, shows that it is
possible to incur much less regret in a slowly changing environment
even in the bandit setting. The new algorithm is efficient and applies
several new ideas to bandit optimization such as reservoir sampling.
- E. Hazan and S. Kale. Better Algorithms for Benign Bandits. Symposium on Discrete Algorithms (SODA), 2009.
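One ingredient named above, reservoir sampling, is simple enough to state exactly; the standard algorithm below maintains a uniform random k-subset of a stream in O(k) space.

    import random

    def reservoir_sample(stream, k, seed=0):
        """Classic reservoir sampling: after seeing t items, the reservoir is
        a uniform random k-subset of them. Item t replaces a uniformly random
        slot with probability k/t."""
        rng = random.Random(seed)
        reservoir = []
        for t, item in enumerate(stream, start=1):
            if t <= k:
                reservoir.append(item)
            else:
                j = rng.randrange(t)       # uniform in {0, ..., t-1}
                if j < k:
                    reservoir[j] = item
        return reservoir

    print(reservoir_sample(range(10**6), k=5))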
Online learning algorithms for changing environments
The standard way of measuring performance for online learning
algorithms is to look at the regret. The regret compares the algorithm's performance with the best possible fixed decision. This is
a good measure of performance when the underlying input distribution
is fixed or slowly changing with time. When the underlying
distribution makes sudden changes, an algorithm could be performing
badly but still give low regret. Elad Hazan and C. Seshadhri define a
stronger performance measure called adaptive regret and design
algorithms that give much better guarantees than previous online learning algorithms. They use a bootstrapping procedure that converts an existing learning algorithm into a much stronger one; using some streaming ideas, this conversion can be done efficiently.
- E. Hazan and C. Seshadhri. Efficient learning algorithms for changing environments. ICML 2009.
Geometric algorithms
Tighter bounds for random projections of manifolds
The Johnson-Lindenstrauss random projection lemma gives a simple way to reduce
the dimensionality of a set of points while approximately preserving their
pairwise distances. The most direct application of the lemma applies to a finite
set of points, but recent work has extended the technique to affine subspaces,
curves, and general smooth manifolds. Ken Clarkson considered the case of random
projection of smooth manifolds and sharpened a previous analysis, reducing the
dependence on such properties as the manifold's maximum curvature.
- K. Clarkson. Tighter Bounds for Random Projections of Manifolds. Symposium on Computational Geometry (SoCG) 2008.
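A hedged demonstration of the underlying lemma in its basic finite-point form (the manifold case analyzed here needs more care): a k x d Gaussian matrix scaled by 1/sqrt(k) preserves pairwise distances of n points to within 1 ± ε once k = O(ε^{-2} log n).

    import numpy as np

    def jl_project(X, k, seed=0):
        """Johnson-Lindenstrauss projection: rows of X are points in R^d;
        a k x d Gaussian matrix scaled by 1/sqrt(k) approximately preserves
        all pairwise distances when k = O(eps^-2 log n)."""
        rng = np.random.default_rng(seed)
        G = rng.standard_normal((k, X.shape[1])) / np.sqrt(k)
        return X @ G.T

    rng = np.random.default_rng(1)
    X = rng.standard_normal((50, 10000))
    Y = jl_project(X, k=500)
    # distortion of one pair; should be close to 1
    print(np.linalg.norm(Y[0] - Y[1]) / np.linalg.norm(X[0] - X[1]))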
Self-improving algorithms for Delaunay triangulations
Ken Clarkson and summer intern Seshadhri Comandur studied the problem of
two-dimensional Delaunay triangulation in the self-improving algorithms model,
and devised an optimal algorithm in this setting. Here the n points of the input
are assumed to each come from an independent, unknown, and arbitrary
distribution. The first phase of their algorithm builds data structures that
store relevant information about the input distribution. The second phase uses
these data structures to efficiently compute the Delaunay triangulation of the
input, with a running time that matches the information-theoretic
lower bound for the output distribution. This implies that if the output
distribution has low entropy, then their algorithm beats the standard Omega(n
log n) bound for computing Delaunay triangulations.
- K. Clarkson and C. Seshadhri. Self-Improving Algorithms for Delaunay Triangulations. Symposium on Computational Geometry (SoCG), 2008.
On the multi-cover problem in geometric settings
Working with Chandra Chekuri and Sariel Har-Peled of the University of Illinois, Ken Clarkson
extended algorithms for geometric set cover to the case of geometric multi-cover. Given a set of points
P and a collection of geometric shapes F, the problem is to find a minimum cardinality subset of F such that every point p in P is covered by (contained in) at least D(p) sets. Here D(p) is an integer demand requirement for p. When the demands D(p)=1 for all p, this is the standard set cover problem. The set cover
problem in geometric settings admits a polynomial-time approximation algorithm with an approximation
ratio that is better than that for the general version. This paper shows that similar improvements can be
obtained for the multi-cover problem as well. In particular, they obtain an O(log Opt) approximation for set
systems of bounded VC dimension, and an O(1) approximation for covering points by half-spaces in three
dimensions.
- C. Chekuri, K. Clarkson, and S. Har-Peled. On the set multi-cover problem in geometric settings. Symposium on Computational Geometry (SoCG), 2009.
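For context on the approximation ratios involved: the generic greedy algorithm below (a standard baseline, not from this paper) achieves an O(log n) ratio for arbitrary set systems; the geometric results above beat this ratio, and the multi-cover variant additionally enforces the demands D(p).

    def greedy_set_cover(points, sets):
        """Generic O(log n)-approximate set cover: repeatedly pick the set
        covering the most still-uncovered points. The geometric algorithms
        discussed above improve on this ratio via bounded VC dimension."""
        uncovered = set(points)
        chosen = []
        while uncovered:
            best = max(range(len(sets)), key=lambda i: len(uncovered & sets[i]))
            if not uncovered & sets[best]:
                raise ValueError("points not coverable by the given sets")
            chosen.append(best)
            uncovered -= sets[best]
        return chosen

    print(greedy_set_cover({1, 2, 3, 4}, [{1, 2}, {3}, {3, 4}, {2, 3}]))  # [0, 2]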
Ranking and aggregation algorithms
Comparing top k lists.
Motivated by several applications, Ronald Fagin, Ravi Kumar, and D.
Sivakumar have introduced several distance
measures between two top k lists.
Some of these measures are metrics, while others are not.
Each of the latter distance measures is nonetheless almost a metric, in the following two respects: it satisfies a relaxed version of the polygonal inequality, and it is multiplicatively approximated by a metric. They show that these two notions of being almost a metric are equivalent.
- R. Fagin, R. Kumar, and D. Sivakumar. Comparing top k lists. Symposium on Discrete Algorithms (SODA), 2003.
Rank aggregation and similarity search.
The problems of similarity search and
classification are ubiquitous in various fields---information
retrieval, pattern recognition, etc. Ronald Fagin, Ravi Kumar, and D. Sivakumar have proposed a novel approach, based on rank aggregation, for performing efficient similarity search and classification in high-dimensional data. This method is also
database friendly in that it accesses data primarily in a pre-defined
order without random accesses, and, unlike other methods for
approximate nearest neighbors, requires almost no extra storage.
- R. Fagin, R. Kumar, and D. Sivakumar. Efficient similarity search and classification via rank aggregation. SIGMOD, 2003.
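A hedged Python sketch patterned on the rank-aggregation idea as described (a simplified illustration, not a faithful implementation of their method): rank the database points by proximity to the query along each coordinate independently, then return the point with the smallest median rank. The per-coordinate orderings can be precomputed and scanned in order, which is the database-friendly access pattern mentioned.

    import numpy as np

    def median_rank_nn(data, query):
        """Approximate nearest neighbor by rank aggregation: each coordinate j
        ranks all points by |data[i, j] - query[j]|; the answer is the point
        whose median rank across coordinates is smallest."""
        n, d = data.shape
        ranks = np.empty((n, d))
        for j in range(d):
            order = np.argsort(np.abs(data[:, j] - query[j]))
            ranks[order, j] = np.arange(n)
        return int(np.argmin(np.median(ranks, axis=1)))

    rng = np.random.default_rng(0)
    data = rng.standard_normal((100, 8))
    print(median_rank_nn(data, data[17] + 0.01))  # very likely 17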
Partial rankings.
Ronald Fagin, Mohammad Mahdian,
Ravi Kumar, D. Sivakumar, and Erik Vee provide a comprehensive picture
of how to compare partial rankings. They propose several metrics to
compare partial rankings, present algorithms that efficiently compute
them, and prove that they are within constant multiples of each other.
Based on these concepts, they formulate aggregation problems for
partial rankings, and develop a highly efficient algorithm to compute
the top few elements of a near-optimal aggregation of multiple partial
rankings.
- R. Fagin, M. Mahdian, R. Kumar, D. Sivakumar and E. Vee. Comparing and aggregating rankings with ties. PODS, 2004.
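As background, the full-ranking special case that these partial-ranking metrics generalize is the classical Kendall tau distance, sketched below (a textbook definition, not one of the paper's metrics for rankings with ties).

    from itertools import combinations

    def kendall_tau_distance(r1, r2):
        """Kendall tau distance between two full rankings of the same items:
        the number of pairs ordered differently by the two rankings. The
        metrics in the paper generalize this to partial rankings (with ties)."""
        pos1 = {item: i for i, item in enumerate(r1)}
        pos2 = {item: i for i, item in enumerate(r2)}
        return sum(
            1
            for a, b in combinations(r1, 2)
            if (pos1[a] - pos1[b]) * (pos2[a] - pos2[b]) < 0
        )

    print(kendall_tau_distance(["a", "b", "c"], ["b", "a", "c"]))  # 1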
Sorting and selection with imprecise comparisons
In this work, sorting and selection are considered in the following model of imprecise comparisons: there exists some δ > 0 such that when two elements are compared, if the values of those elements differ by at least δ, then the comparison is made correctly, while if the two elements have values within δ of each other, the outcome of the comparison is unpredictable. In this model, the standard method of paired comparisons (Thurstone, 1927) minimizes the errors introduced by the imprecise comparisons at the cost of n(n-1)/2 comparisons. Miklos Ajtai, Vitaly Feldman, Avinatan Hassidim (MIT) and Jelani Nelson showed that the same optimal guarantees can be achieved using 4n^{3/2} comparisons, and proved the optimality of their method. They then explored the general trade-off between the guarantees on the error that can be made and the number of comparisons for the problems of sorting, max-finding, and selection. Their results provide close-to-optimal solutions for each of these problems.
- M. Ajtai, V. Feldman, A. Hassidim and J. Nelson. Sorting and Selection with Imprecise Comparisons. International Colloquium on Automata, Languages and Programming (ICALP), 2009.
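The comparison model is easy to simulate. The sketch below implements one adversarial comparator from the definition (here resolving within-δ comparisons by a coin flip) together with the n(n-1)/2-comparison paired-comparison baseline that orders elements by their number of wins.

    import random

    def make_imprecise_cmp(delta, seed=0):
        """Comparator from the model: correct when |a - b| >= delta,
        unpredictable (here: a coin flip) when the values are within delta."""
        rng = random.Random(seed)
        def cmp(a, b):           # returns True if a is judged smaller than b
            if abs(a - b) >= delta:
                return a < b
            return rng.random() < 0.5
        return cmp

    def paired_comparison_order(values, cmp):
        """Thurstone-style paired comparisons: run all n(n-1)/2 comparisons
        and order the elements by how many comparisons each one wins."""
        wins = [0] * len(values)  # wins[i] = number of times i was judged larger
        for i in range(len(values)):
            for j in range(i + 1, len(values)):
                if cmp(values[i], values[j]):
                    wins[j] += 1
                else:
                    wins[i] += 1
        return [values[i] for i in sorted(range(len(values)), key=lambda i: wins[i])]

    cmp = make_imprecise_cmp(delta=0.1)
    print(paired_comparison_order([0.3, 0.0, 1.0, 0.35], cmp))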
Algorithms on graphs
Transitive-closure spanners
Given a directed graph G
and an integer k, a k-transitive-closure spanner (k-TC-spanner) of G is
a directed graph H that has (1) the same transitive closure as G, and
(2) diameter at most k. These spanners were implicitly studied in
access control, data structures, and property testing, and properties
of these spanners have been rediscovered over the span of 20 years. The
main goal in each of these applications is to obtain the sparsest
k-TC-spanners. Together with Arnab Bhattacharyya, Elena Grigorescu,
Kyomin Jung, and Sofya Raskhodnikova, David Woodruff resolved several
open questions concerning the size of sparsest k-TC-spanners and the
complexity of computing a k-TC-spanner for a given input graph. As a
result, the work greatly improves the query complexity of monotonicity
testers of H-minor free graphs, as well as space/time tradeoffs in
various access control schemes. The work also gives the first sublinear
approximation algorithm for the more general problem of computing a
directed k-spanner, thereby resolving the main open question of Elkin
and Peleg (IPCO, 2001).
- A. Bhattacharyya, E. Grigorescu, K. Jung, S. Raskhodnikova, and D. Woodruff. Transitive-closure spanners. Symposium on Discrete Algorithms (SODA), 2009.
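The definition translates directly into a checker. The hedged sketch below (plain BFS, suitable only for small graphs; not the paper's algorithms) verifies that H is a k-TC-spanner of G by comparing reachability sets from every vertex and checking that every reachable pair lies within distance k in H.

    from collections import deque

    def reach_with_dist(adj, s):
        """BFS distances from s in a digraph given as {v: successors}."""
        dist = {s: 0}
        q = deque([s])
        while q:
            u = q.popleft()
            for v in adj.get(u, ()):
                if v not in dist:
                    dist[v] = dist[u] + 1
                    q.append(v)
        return dist

    def is_k_tc_spanner(G, H, k):
        """H is a k-TC-spanner of G iff it has the same transitive closure
        as G and every reachable pair is within distance k in H."""
        for s in G:
            dg = reach_with_dist(G, s)
            dh = reach_with_dist(H, s)
            if set(dg) != set(dh):                # same transitive closure?
                return False
            if any(d > k for d in dh.values()):   # all reachable pairs within k?
                return False
        return True

    # the path 1 -> 2 -> 3 -> 4 plus shortcut edges is a 2-TC-spanner of itself:
    G = {1: [2], 2: [3], 3: [4], 4: []}
    H = {1: [2, 3, 4], 2: [3, 4], 3: [4], 4: []}
    print(is_k_tc_spanner(G, H, 2))  # True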
Regularity lemmas and combinatorial algorithms
Nikhil Bansal (IBM Watson) and Ryan Williams present new combinatorial algorithms for Boolean matrix multiplication (BMM) and for preprocessing a graph to answer independent set queries. They give the first asymptotic improvements in many years on combinatorial algorithms for dense BMM, improving on the "Four Russians" O(n^3/(w log n)) bound for machine models with wordsize w. (For a pointer machine, one can set w = log n.) The algorithms utilize notions from regularity lemmas for graphs in a novel way.

They give two randomized combinatorial algorithms for BMM. The first is essentially a reduction from BMM to the Triangle Removal Lemma. The best known bounds for the Triangle Removal Lemma only imply an O((n^3 log b(n))/(b(n) w log n)) time algorithm for BMM, where b(n) = (log* n)^d for some d > 0, but improvements on the Triangle Removal Lemma would yield corresponding runtime improvements. The second algorithm applies the Weak Regularity Lemma of Frieze and Kannan along with several information compression ideas, and runs in O(n^3 (log log n)^2/(log n)^{9/4}) time with probability exponentially close to 1; when w > log n, it can be implemented in O(n^3 (log log n)^2/(w log n)^{7/6}) time. These results immediately imply improved combinatorial methods for CFG parsing, detecting triangle-freeness, and transitive closure.

Using Weak Regularity, they also give an algorithm for answering queries of the form "is S ⊆ V an independent set?" in a graph. Improving on prior work, they show how to randomly preprocess a graph in O(n^{2+eps}) time (for any eps > 0) so that with high probability, all subsequent batches of log n independent set queries can be answered deterministically in O(n^2 (log log n)^2/(log n)^{5/4}) time; when w > log n, w queries can be answered in O(n^2 (log log n)^2/(log n)^{7/6}) time. In addition to these applications, the problem is interesting in that it is not known how to do better than O(n^2) using "algebraic" methods.
- N. Bansal and R. Williams. Regularity Lemmas and Combinatorial Algorithms. FOCS, 2009 (to appear).
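For intuition about the role of the wordsize w in these bounds: packing Boolean rows into machine words (Python integers below) lets one row-OR replace w scalar operations, which is the speedup underlying Four Russians-style algorithms. The sketch shows only this packing trick; the regularity-based improvements are not reproduced here.

    def bmm_bitset(A, B):
        """Combinatorial Boolean matrix multiply with word packing: rows are
        Python ints used as bitsets, so row i of the product is the OR of the
        rows B[k] over all k with bit k set in A[i]. One OR of n-bit rows
        replaces ~n/w word operations, the source of the w-fold speedup."""
        C = []
        for a in A:
            row, k = 0, 0
            while a:
                if a & 1:
                    row |= B[k]
                a >>= 1
                k += 1
            C.append(row)
        return C

    def to_bits(rows):  # list of 0/1 lists -> ints, with bit j = column j
        return [sum(bit << j for j, bit in enumerate(r)) for r in rows]

    A = to_bits([[1, 0], [1, 1]])
    B = to_bits([[0, 1], [1, 0]])
    print([bin(r) for r in bmm_bitset(A, B)])  # rows of the Boolean product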