IBM®
Skip to main content
    United States [change]    Terms of use
 
 
 
    Home    Products    Services & solutions    Support & downloads    My account    
IBM Research

(none)

Activity Report


  Activity Report, July - December 2004
clear image

Highlights

Blog Analysis
Blogs (aka weblogs) have been the most talked-about web phenomenon in the last two years. Ravi Kumar, Jasmine Novak, Prabhakar Raghavan (Verity) and Andrew Tomkins focused on the social structure created by weblogs by analyzing over a million bloggers from LiveJournal.com and studying their age, interests, and friendship patterns. They also studied the activity patterns of over 25,000 blogs, focusing on the link creation process inside blog postings. They reached several interesting and important conclusions. For example, while three out of four bloggers are between the ages of 16 and 24, nearly half of all their friendships are explained by common interests. This article was invited to appear in the December 2004 issue of CACM devoted to weblogs.

Hardness of Approximating Multicut and Sparsest-Cut
The multicut problem is a classical optimization problem on graphs and has several important applications in combinatorial optimization, including network reliability and fault-tolerance, multicommodity flows, etc. In the simplest version, this problem asks to compute the minimum number of edges that must be removed from a graph to disconnect a given set of source--sink pairs. This problem is NP-hard, known to be efficiently approximable to within an O(log n) factor, and it is an open question if there are better approximation algorithms. Shuchi Chawla (CMU), Robert Krauthgamer, Ravi Kumar, Yuval Rabani (Technion), and D. Sivakumar show that the Unique Games Conjecture of Khot, implies that this problem cannot be approximated within any constant factor, and that a quantitatively stronger version of the conjecture implies that even an Omega(log log n) factor is NP-hard. They prove similar hardness results for another well-studied problem, namely the sparsest-cut problem, for which breakthrough recent results established a sublogarithmic approximation factor.

Communication Complexity Lower Bound Methods
Communication complexity is a fundamental unifying framework across various disciplines of computer science. In its most basic incarnation, it poses the problem of how much communication is required between two parties, Alice and Bob, who, respectively, hold inputs x and y for a binary Boolean function f(x,y). Progress on communication complexity is linked to various foundational questions in Computer Science, including circuit complexity, massive data set computations via streaming algorithms, data structure complexity, time-space trade-offs, feasible auctions, etc. Ravi Kumar and D. Sivakumar discover a novel approach to study communication complexity based on graph theory, and establish connections to a very well-studied combinatorial optimization quantities such as chromatic number and the Lovasz theta function.

Source Discovery
The Source Discovery system for building, viewing, and navigating overviews of web sites remains one of the core components of the WebFountain service offering. It is a key product differentiator for WebFountain. It is being offered as part of several government contracts, and has been integrated into the Tethys demonstration system. David Gibson has been responsible for the design, implementation and maintenance of the system, which has now been integrated into the WebFountain production cluster management system, so that it can be managed by the system operators. Papers describing this system were accepted to the 15th ACM Conference on Hypertext and Hypermedia (Hypertext 2004) and to the 13th World-Wide Web Conference (WWW2004).

Templates
Template material is present across multiple web pages and used primarily for formatting, navigation, and branding. David Gibson, Kunal Punera (UT Austin), and Andrew Tomkins studied the nature, evolution, and prevalence of these templates on the web, using new and faster algorithms. Their results show that 40% of the content on the web is template content, and that this ratio is increasing significantly. This has great impacts on the performance of information retrieval and ranking, classification, and link analysis.

Awards and Honors

D. Gibson, Outstanding Technical Achievement Award for Source Discovery, October 2004.

N. Megiddo, Outstanding Innovation Award for Adaptive Cache Replacement, December 2004. (co-winner: D. Modha)

S. Rajagopalan, Outstanding Innovation Award for the Trevi project, December 2004.

S. Rajagopalan, Research Division Award for JSR 170, December 2004.

Publications and Conferences

A. Atserias, P. G. Kolaitis, and M. Y. Vardi, Constraint Propagation as a Proof System, Proc. 10th International Conference on Principles and Practice of Constraint Programming (CP 2004), LNCS 3258, Springer, pp. 77-91, 2004.

Z. Bar-Yossef, T. S. Jayram, R. Krauthgamer and R. Kumar, Approximating edit distance efficiently, Proc. 45th Annual IEEE Symposium on Foundations of Computer Science (FOCS), pp. 550-559, 2004.

Z. Bar-Yossef, T. S. Jayram, R. Krauthgamer and R. Kumar, The sketching complexity of pattern matching, Proc. 8th International Workshop on Randomization and Computation (RANDOM), pp. 261-272, 2004.

D. de Farias and N. Megiddo, Exploration-Exploitation Tradeoffs for Experts Algorithms in Reactive Environments, Proc. 18th Annual Conference on Neural Information Processing Systems (NIPS), 2004.

C. Faloutsos, K. McCurley and A. Tomkins, Fast Discovery of Connection Subgraphs, Proc. 10th ACM Conference on Knowledge Discovery and Data Mining (KDD), pp. 118-127, August 2004.

D. Gibson, The Site Browser: Catalyzing Improvements in Hypertext Organization, Proc. 15th ACM Conference on Hypertext and Hypermedia (Hypertext), pp. 68-76, 2004.

D. Gibson, Surfing the Web by Site, Proc. 13th World-Wide Web Conference (WWW), Alternate Track Papers & Posters, pp. 496-497, 2004.

R. Guha, R. McCool and R. Fikes, Contexts for the Semantic Web, 3rd International Semantic Web Conference (ISWC), 2004.

T. S. Jayram, S. Khot, R. Kumar, and Y. Rabani, Cell-probe lower bounds for the partial match problem, Journal of Computer and System Sciences, 69(3):435-447, 2004.

L.M. Kirousis and P.G. Kolaitis, A Dichotomy in the Complexity of Propositional Circumscription, Theory of Computing Systems, 37(6), 695-715, 2004.

R. Krauthgamer and J. R. Lee, The black-box complexity of nearest neighbor search, Proc. 31st International Colloquium on Automata, Languages and Programming (ICALP), pp. 858-869, 2004.

R. Krauthgamer, J. R. Lee, M. Mendel and A. Naor, Measured descent: A new embedding method for finite metrics, Proc. 45th Annual IEEE Symposium on Foundations of Computer Science (FOCS), pp. 434-443, 2004.

R. Kumar, U. Mahadevan, and D. Sivakumar, A graph-theoretic approach to extract storylines from search results, Proc. 10th ACM Conference on Knowledge Discovery and Data Mining (KDD), pp. 216-225, August 2004.

R. Kumar, J. Novak, P. Raghavan, and A. Tomkins, Structure and evolution of Blogspace, Communications of the ACM, December 2004.

Invited Talks

  • P. Kolaitis
    "Data Exchange: Aspects of Logic and Complexity",
    Invited talk, 2004 Workshop on Logic and Computational Complexity, Turku, Finland, July 2004.
    "Data Exchange and Composition of Schema Mappings",
    ARISE Workshop on Exchange and Integration of Data, Toronto, Canada, October 2004.
  • R. Krauthgamer
    "A Metric Notion of Dimension and its Algorithmic Applications",
    Bay Area Theory Symposium (BATS), UC Berkeley, Berkeley CA, December 2004.
  • E. Vee
    "Flood Illumination of Infinite Wedges",
    14th Annual Fall Workshop on Computational Geometry, MIT, Cambridge, MA, November 2004.
    "An Overview of Time-Space Tradeoffs",
    Bay Area Theory Symposium (BATS), UC Berkeley, Berkeley CA, December 2004.

Presentations and Seminars

  • M. Ajtai
    "Secure Computing",
    Tribute to Larry Stockmeyer Workshop, ARC, October 2004
  • R. Fagin
    "Comparing and aggregating rankings with ties",
    TOCTalk, ARC, August 2004
    "Easier ways to win logical games",
    Tribute to Larry Stockmeyer Workshop, ARC, October 2004
  • R. Krauthgamer
    "On the hardness of approximating multicut and sparsest cut",
    UC Berkeley, Berkeley, CA, October 2004.
    "Approximating Edit Distance Efficiently",
    UC Bekeley, Berkeley, CA, December 2004.
    "A Metric Notion of Dimension and its Algorithmic Applications",
    MIT, Cambridge, MA, December 2004 and Haifa University, Haifa, Israel, December 2004.
  • R. Kumar
    "A CS perspective on voting",
    Workshop on Algorithms for Information Networks, Bertinoro, October 2004

Committee Work

  • R. Fagin
    Program committee member, Search track of World-Wide Web Conference (WWW 2004)
    Program committee member, Flexible Query Answering Systems 2004
  • P. Kolaitis
    General Chair, IEEE Symposium on Logic in Computer Science (LICS)
  • R. Guha
    Deputy vice chair, Semantic Web track, 14th International World Wide Web Conference (WWW),2005
  • N. Meggido
    Program committee member, 21st International Conference on Machine Learning (ICML), 2004
    Cluster chair, INFORMS national fall meeting
    INFORMS Frederick W. Lanchester Prize committee

Editorships


External Positions


In Memory of Larry Stockmeyer

Larry Stockmeyer, Research Staff Member Emeritus, passed away on July 31, 2004. Larry was one of the pioneers of computational complexity and had spent his entire professional career at IBM Research, first at the IBM T.J. Watson Research Center and then at the IBM Almaden Research Center.

In commemoration of Larry's contributions to the field, a one-day workshop was held at Almaden on October 25, 2004. Talks were given by Miklos Ajtai (IBM Almaden), Cynthia Dwork (Microsoft Research), Ronald Fagin (IBM Almaden), Maria Klawe (Princeton University), Nick Pippenger (Princeton University), and Moshe Y. Vardi (Rice University).

Personnel

Short-Term Visitors
  • Ravi Sundaram (Nov 8-Nov 15, 2004)
  • Seffi Naor (Aug 30-Sep 10, 2004)
  • Amit Chakrabarti (Aug 9-Aug 20, 2004)
  • Leonard Schulman (Aug 9-Aug 12, 2004)
  • Daniela Pucci de Farias (Aug 2-Aug 9, 2004)
  • Moshe Y. Vardi (Jul 26-Jul 30, 2004)
  • Yuval Rabani (Jun 28-Jul 16, 2004)
Transitions
  • Ziv Bar-Yossef is on a leave to work at the Electrical Engineering Department at the Technion.
  • David Gibson is on leave to work with the WebFountain function at ARC.
  • Sridhar Rajagopalan is on leave to work in the Content Management Department at ARC.
  • Erik Vee, a graduate of University of Washington, joined the department as a postdoc in September 2004.

Organization

Phokion Kolaitis (Senior Manager) CS Principles and Methodologies
Cheryl Rothrock (Administrative Assistant)
Sridhar Rajagopalan (On Leave at the Content Management Department)
Larry Stockmeyer (Emeritus)
Andrew Tomkins (Manager) Information Management Principles
Ziv Bar-Yossef (On Leave at the Technion)
David Gibson (On Leave at the WebFountain function)
Ramanathan Guha
Ravi Kumar
Kevin McCurley
Ron Fagin (Manager) Foundations of Computer Science
Miki Ajtai
T. S. Jayram
Robert Krauthgamer
Nimrod Megiddo
D. Sivakumar
Erik Vee

    About IBMPrivacyContact