IBM Research

Tesla: On-Demand Information Systems

Computer Science


 Overview

Enterprise information systems increasingly involve a variety of data sources, ranging from application programs to Web Services, content management systems, and relational DBMSs. As these systems expand in scale and complexity, their component data and compute resources are increasingly autonomous, independently managed entities. The complex and varied nature of these resources makes it difficult to set up, monitor, and manage this infrastructure.

Today, application programmers deal with the dynamics of such distributed systems by imposing two important restrictions:

  • Scale-down data sources through consolidation: First, large numbers of diverse data sources are consolidated to a small number of homogeneous, centrally administered data sources through various replication and warehousing mechanisms.
  • Dedicate and over-provision resources: Second, resource-unpredictability is addressed by over-provisioning for the peak system load, and dedicating separate resources to each application.

The Tesla project's goal is to avoid both these restrictions. We want to encourage enterprise applications to share a common distributed information and computation infrastructure, without sacrificing predictable behavior. We want to allow applications to access truly independent data sources that resist consolidation. We believe that avoiding these restrictions will not only reduce the total cost of ownership of existing applications, but also enable new applications that access hard-to-consolidate data sources.

Specifically, we are building the next generation of data management middleware that will allow applications to flexibly exploit large-scale, distributed, autonomous data and compute resources. By flexibly, we mean that application programs should be shielded from any brittleness in individual components of the information system, and instead see a single virtual system image with predictable QoS. The Tesla middleware will dynamically discover and provision compute resources on demand to meet performance goals, and access data sources on demand to meet the quality requirements of application queries.

Research directions

  • Scalable, Flexible, Adaptive Information Integration: The first feature of an on-demand information system is integrated access to distributed, autonomous data sources. This involves content-based access [1], distributed indexing [2], new failure semantics to mask system brittleness [3], and continual monitoring of data source properties [4]. We are tackling these issues through four sub-projects.
    1. Metawrapper for Dynamic Federation of Distributed Data Sources
    2. Autonomic Index Re-Organization and Index Evolution
    3. Partial Results, and Failure Transparency for Queries and Updates: Traditionally, data management systems have used a very strong notion of correctness that is based on providing exact, accurate answers. While this semantics is clean and easy to understand, it makes the distributed system brittle, because its performance (availability) is dictated by its slowest (most unreliable) component. In Tesla we are investigating various looser correctness semantics that are based on partial or approximate results. In addition, we are investigating semantics and implementation strategies for "partial updates": making the failure of a data source transparent to an update application.
    4. Query Cost Calibrator
  • Predictable Quality of Service (QoS): A second key feature of an on-demand information system is that it autonomically re-configures and optimizes itself so as to provide predictable performance and availability for enterprise applications. Currently, we are tackling this problem through an overall system planner ([1]) that uses various advisors (such as [2], [3]) to identify the data hotspots and decide where they can be migrated, so as to meet the QoS goals. As a sub-goal, this also involves developing a replication service that provides predictable end-to-end latency ([4]). We are building the following components to tackle the QoS challenge:
    1. Information Lifecycle Manager / Planner
    2. Data Placement Advisor
    3. xg: Grid Scheduler
    4. Autonomic Replication Management System
    5. Graceful Scaleout of Query Processing
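
The partial-results semantics described under "Partial Results, and Failure Transparency for Queries and Updates" can be illustrated with a minimal sketch. This is not Tesla's implementation: the names `PartialResult` and `federated_query`, the deadline parameter, and the sample sources are all hypothetical. The sketch only shows the general idea that a federated query can return whatever the responsive sources delivered, and report which sources are missing, instead of being held up by its slowest or least reliable component.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait
from dataclasses import dataclass, field


@dataclass
class PartialResult:
    """Hypothetical container: rows plus a record of which sources contributed."""
    rows: list = field(default_factory=list)
    answered: list = field(default_factory=list)  # sources that replied in time
    missing: list = field(default_factory=list)   # sources that failed or were too slow

    @property
    def complete(self):
        return not self.missing


def federated_query(sources, deadline_s):
    """Query every source in parallel and keep what arrives before the deadline.

    sources: dict mapping a source name to a zero-argument callable that
             returns a list of rows (an assumption made for this sketch).
    """
    result = PartialResult()
    pool = ThreadPoolExecutor(max_workers=max(1, len(sources)))
    futures = {pool.submit(fetch): name for name, fetch in sources.items()}
    done, _pending = wait(list(futures), timeout=deadline_s)
    for future, name in futures.items():
        if future in done and future.exception() is None:
            result.rows.extend(future.result())
            result.answered.append(name)
        else:
            # A failed or slow source is reported, not allowed to block the answer.
            result.missing.append(name)
    pool.shutdown(wait=False)  # do not wait for stragglers
    return result


def slow_source():
    time.sleep(0.5)  # simulates a source that misses the deadline
    return [("claims", 3)]


def broken_source():
    raise ConnectionError("source unavailable")


demo = federated_query(
    {
        "orders": lambda: [("orders", 1), ("orders", 2)],
        "claims": slow_source,
        "archive": broken_source,
    },
    deadline_s=0.1,
)
print(demo.rows, demo.answered, demo.missing, demo.complete)
```

In this sketch the caller decides what to do with an incomplete answer, for example by checking `complete` and retrying only the sources listed in `missing`. A strict exact-answer semantics would instead have to wait for, or fail on, every source.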

Use Cases / Demos

  • QoS-based source selection demo
  • Active Integration of Complex Claim Adjudication Processes - Insurance Sector

Publications

  • Tesla: An On-Demand Information System. Unpublished manuscript, available on request
  • Autonomic Index Evolution. Unpublished manuscript, available on request. Also patent application ARC9-2004-0043, 2004.
  • Dynamic and Selective data source binding through a metawrapper. Patent application ARC9-2004-0041, 2004.
  • Intra-Fragment Parallelism for Graceful Scaleout of Query Processing. Unpublished manuscript, available on request.
  • Towards an Information Infrastructure for the Grid. S. Bourbonnais, V. M. Gogate, L. M. Haas, R. W. Horman, S. Malaika, I. Narang, and V. Raman. IBM Systems Journal, 2004
  • Tesla: An On Demand Information System. V. Raman and I. Narang. IBM Data Management Potpourri Workshop, Toronto, 2004.
  • Load and Network Aware Query Routing for Information Integration. Wen-Syan Li, Vishal S. Batra, Vijayshankar Raman, Wei Han, K. Selçuk Candan, Inderpal Narang. IEEE International Conference on Data Engineering (ICDE), 2005.
  • Fast Replication without two-phase commits. Patent application ARC9-2003-0007, 2003.
  • Data Access and Management Services on Grid. V. Raman, I. Narang, C. Crone, L. Haas, S. Malaika, T. Mukai, D. Wolfson, and C. Baru. Database Access and Integration Services Working Group, Global Grid Forum (GGF) 5, 2002.
