IBM Research

Schema Mapping Management System

Computer Science


Enterprise databases often comprise hundreds of tables with thousands of attributes, arranged in complex and disparate structures. Because many of these databases cover the same domain, there is a clear need to integrate them to gain more insight and a broader scope. To overcome structural heterogeneities, users must define mappings from one or more source schemas to a target schema.

Our schema mapping management system, Clio, is a semiautomatic tool that helps users define such mappings. Clio also interprets these mappings to construct a set of database queries that transform and integrate source data so that it conforms to the target schema. Such queries can be used to populate data warehouses or to define views and virtual tables in federated database environments. Source and target can be any combination of relational databases and XML data.
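
To make the idea of interpreting mappings as queries concrete, here is a minimal sketch in Python. It is not Clio's algorithm: the table names (`src_person`, `tgt_customer`), the column names, and the helper `build_insert_select` are all hypothetical, and the sketch handles only the simplest case of one source table feeding one target table.

```python
import sqlite3

# Hypothetical correspondences: target column -> SQL expression over the source.
CORRESPONDENCES = {
    "full_name": "first_name || ' ' || last_name",
    "city": "town",
}


def build_insert_select(source_table, target_table, correspondences):
    """Render one INSERT ... SELECT statement from column correspondences."""
    target_cols = ", ".join(correspondences)
    source_exprs = ", ".join(correspondences.values())
    return (
        f"INSERT INTO {target_table} ({target_cols}) "
        f"SELECT {source_exprs} FROM {source_table}"
    )


def run_demo():
    """Create toy source/target tables, run the generated query, return results."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE src_person (first_name TEXT, last_name TEXT, town TEXT)"
    )
    conn.execute("CREATE TABLE tgt_customer (full_name TEXT, city TEXT)")
    conn.executemany(
        "INSERT INTO src_person VALUES (?, ?, ?)",
        [("Ada", "Lovelace", "London"), ("Alan", "Turing", "Wilmslow")],
    )
    query = build_insert_select("src_person", "tgt_customer", CORRESPONDENCES)
    conn.execute(query)
    rows = conn.execute(
        "SELECT full_name, city FROM tgt_customer ORDER BY full_name"
    ).fetchall()
    conn.close()
    return query, rows
```

Calling `run_demo()` returns the generated statement together with the rows that end up in the target table. A real mapping tool would also have to handle joins, nesting, and constraints, which this sketch leaves out.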


Key features:

  • XML -> XML mapping (XML views on Web data, ...)
  • Relational -> XML mapping (Web-publishing of legacy data, ...)
  • XML -> Relational mapping (XML shredding into RDBMS, ...)
  • Relational -> Relational mapping (Data warehousing, ...)
  • Full use of source and target data constraints
  • Discovery of data constraints
  • User-friendly interface
  • Dynamic interpretation of user input
  • Incremental generation of transformation queries
  • Intelligent suggestions of likely correspondences
  • Merging source collections with automatic join-path selection
  • Splitting source collections with ID invention
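
One way to picture automatic join-path selection is as a search over the foreign-key graph of the source schema. The sketch below assumes that the shortest chain of foreign-key joins is the preferred one, which is a simplification and not necessarily how Clio ranks join paths. The tables, columns, and the function `find_join_path` are hypothetical.

```python
from collections import deque

# Hypothetical foreign keys: (child_table, child_column, parent_table, parent_column)
FOREIGN_KEYS = [
    ("orders", "customer_id", "customers", "id"),
    ("order_items", "order_id", "orders", "id"),
    ("order_items", "product_id", "products", "id"),
]


def find_join_path(start, goal, foreign_keys):
    """Breadth-first search for the shortest chain of FK joins from start to goal.

    Returns a list of join conditions, or None if the tables are not connected.
    """
    # Treat every foreign key as an undirected edge labelled with its join condition.
    adjacency = {}
    for child, child_col, parent, parent_col in foreign_keys:
        condition = f"{child}.{child_col} = {parent}.{parent_col}"
        adjacency.setdefault(child, []).append((parent, condition))
        adjacency.setdefault(parent, []).append((child, condition))

    queue = deque([(start, [])])
    seen = {start}
    while queue:
        table, conditions = queue.popleft()
        if table == goal:
            return conditions
        for neighbor, condition in adjacency.get(table, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, conditions + [condition]))
    return None
```

For example, `find_join_path("customers", "products", FOREIGN_KEYS)` returns the three join conditions that connect customers to products via orders and order items.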

Key components:

  • Schema Viewer - The Schema Viewer allows users to draw arrows between source and target schema elements. Such arrows may cross nesting levels, combine multiple elements, split and merge tables, etc. Clio incrementally interprets these arrows as mappings and generates a query accordingly.

  • Attribute Matcher - The Attribute Matcher automatically suggests likely mappings by analyzing the schemas and the underlying data. Our Naïve Bayes-based matching algorithm has very high success rates, helping users become familiar with unfamiliar source schemas.

  • Transformation Query - Depending on the source type, Clio generates SQL queries or XQuery and XSLT transformation queries. These queries (a) produce appropriate grouping, (b) generate IDs where necessary, and (c) produce proper target nesting.
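
The three properties of a transformation query (grouping, ID generation, and target nesting) can be illustrated with a small relational-to-XML example. The sketch below is written in Python, not in the SQL, XQuery, or XSLT that Clio generates, and it is not taken from Clio. The source rows, the element names, and the helpers `invent_id` and `rows_to_nested_xml` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical flat source rows: (department, employee)
ROWS = [
    ("Research", "Ada"),
    ("Sales", "Grace"),
    ("Research", "Alan"),
]


def invent_id(prefix, *key_values):
    """Derive a deterministic ID from the grouping key (same key, same ID)."""
    return f"{prefix}_" + "_".join(str(value).lower() for value in key_values)


def rows_to_nested_xml(rows):
    """Turn flat (department, employee) rows into nested XML text."""
    # (a) Grouping: collect employees per department, keeping first-seen order.
    groups = {}
    for department, employee in rows:
        groups.setdefault(department, []).append(employee)

    root = ET.Element("company")
    for department, employees in groups.items():
        # (b) ID generation: the source has no department key, so one is invented.
        dept_el = ET.SubElement(
            root, "department", id=invent_id("dept", department), name=department
        )
        # (c) Nesting: employees become children of their department element.
        for employee in employees:
            ET.SubElement(dept_el, "employee").text = employee
    return ET.tostring(root, encoding="unicode")
```

Because `invent_id` depends only on the grouping key, rows that belong to the same department always receive the same generated ID, so repeated runs produce consistent target data.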
