IBM Almaden Computer Science

Mapping XML Schemas - Demo
Previous  Page 1  2  3  4 Next

I. Loading the schemas

Figure 1: Source and target schemas after being loaded into Clio.

In this example, we use a relational schema as the source and an XML Schema as the target.  The relational source encodes information about undergraduate and graduate students in three tables.  A single table (gradEnrolls) contains all the information about graduate students: their ID (sid), their name, the courses they are enrolled in (identified by the cid attribute), and the grade obtained in each course.  Two tables (underGrads and enrolls) hold the same information for undergraduate students.  To determine which courses an undergraduate student is enrolled in, we need to join these two tables over the sid (student ID) field.

The target schema, on the other hand, does not separate graduate and undergraduate students.  The main collection, Enrollment, is a set of Student records.  Each Student has a nested set of Courses, and each course holds a course ID (cid) and a pointer (eid) to the grade information for that course.  The grade for each course a student is enrolled in is stored in the Evaluations collection.  The eid under course is thus a foreign key referencing the eid under Evaluations.

We need to accomplish two major tasks when mapping these two schemas.  First, we must merge the undergraduate and graduate student information from the source schema under one collection (Enrollment) in the target schema.  Then, for both graduate and undergraduate students on the source side, we must split the course enrollment and grade information into two collections on the target side while keeping the semantic relationship between the values intact.  For example, if student "Lucian" was enrolled in "CS101" and received an "A", we must be able to reconstruct this information after it is split into three (sometimes nested) collections on the target.
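The two tasks can be made concrete with a small hand-written sketch in Python. This is not how Clio generates or runs mappings; it only illustrates the intended result on plain dictionaries. The table and field names (gradEnrolls, underGrads, enrolls, sid, name, cid, grade, eid) come from the schemas described above. Everything else is an assumption made for illustration: the sample rows other than Lucian's, the choice to give each Student record a sid and name, the use of a sequential counter for eid values, the premise that sid values are unique across graduate and undergraduate students, and the helper names map_to_target and grade_of.

```python
from itertools import count

# Hypothetical sample rows; column names follow the source schema in the text.
grad_enrolls = [
    {"sid": 1, "name": "Lucian", "cid": "CS101", "grade": "A"},
    {"sid": 1, "name": "Lucian", "cid": "CS202", "grade": "B"},
]
under_grads = [{"sid": 2, "name": "Ana"}]
enrolls = [{"sid": 2, "cid": "CS101", "grade": "C"}]


def map_to_target(grad_enrolls, under_grads, enrolls):
    """Merge both student populations and split grades into Evaluations."""
    # Join underGrads with enrolls over sid so that undergraduate rows get
    # the same shape as gradEnrolls rows. This is an inner join: an
    # undergraduate with no enrollment produces no row.
    names = {u["sid"]: u["name"] for u in under_grads}
    undergrad_rows = [
        {"sid": e["sid"], "name": names[e["sid"]],
         "cid": e["cid"], "grade": e["grade"]}
        for e in enrolls
        if e["sid"] in names
    ]

    next_eid = count(1)  # assumption: eid is a freshly generated integer key
    students = {}        # assumption: sid is unique across both populations
    evaluations = []
    for row in grad_enrolls + undergrad_rows:
        # Task 1: graduate and undergraduate rows land in one collection.
        student = students.setdefault(
            row["sid"],
            {"sid": row["sid"], "name": row["name"], "Courses": []},
        )
        # Task 2: the course entry keeps cid and eid, the grade moves to
        # Evaluations, and the shared eid preserves the association.
        eid = next(next_eid)
        student["Courses"].append({"cid": row["cid"], "eid": eid})
        evaluations.append({"eid": eid, "grade": row["grade"]})

    return {"Enrollment": list(students.values()), "Evaluations": evaluations}


def grade_of(target, name, cid):
    """Reconstruct a grade by following eid from Courses to Evaluations."""
    grades = {ev["eid"]: ev["grade"] for ev in target["Evaluations"]}
    for student in target["Enrollment"]:
        if student["name"] == name:
            for course in student["Courses"]:
                if course["cid"] == cid:
                    return grades[course["eid"]]
    return None
```

Calling grade_of(map_to_target(grad_enrolls, under_grads, enrolls), "Lucian", "CS101") follows the eid foreign key from the nested Courses set into Evaluations, which is the reconstruction the mapping has to make possible.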

Figure 1 shows the state of Clio after the source and target schema information is loaded.  The left panel of Figure 1 shows the source relational schema, named db.  The right panel shows the target XML Schema, named targetDB.  Both schemas are available below:

