George Mason University
Process Improvement



Field report

What are we asking you to do?

Integrate two databases at your workplace. Each database must contain at least 4 records. You need to find and select the databases, get permission to work with them, and create a third database that contains both sets of data and is semantically and syntactically compliant with the intended use of the new data.

Why are we asking you to do this?

This activity will demonstrate your knowledge of the course material and your ability to apply it to actual situations at work.

How should you do it?

To complete the project, carry out the following steps:

  1. Make an XML map of each database
  2. Transform the maps into a single database
  3. Produce a report showing how and which items of data were transformed
  4. Present your work to the class
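
As a rough illustration of steps 1 and 2, the sketch below captures two small tables as XML and maps them into one target structure. It is written in plain Python rather than the XSLT the report asks for, so it only shows the kind of field mapping an XSLT stylesheet would perform. All table names, field names, and records are hypothetical and are not taken from the course material.

```python
import xml.etree.ElementTree as ET

# Hypothetical source data: two databases with different field names and formats.
clinic_rows = [
    {"PatientID": "101", "LName": "Smith", "DOB": "1970-01-15"},
    {"PatientID": "102", "LName": "Jones", "DOB": "1982-06-30"},
    {"PatientID": "103", "LName": "Lee", "DOB": "1990-11-02"},
    {"PatientID": "104", "LName": "Patel", "DOB": "1965-03-21"},
]
billing_rows = [
    {"acct": "A-7", "surname": "SMITH", "birth_date": "01/15/1970"},
    {"acct": "A-8", "surname": "JONES", "birth_date": "06/30/1982"},
    {"acct": "A-9", "surname": "LEE", "birth_date": "11/02/1990"},
    {"acct": "A-10", "surname": "PATEL", "birth_date": "03/21/1965"},
]


def to_xml(root_tag, row_tag, rows):
    """Step 1: capture a table as XML, one child element per field."""
    root = ET.Element(root_tag)
    for row in rows:
        record = ET.SubElement(root, row_tag)
        for field, value in row.items():
            ET.SubElement(record, field).text = value
    return root


def us_date_to_iso(text):
    """Syntactic fix: convert MM/DD/YYYY to YYYY-MM-DD."""
    month, day, year = text.split("/")
    return f"{year}-{month}-{day}"


def merge(clinic_xml, billing_xml):
    """Step 2: map both sources into one target structure."""
    target = ET.Element("people")
    for rec in clinic_xml.findall("patient"):
        person = ET.SubElement(target, "person", source="clinic")
        ET.SubElement(person, "last_name").text = rec.findtext("LName").title()
        ET.SubElement(person, "birth_date").text = rec.findtext("DOB")
    for rec in billing_xml.findall("account"):
        person = ET.SubElement(target, "person", source="billing")
        ET.SubElement(person, "last_name").text = rec.findtext("surname").title()
        ET.SubElement(person, "birth_date").text = us_date_to_iso(
            rec.findtext("birth_date")
        )
    return target


clinic_xml = to_xml("clinic", "patient", clinic_rows)
billing_xml = to_xml("billing", "account", billing_rows)
merged = merge(clinic_xml, billing_xml)
print(ET.tostring(merged, encoding="unicode"))
```

The merged output still holds one record per source row. Deciding which records describe the same case is a separate step.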

Structure of the report

Your report should have the following sections:

  1. Names of members of the group
  2. Permission to display your project on the web for other students (grade is not affected by this)
  3. The problem statement
    • Text introduction to each database
    • XML capture of the database
    • XSD for each database
  4. The solution
    • Statement of issues in combining fields
      • Code for XSLT transformations
      • Discuss how you used the XSD to create the XSLT
    • Filtering so that the same cases have the same record number
      • Discuss your choice of identifiers for the filtering process
      • Document one case as shown in lecture (give the posterior odds and likelihood ratios for 5 training data set records).
    • Load the transformed data set into the target data set
    • Resulting transformed data set
  5. Summary and conclusions (answer the following questions)
    • What was surprising in integrating the data in your case?
    • What did you learn from the exercise of combining data from these two databases?
    • Is data integration important for your institution?
    • Is the process described in the course practical?
    • Could you do this process on a large database?
    • What do you think about the viability of data integration at your institution?
    • How confident are you that you can manage or contract out a large database integration process?
  6. How could the course be improved for the next group of students?
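
For the filtering item in section 4 of the report, the sketch below shows one possible way to compute likelihood ratios and posterior odds from 5 training records. It relies on the odds form of Bayes' rule (posterior odds = prior odds × product of likelihood ratios) and assumes the compared fields are independent. The method shown in lecture may differ. The field names, training records, prior odds, and add-one smoothing are all assumptions made for illustration.

```python
# Hypothetical training set: 5 record pairs, each noting whether two fields
# agree and whether the pair is known to describe the same case.
training = [
    {"last_name": True, "birth_date": True, "same_case": True},
    {"last_name": True, "birth_date": True, "same_case": True},
    {"last_name": True, "birth_date": False, "same_case": True},
    {"last_name": True, "birth_date": False, "same_case": False},
    {"last_name": False, "birth_date": False, "same_case": False},
]


def agreement_rate(pairs, field):
    """Share of pairs agreeing on `field`, with add-one smoothing
    (an assumption here, used to avoid dividing by zero on tiny samples)."""
    agree = sum(1 for pair in pairs if pair[field])
    return (agree + 1) / (len(pairs) + 2)


def likelihood_ratio(training, field, observed_agree):
    """P(observation | same case) / P(observation | different cases)."""
    matches = [p for p in training if p["same_case"]]
    non_matches = [p for p in training if not p["same_case"]]
    p_match = agreement_rate(matches, field)
    p_non = agreement_rate(non_matches, field)
    if observed_agree:
        return p_match / p_non
    return (1 - p_match) / (1 - p_non)


def posterior_odds(training, observation, prior_odds):
    """Multiply the prior odds by one likelihood ratio per compared field."""
    odds = prior_odds
    for field, agrees in observation.items():
        odds *= likelihood_ratio(training, field, agrees)
    return odds


# One candidate pair that agrees on both fields; the prior odds are hypothetical.
candidate = {"last_name": True, "birth_date": True}
lr_name = likelihood_ratio(training, "last_name", True)
lr_birth = likelihood_ratio(training, "birth_date", True)
odds = posterior_odds(training, candidate, prior_odds=1.5)
print(lr_name, lr_birth, odds)
```

Posterior odds above 1 favor treating the two records as the same case. The cutoff for assigning a shared record number is a design choice to justify in the report.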


