Healthcare Databases & Information Systems Course


Topic: Normalization & Entity Relationships

Learning Module Objectives:

After completing the activities in this section, you should be able to:

  • Examine information requirements of an organization
  • Decide on entities that should be modeled in the database
  • Create normalized tables inside a database

Learning Material

  1. Assess information needs Read►
  2. Decide on entity relationships Read► More► Q&A►
  3. Describe relationships as primary and foreign keys Slides► Video►
  4. Create normalized tables Read► Slides► Video►

Teach One Assignment

If you are supposed to teach about this section of the course, select one of the assignments, do the assignment and show it to the instructor to make sure you have done it correctly.  Prepare your slides, narrate your slides, remove excess words from narrated slides, convert narrated slides to a file format that can be uploaded,  upload your file, email everyone in the class the URL of your file.  Make sure that all these tasks are done ahead of scheduled class session.  Your peers will appreciate receiving your advice on how to solve a class assignment as soon as possible and well before last day prior to class session. 

Individual Assignments

No individual assignment should be completed in teams.  Submit your work in Blackboard.  Do not discuss the work with other students. 

Question on ER Diagrams:  Create a many to many relationship between patient table (Patient_Id, first name, Last name) and address table (Address_Id, Apartment, Street, Number, Zip code).  Enter 5 rows of data into each table showing that two individuals can live at the same address and one individual can have two addresses, one current and the other previous address.  Draw the diagram.  Microsoft Access has an easy way to do the drawing or you can use any other software.  Submit the drawing and all 3 tables with at least 5 rows of data entered. 

Questions on Normalization:  Please answer the following 7 questions:

Question 1:  Good database design requires:

a.      Present data in the sequence collected so that it is convenient to use it.

b.      Non-key fields should be facts about the foreign key.

c.     Primary key should uniquely identify the individual occurrences of the entity or relationship

d.      A&B

e.       None of the above.


Question 2:  When a table is in First Normal Form:

a.       Fields consist of atomic data

b.      Fields are repeated in a row.

c.       A&B

d.      Both A&B

e.       None of the above


Question 3:  If knowing Physician Name is enough for knowing Physician ID, then which one of the following statements is true:

  1. Physician Name is functionally dependent on Physician ID

  2. Both are functionally independent of each other

  3. Physician ID is functionally dependent on Physician Name

  4. Physician ID is the primary key

  5. A&D


Question 4:  If the field, ‘Employee_Name’ is fully functionally dependent on the field ‘Employee_Number’ then which one of the following statements is not correct:

  1. Given an employee name we can find the employee number

  2. Given an employee number we can find the employee name

  3. A only

  4. A&B

  5. Neither A or B is correct.

Question 5:  When we talk about tables in Second Normal Form, we can say that:

  1. A foreign key is always the same as the primary key.

  2. The primary key is the minimum subset necessary to determine each fact in the table.

  3. The last two fields will always depend on the concatenated primary key.

  4. B&C

  5. All of the above.


Question 6:  Which of the following are good rules of design for normalization:


  1. Each table should refer to multiple entities

  2. Composite or repeating fields should always be used.

  3. The primary key should uniquely identify the entity.

  4. A&B

  5. All of the above


Question 7:  A good database designer might not put the field ‘Visit_Date’ in the Patient table because:

  1. Date fields are never put in a Patient table.

  2. The field Visit_Date might not depend on the primary key of the Patient table.

  3. The field Visit_Date will expire after 7 days, leaving orphan data in the database.

  4. C only

  5. A&B only

Ongoing Question:  This question is an individual project.  Each week a portion of the work is done.  This week explore various components of the predictive model for diabetes.  Make separate predictions for detecting undiagnosed diabetes and forecasting upcoming diabetes. 

  1. Download data  Video► Data► SQL► Slides► Screenshots►
  2. Clean the data, as you had done before for predicting mortality of patients
  3. Use the model that you created last week for predicting diabetes from diagnoses and age of the patient. 
  4. identify the top 10 diagnoses that are most predictive
  5. Identify a case with 5 diagnoses and report by hand how the risk factors are scored to produce the overall probability of diabetes
  6. In the validation data set, calculate what percent of patients are undiagnosed diabetics.  To accomplish this task, use only likelihood ratios for diagnoses that in the training set occur after diabetes.  
  7. In the validation data set, calculate what percent of patients will develop diabetes in 6 months.    To accomplish this task, use only likelihood ratios for diagnoses that in the training set occur prior to diabetes. 

Team Assignment

There are no team assignments this week.

Copyright © Farrokh Alemi, Ph.D. First created on January 9th 2005. Most recent revision 08/29/2018. This page is part of the course on Healthcare Databases.