Healthcare Databases & Information Systems Course


Topic: Repeated SQL & Query Optimization

Let’s connect the dots about Tracey and Super Computer.  In this situation, Tracey went to the doctor because she was being given medications that were just not working.  The computer arrived and promised that he could save the day. By mining data, the Computer investigated Tracey’s health problems and cross-referenced it to other cases with same problems.  It then found what worked among these cases in the data. Of course, no case in the data is exactly like Tracey. The computer didn’t need to locate a patient that has 100% exactly like Tracey.  No cases matches all the relevant features of the patient. The solution is to find cases that nearly match the patient and then model what it would look like if all features were matched. Then these models can be used to forecast patients' reaction to the medications.  The computer determines Tracey’s likely success based on experience of similar cases with various mediations! And the best part? Tracey doesn’t have to be a guinea pig! She can simply take the medication and know that it will make her feel better.

Learning Module Objectives:

After completing the activities in this section, you should be able to:

  • Repeatedly execute SQL
  • Analyze data one row at a time
  • Use do-while command
  • Find cases that fit subset of patient's features.
  • Improve computational speed through indexing

Learning Material

  1. Cursor and do-while commands Slides► Video►
  2. Query Optimization Slides► Video►

Teach One Assignment

If you are supposed to teach about this section of the course, select one of the assignments, do the assignment and show it to the instructor to make sure you have done it correctly.  Prepare your slides, narrate your slides, remove excess words from narrated slides, convert narrated slides to a file format that can be uploaded,  upload your file, email everyone in the class the URL of your file.  Make sure that all these tasks are done ahead of scheduled class session.  Your peers will appreciate receiving your advice on how to solve a class assignment as soon as possible and well before last day prior to class session. 

Individual Assignments

No individual assignment should be completed in teams.  Submit your work in Blackboard.  Do not discuss the work with other students. 

Question Related to Cursor and Do-While Commands:  Refer to the patient 457631 that was analyzed in a previous assignment.  This patient has been hospitalized at age 58.08 with the 5 diagnoses IV17.3, I401.9, I278.00, I786.05, I272.4, calculate how many patients show in the data with 2, 3, 4, and 5 combinations of these diagnoses.  Use a do-while command to go through the various combination of these diagnoses. Calculate the odds of mortality for each of these combination of diagnoses.  Data►

Question Related to Query Optimization:  Use indexes defined on id, icd9, and AgeAtDx to make the code that you and teammates had prepared for predicting mortality.  Report the computational efficiency of the code before and after indexing.  Data►

Ongoing Question: This question is an individual project.  Each week a portion of the work is done.  This week calculate accuracy of predicting diabetes using diagnoses, age and gender as predictors. 

  1. Download data  Video► Data► SQL► Slides► Screenshots►
  2. Clean the data, as you had done before for predicting mortality of patients
  3. Use the predictive model for diabetes that you previously developed. 
  4. Calculate the accuracy of predictions of diabetes in the validation data set.  Create a receiver operating curve 

Team Assignment

There are no team assignments this week.

Copyright © Farrokh Alemi, Ph.D. First created on January 9th 2005. Most recent revision 08/29/2018. This page is part of the course on Healthcare Databases.