Georgetown University


Benchmarking & Clinician Profiles

Assigned Reading

Use data balancing to benchmark clinicians (use instructor's last name as password) Read►



Submit one file for all questions. Include all charts, code, and output in the same file. Start each question on a separate page or sheet. Make the first page a summary page. On the summary page, write statements comparing your work to the answers given or to the videos. For example, "I got the same answers as the Teach One video for question 1."

Question 1: In the following question, use SQL to analyze the data.  Assume that we have followed two clinicians, Smith and Jones, and constructed the decision trees in Figure 1. Data► Atoosa's Excel► Pooja's Video► SQL Code►

Figure 1: Practice Patterns of Dr. Jones and Dr. Smith

  • What is the expected length of stay for each of the clinicians?
  • What is the expected length of stay for Dr. Smith if he were to take care of patients of Dr. Jones?
  • What is the expected length of stay for Dr. Jones if he were to take care of patients of Dr. Smith?
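
The calculation behind these three bullets can be sketched as follows. This is a minimal illustration, not the course's SQL solution: the table name `tree`, its columns, and every number in it are hypothetical and are not the values in Figure 1. It assumes each clinician's tree can be reduced to patient types, each with a probability and a mean length of stay. The expected length of stay is the probability-weighted sum of stays, and the "switched" value keeps one clinician's stays but weights them by the other clinician's patient mix.

```python
import sqlite3

# Hypothetical numbers for illustration only; they are NOT the values in Figure 1.
rows = [
    # clinician, patient type, probability of that type, mean length of stay (days)
    ("Smith", "severe", 0.25, 8.0),
    ("Smith", "mild",   0.75, 4.0),
    ("Jones", "severe", 0.50, 6.0),
    ("Jones", "mild",   0.50, 5.0),
]

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE tree (clinician TEXT, ptype TEXT, prob REAL, los REAL)")
con.executemany("INSERT INTO tree VALUES (?, ?, ?, ?)", rows)

# Expected length of stay for each clinician with his own patients.
own = dict(con.execute(
    "SELECT clinician, SUM(prob * los) FROM tree GROUP BY clinician"
).fetchall())

# Counterfactual: keep clinician a's lengths of stay (a.los) but weight them
# by the other clinician's patient mix (b.prob), matching on patient type.
switched = dict(con.execute("""
    SELECT a.clinician, SUM(b.prob * a.los)
    FROM tree AS a
    JOIN tree AS b
      ON a.ptype = b.ptype AND a.clinician <> b.clinician
    GROUP BY a.clinician
""").fetchall())

print("own patients:", own)
print("other clinician's patients:", switched)
```

With these made-up inputs, a clinician can look better or worse once the patient mix is switched, which is the point of the comparison.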

Question 2:  The following data report patients with 10 Diagnosis Related Groups and 3 HCC indices cared for by a clinician and his peer group.  Data►

  •  Regress "Cared for by Dr. Smith" on the HCC and other patient characteristics.  What types of patients are more likely to be cared for by Dr. Smith?  Regression Results►
  • Using regression results, identify which variables are likely to be in the Markov Blanket of the variable "Cared for by Dr. Smith".
  •  Use SQL to determine if the clinician is more efficient than his peer group. SQL► Mai's Teach One►
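
One way to approach the last bullet is a stratified comparison, sketched below under stated assumptions. The table `visits`, its columns (`drg`, `hcc`, `smith`, `cost`), and all values are hypothetical; the assignment data may use a different layout and a different efficiency measure. The query keeps only strata in which both Dr. Smith and the peer group have patients, then weights the peer group's average in each stratum by the number of Dr. Smith's patients, so that both averages refer to the same patient mix.

```python
import sqlite3

# Hypothetical rows: (drg, hcc, smith, cost). Names and values are invented
# for illustration; the assignment data may be laid out differently.
rows = [
    ("DRG1", 1, 1, 100.0), ("DRG1", 1, 1, 120.0),
    ("DRG1", 1, 0, 150.0), ("DRG1", 1, 0, 130.0),
    ("DRG2", 0, 1, 200.0), ("DRG2", 0, 0, 180.0),
    ("DRG3", 1, 0, 300.0),  # no Smith patient in this stratum, so it drops out
]

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE visits (drg TEXT, hcc INTEGER, smith INTEGER, cost REAL)")
con.executemany("INSERT INTO visits VALUES (?, ?, ?, ?)", rows)

query = """
WITH strata AS (
    SELECT drg, hcc,
           SUM(smith) AS n_smith,
           AVG(CASE WHEN smith = 1 THEN cost END) AS smith_cost,
           AVG(CASE WHEN smith = 0 THEN cost END) AS peer_cost
    FROM visits
    GROUP BY drg, hcc
    HAVING SUM(smith) > 0 AND SUM(1 - smith) > 0  -- strata shared by both groups
)
SELECT SUM(n_smith * smith_cost) / SUM(n_smith),  -- Smith on his own patients
       SUM(n_smith * peer_cost)  / SUM(n_smith)   -- peers on Smith's patient mix
FROM strata
"""
smith_avg, peer_avg = con.execute(query).fetchone()
print(smith_avg, peer_avg)
```

If the first number is lower than the second, the clinician is less costly than his peers for the same types of patients, at least within the strata the two groups share.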

Question 3: The following data show the variation in diabetes in select counties across the United States.  Using stratified covariate balancing, report the impact of access to supermarkets on diabetes after controlling for other variables. Data►

  1. Check that all variables are positively and monotonically related to the prevalence of diabetes in the county. Monotone?►
  2. Assign a binary variable to each variable in such a manner that when the variable is 1, diabetes is more likely.
  3. Drop from the analysis covariates that are not parents in the Markov Blanket of diabetes.  Accomplish this task using the following steps:
    • Regress diabetes on all variables (with no interaction terms in the model), and identify variables that are significant predictors of diabetes and have a large effect size.
    • Do a second regression, verifying that no interaction terms that involve the significant variables are predictors of diabetes (that is, are statistically significant with a large effect size). Include on the list of parents in the Markov Blanket any variable whose interactions are predictive of diabetes.
  4. Calculate the impact of access to food sources on diabetes, while controlling for other variables.   Accomplish this task by stratifying the variables identified as parents in the Markov Blanket, then switching the distribution of controls (low-diabetic counties) with the distribution of cases (high-diabetic counties).
  5. Report overlap and impact of food access on diabetes.
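
Steps 4 and 5 can be sketched roughly as follows. This is one simple reading of the distribution switch, not necessarily the exact statistic the course expects (which may, for example, be an odds ratio). The county records, the two-covariate strata, and the variable names are all hypothetical. Controls are reweighted so that each covariate stratum counts as often as it does among cases; overlap is taken here as the share of cases that fall in strata that also contain controls.

```python
from collections import defaultdict

# Hypothetical counties: (covariate stratum, high_diabetes, poor_food_access).
# The stratum is a tuple of binary covariates kept as parents in the Markov Blanket.
counties = [
    ((1, 0), 1, 1), ((1, 0), 1, 1), ((1, 0), 1, 0), ((1, 0), 0, 1), ((1, 0), 0, 0),
    ((0, 0), 1, 1), ((0, 0), 0, 0), ((0, 0), 0, 0),
    ((1, 1), 1, 1),  # cases only: no matching controls, so outside the overlap
]

# stratum -> {case flag: [number of counties, number with poor food access]}
strata = defaultdict(lambda: {0: [0, 0], 1: [0, 0]})
for key, case, exposed in counties:
    strata[key][case][0] += 1
    strata[key][case][1] += exposed

total_cases = sum(s[1][0] for s in strata.values())
matched = [s for s in strata.values() if s[0][0] > 0 and s[1][0] > 0]
matched_cases = sum(s[1][0] for s in matched)

# Share of cases found in strata that also contain controls.
overlap = matched_cases / total_cases

# Rate of poor food access among matched cases.
case_rate = sum(s[1][1] for s in matched) / matched_cases

# Same rate among controls, reweighted so that each stratum counts as often
# as it does among cases (the distribution switch).
control_rate = sum(s[1][0] * s[0][1] / s[0][0] for s in matched) / matched_cases

impact = case_rate - control_rate
print(overlap, case_rate, control_rate, impact)
```

A low overlap is a warning that the reported impact rests on only a small part of the data.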

Question 4:  Predict the 6-month mortality rate for 80-year-old residents with walking and toileting disabilities but no other disabilities.  Remove all events that occur after the patient is unable to walk and toilet. Verify that all variables lead to increased mortality. If not, rename the variables so that when a variable is assigned the value of 1, it indicates a higher probability of mortality. Analyze the data using the following two methods:

  • Use of synthetic cases.  In the database, there are fewer than 30 cases with these two disabilities and 80 years of age.  Therefore, we would like you to estimate the survival days using synthetic case outcomes. You can create a synthetic case from 80-year-olds who are unable to walk and residents who are unable to toilet.
  • Use of closest frequent strata.  Stratify the data, using age, gender, and disabilities.  Identify a partial match as a stratum in which patients have fewer disabilities than the target patient.  Identify an excess match as a stratum in which patients have more disabilities than the target patient. Calculate the patient's mortality rate as the average of the maximum of the partial matches and the minimum of the excess matches. SQL►
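
The closest frequent strata calculation in the second bullet can be sketched as below. The strata, their counts, and their mortality rates are invented for illustration, and all are assumed to describe 80-year-olds of the same gender. The 30-case cutoff for a "frequent" stratum is an assumption taken from the wording of the first bullet. A partial match is treated here as a stratum whose disabilities are a proper subset of the target's, and an excess match as a proper superset.

```python
MIN_CASES = 30  # assumed cutoff for a "frequent" stratum

target = frozenset({"uWalk", "uToilet"})

# Hypothetical strata: set of disabilities -> (number of cases, 6-month mortality rate)
strata = {
    frozenset(): (400, 0.10),
    frozenset({"uWalk"}): (120, 0.20),
    frozenset({"uToilet"}): (90, 0.25),
    frozenset({"uWalk", "uToilet"}): (12, 0.50),  # exact match, but too few cases
    frozenset({"uWalk", "uToilet", "uBathe"}): (60, 0.40),
    frozenset({"uWalk", "uToilet", "uBathe", "uDress"}): (45, 0.55),
    frozenset({"uEat"}): (50, 0.30),  # neither a subset nor a superset of the target
}

# Keep only strata with enough cases to trust their rates.
frequent = {k: rate for k, (n, rate) in strata.items() if n >= MIN_CASES}

# Partial matches have fewer disabilities than the target (proper subsets);
# excess matches have more (proper supersets).
partial = [rate for k, rate in frequent.items() if k < target]
excess = [rate for k, rate in frequent.items() if k > target]

# Average of the highest partial-match rate and the lowest excess-match rate.
estimate = (max(partial) + min(excess)) / 2
print(round(estimate, 3))
```

The maximum of the partial matches acts as a lower bound and the minimum of the excess matches as an upper bound, which is reasonable only after every variable has been coded so that a value of 1 raises mortality.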

Use the following dictionary of variables to create a header for the data. Data► Adel's Teach One►

Order  Variable     Description
1      ID           Resident's ID
2      Age          Age at first assessment
3      Sex          Gender of resident
4      tAssess      Number of assessments
5      Followed     Days resident followed
6      DaysFirst    Days from first assessment
7      DaysLast     Days to last assessment
8      uEat         Unable to eat
9      uSit         Unable to sit
10     uGroom       Unable to groom
11     uToilet      Unable to toilet
12     uBathe       Unable to bathe
13     uWalk        Unable to walk
14     uDress       Unable to dress
15     uBowel       Bowel incontinent
16     uUrine       Urine incontinent
17     EverDead     Patient dead at one point in time
18     AssessID     Assessment ID
19     Dead6Months  Dead within 6 months of assessment
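
Attaching the header to the data might look like the sketch below. It assumes the data file is comma-delimited and has no header row; the single sample row is invented and stands in for the real file. The column names are taken, in order, from the dictionary above.

```python
import csv
import io

# Column names in the order given by the variable dictionary.
HEADER = [
    "ID", "Age", "Sex", "tAssess", "Followed", "DaysFirst", "DaysLast",
    "uEat", "uSit", "uGroom", "uToilet", "uBathe", "uWalk", "uDress",
    "uBowel", "uUrine", "EverDead", "AssessID", "Dead6Months",
]

# Stand-in for the headerless data file; the values are invented.
raw = io.StringIO("1,80,M,3,400,0,380,0,0,0,1,0,1,0,0,0,1,101,0\n")

# DictReader applies HEADER as the field names instead of reading a header row.
records = list(csv.DictReader(raw, fieldnames=HEADER))
print(records[0]["Age"], records[0]["uWalk"], records[0]["uToilet"])
```

To read the actual file, replace `raw` with an open file handle, and adjust the delimiter if the file is not comma-separated.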

Question 5:  The following data show the recovery from various disabilities in two nursing homes.  Two sets of data are presented.  The first set shows the disabilities of the patients at admission to the nursing home, using variables that start with "u", standing for "unable".  The recovery from the disabilities is also shown, in variables that start with "r".  Compare the performance of these two nursing homes using the distribution switch method.  In particular, switch the distribution for age, gender, and 9 disabilities on admission.  The outcome of interest is the number of disabilities recovered from (variable shown as nRecovery).  Use the synthetic method to estimate the outcome for cases not present in both nursing homes. Which nursing home has better outcomes for its own residents?  If residents of nursing home A were cared for at nursing home B, which nursing home would have better outcomes?  What happens in the reverse case?  Data► SQL & Answer►


  1. Practice profiling PubMed► 
  2. Importance of risk adjustment in measuring performance in primary care PubMed►

Prepared by Farrokh Alemi, Ph.D. This page is part of the course on Statistical Process Improvement