Benchmarking & Clinician Profiles
Use data balancing to benchmark clinicians (use instructor's last
name as password)
Submit one file for all questions. Include all charts, code, and
output in the same file. Start each question on a separate page or
sheet. Make the first page a summary page. On the summary page,
write statements comparing your work to the answers given or to the videos. For
example, "I got the same answers as the Teach One video for question 1."
Question 1: Use SQL to analyze the data. Assume that we have followed two clinicians, Smith
and Jones, and constructed the decision trees in Figure 1.
Figure 1: Practice Patterns of Drs. Jones and Smith
- What is the expected length of stay for each of the clinicians?
- What is the expected length of stay for Dr. Smith if he were to take care of
patients of Dr. Jones?
- What is the expected length of stay for Dr. Jones if he were to take care of
patients of Dr. Smith?
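The three questions above can be sketched with a self-join in SQL, run here through Python's built-in sqlite3 module. Figure 1 is not reproduced on this page, so the table name `tree`, its columns, the patient types, and every probability and length of stay below are hypothetical placeholders; substitute the values from the decision trees. The idea is that a clinician's expected length of stay is the sum, over patient types, of the probability of the type times the length of stay for that type, and that the "switch" keeps one clinician's outcomes while using the other clinician's patient mix.

```python
import sqlite3

# Hypothetical data: these rows are invented placeholders, not Figure 1.
# Each row is one branch of a clinician's decision tree.
rows = [
    # clinician, patient type, probability of that type, mean length of stay (days)
    ("Smith", "severe", 0.6, 6.0),
    ("Smith", "mild",   0.4, 3.0),
    ("Jones", "severe", 0.2, 7.0),
    ("Jones", "mild",   0.8, 2.0),
]

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE tree (clinician TEXT, ptype TEXT, prob REAL, los REAL)")
con.executemany("INSERT INTO tree VALUES (?, ?, ?, ?)", rows)

# Join each outcome row (o) to each case-mix row (m) with the same patient type.
# When o.clinician = m.clinician the sum is the clinician's own expected stay;
# otherwise it is the expected stay on the other clinician's patients.
query = """
SELECT o.clinician AS treated_by,
       m.clinician AS patients_of,
       SUM(m.prob * o.los) AS expected_los
FROM tree AS o
JOIN tree AS m ON o.ptype = m.ptype
GROUP BY o.clinician, m.clinician
ORDER BY o.clinician, m.clinician
"""
expected = {(t, p): los for t, p, los in con.execute(query)}

for (treated_by, patients_of), los in expected.items():
    print(f"{treated_by} treating patients of {patients_of}: {los:.2f} days")
```

With these made-up numbers, Dr. Smith looks slower on his own patients only because he sees more severe cases; on the same patients he has the shorter stay. The real trees may of course tell a different story.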
Question 2: The following data report
patients with 10 Diagnosis Related Groups and 3 HCC indices cared for by a clinician and
his peer group.
- Regress "Cared for by Dr. Smith" on the HCC and other
patient characteristics. What type of patients are more likely to
be cared for by Dr. Smith? Regression Results►
- Using regression results, identify which variables are likely to
be in the Markov Blanket of the variable "Cared for by Dr. Smith".
- Use SQL to determine if the clinician is more efficient than
his peer group.
Mai's Teach One►
Question 3: The following data show the variation in
diabetes in select counties across the United States. Using stratified
covariate balancing, report the impact of access to supermarkets on
diabetes after controlling for other variables.
- Check that all variables are positively and monotonically related to the prevalence of diabetes in the county.
- Assign a binary variable to each variable in such a manner that
when the variable is 1, diabetes is more likely.
- Drop from the analysis covariates that are not parents in the Markov
Blanket of diabetes. Accomplish this task using the following steps:
- Regress diabetes on all variables (with no interaction terms
in the model). Identify variables that are significant predictors
of diabetes and have a large effect size.
- Do a second regression, verifying that no interaction terms
that involve the significant variables are predictors of diabetes
(have a statistically significant and large effect size). Include
on the list of parents in the Markov Blanket any variable whose
interaction is predictive of diabetes.
- Calculate the impact of access to food sources on diabetes while controlling for other variables. Accomplish
this task by stratifying on the variables identified as parents in the
Markov Blanket, then switching the distribution of controls (low-diabetic
counties) with the distribution of cases (high-diabetic counties).
- Report overlap and impact of food access on diabetes.
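The stratify-and-switch steps above can be sketched as follows. This is one possible reading of the procedure, not the course's reference solution: it assumes every variable has already been recoded to binary, treats high-diabetes counties as cases and low-diabetes counties as controls, reweights the controls to the cases' distribution across strata, and reports impact as the difference in the rate of poor food access. The two covariates and all county records are hypothetical.

```python
from collections import defaultdict

# Hypothetical county records, all binary and invented for illustration:
# (poor_food_access, covariate_1, covariate_2, high_diabetes)
counties = [
    (1, 1, 1, 1), (1, 1, 1, 1), (0, 1, 1, 1), (1, 1, 0, 1),
    (0, 1, 0, 1), (1, 0, 0, 1),
    (1, 1, 1, 0), (0, 1, 1, 0), (0, 1, 0, 0), (0, 1, 0, 0),
    (0, 0, 1, 0), (1, 0, 1, 0),
]

# One stratum per combination of the covariates kept as Markov Blanket parents.
strata = defaultdict(lambda: {"cases": 0, "cases_exposed": 0,
                              "controls": 0, "controls_exposed": 0})
for exposed, c1, c2, is_case in counties:
    group = "cases" if is_case else "controls"
    strata[(c1, c2)][group] += 1
    strata[(c1, c2)][group + "_exposed"] += exposed

# Only strata containing both cases and controls can be compared.
matched = [s for s in strata.values() if s["cases"] > 0 and s["controls"] > 0]
total_cases = sum(s["cases"] for s in strata.values())
matched_cases = sum(s["cases"] for s in matched)

# Overlap: share of cases that fall in a stratum that also has controls.
overlap = matched_cases / total_cases

# Exposure rate among matched cases.
case_rate = sum(s["cases_exposed"] for s in matched) / matched_cases

# Exposure rate among controls after switching to the cases' distribution:
# each stratum's control rate is weighted by the number of cases in it.
control_rate = sum(
    s["cases"] * s["controls_exposed"] / s["controls"] for s in matched
) / matched_cases

impact = case_rate - control_rate
print(f"overlap = {overlap:.2f}, impact = {impact:.2f}")
```

An odds ratio computed from the same weighted rates would be an equally reasonable way to report impact; the difference in rates is used here only because it is the simplest to check by hand.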
Question 4: Predict the 6-month mortality
rate for 80-year-old residents with walking and toileting
disabilities but no other disabilities. Remove all events that occur
after the patient is unable to walk and toilet. Verify that all variables
lead to increased mortality. If not, rename the variables so that when a
variable is assigned the value of 1, it indicates a higher probability of
mortality. Analyze the data using the following two methods:
- Use of synthetic cases. The database contains fewer than
30 cases of 80-year-olds with these two disabilities.
Therefore, we would like you to estimate the survival days using synthetic
case outcomes. You can create a synthetic case from 80-year-olds who
are unable to walk and residents who are unable to toilet.
- Use of closest frequent strata.
Stratify the data using age, gender, and disabilities.
Identify partial matches as strata in which patients have fewer
disabilities than the target patient. Identify excess matches as
strata in which patients have more disabilities than the target patient.
Calculate the patient's mortality rate as the average of the maximum of
the partial matches and the minimum of the excess matches.
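The closest frequent strata calculation can be sketched as below. The strata counts are invented, and two choices are assumptions rather than part of the assignment: "fewer disabilities" is read as a proper subset of the target's disabilities (and "more" as a proper superset), and a stratum counts as frequent when it has at least 30 cases, echoing the 30-case threshold mentioned for synthetic cases. The function name `closest_frequent_estimate` is hypothetical.

```python
# Hypothetical strata for 80-year-old residents (age and gender held fixed):
# set of disabilities -> (number of cases, deaths within 6 months)
strata = {
    frozenset(): (400, 40),
    frozenset({"walk"}): (120, 24),
    frozenset({"toilet"}): (90, 27),
    frozenset({"walk", "toilet", "bathe"}): (60, 30),
    frozenset({"walk", "toilet", "dress", "bathe"}): (45, 27),
    frozenset({"walk", "toilet", "eat"}): (10, 9),  # too few cases to be frequent
}
target = frozenset({"walk", "toilet"})
MIN_CASES = 30  # assumed cutoff for a "frequent" stratum


def closest_frequent_estimate(strata, target, min_cases=MIN_CASES):
    """Average the highest partial-match rate and the lowest excess-match rate."""
    partial, excess = [], []
    for disabilities, (cases, deaths) in strata.items():
        if cases < min_cases:
            continue  # skip infrequent strata
        rate = deaths / cases
        if disabilities < target:      # proper subset: fewer disabilities
            partial.append(rate)
        elif disabilities > target:    # proper superset: more disabilities
            excess.append(rate)
    if not partial or not excess:
        raise ValueError("need at least one partial and one excess match")
    return (max(partial) + min(excess)) / 2


estimate = closest_frequent_estimate(strata, target)
print(f"estimated 6-month mortality: {estimate:.2f}")
```

The maximum partial match and the minimum excess match bracket the target from below and above, which is why they are averaged, provided every disability has been coded so that it raises mortality.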
Use the following
dictionary of variables to create a header for the data.
Adel's Teach One►
- Age at first assessment
- Gender of resident
- Number of assessments
- Days resident followed
- Days from first assessment
- Days to last assessment
- Unable to eat
- Unable to sit
- Unable to groom
- Unable to toilet
- Unable to bathe
- Unable to walk
- Unable to dress
- Patient dead at one point in time
- Dead within 6 months of assessment
Question 5: The following data show the recovery from various
disabilities in two nursing homes. Two sets of data are presented.
The first set shows the disabilities of the patients at admission to the
nursing home, using variables that start with "u", standing for "unable".
The recovery from the disabilities is also shown in variables that start
with "r". Compare the performance of these two nursing homes using the
distribution switch method. In particular, switch the distribution
for age, gender, and 9 disabilities on admission. The outcome
of interest is the number of disabilities recovered from (variable shown
as nRecovery). Use the synthetic method to estimate outcomes for cases not
present in both nursing homes. Which nursing home has better
outcomes for its own residents? If residents of nursing
home A were cared for at nursing home B, which nursing home would have
better outcomes? What happens in the reverse case?
SQL & Answer►
- Practice profiling PubMed►
- Importance of risk adjustment in measuring performance in primary care PubMed►