Benchmarking Clinicians

This week we discuss benchmarking and how you can compare apples to apples and oranges to oranges. Clinicians and their peer groups differ in the types of patients they see. We will show you how to use data balancing to compare them on the same types of patients. We motivate the concepts using decision trees, then move quickly to Structured Query Language (SQL) for large-scale data analysis.

This week is unusual for a course on process improvement. Statistical process control classes rarely talk about benchmarking. When they do, they rarely talk about data balancing. This material is at the frontier of the field. Stay with me as we go through its theoretical basis. Concepts like distribution switches and synthetic cases may seem new and somewhat esoteric, but the end result is practical and useful.

Assigned Reading

Chapter 17 in Big Data in Health Care: Statistical Analysis of Electronic Health Record Read►

Presentations

Use data balancing to benchmark clinicians Slides► YouTube► Video► Video Transcripts►
How to provide feedback to clinicians Slides► YouTube► Video►

Assignments

Instructions for Submission of Assignments: Assignments should be submitted directly on Blackboard. In rare situations, assignments can be sent directly by email to the instructor. Submissions should follow these rules:

Make sure that any control charts follow these visual rules: (1) control limits must be in red and without markers, (2) observed lines must have markers, (3) the X and Y axes must be labeled, and (4) charts must be linked to the data.
Submit a summary file or put a summary at the start of your response file. In the summary, list how your answers to each question differ from the answers provided within the assignment (inside Teach One or other answers). For each question, you must indicate whether your control chart is exactly the same as the one seen in Teach One or other formats, and whether the answer you have provided is the same as the answer supplied on the web. If no answers are provided, you must indicate that there were no answers available on the web to compare your answers to.

Question 1: In the following question, use SQL, Python, or Excel to analyze the data. Assume that we want to compare two clinicians, Smith and Jones, who see different types of patients at different rates and with different lengths of stay. The data for each clinician are provided in the two decision trees in Figure 1.

See Exhibit 17.5 on page 419 in the required book for how to estimate the missing length of stay for Dr. Smith.
Download the data, or copy and paste it from the table below Data►

Strata	Previous MI	CHF	Shock	Clinician	Previous MI Rate	CHF Rate	Shock Rate	Probability	Length of Stay
1	1	1		Smith	0.25	0.15	1	0.0375	4.5
2	1	0	1	Smith	0.25	0.85	0.7	0.14875	5
3	1	0	0	Smith	0.25	0.85	0.3	0.06375	4
4	0	1	1	Smith	0.75	0.7	0.85	0.44625	3
5	0	1	0	Smith	0.75	0.7	0.15	0.07875	5
6	0	0	1	Smith	0.75	0.3	0.25	0.05625	3
7	0	0	0	Smith	0.75	0.3	0.75	0.16875	4
8	1	1	1	Jones	0.85	0.65	0.3	0.16575	7
9	1	1	0	Jones	0.85	0.65	0.7	0.38675	5
10	1	0	1	Jones	0.85	0.35	0.3	0.08925	4
11	1	0	0	Jones	0.85	0.35	0.7	0.20825	3
12	0	1	1	Jones	0.15	0.7	0.45	0.04725	4
13	0	1	0	Jones	0.15	0.7	0.55	0.05775	5
14	0	0	1	Jones	0.15	0.3	0.65	0.02925	3
15	0	0	0	Jones	0.15	0.3	0.35	0.01575	2

Figure 1: Practice Patterns of Drs. Jones and Smith

What is the expected length of stay for each of the clinicians?
What is the expected length of stay for Dr. Smith if he were to take care of patients of Dr. Jones?
What is the expected length of stay for Dr. Jones if he were to take care of patients of Dr. Smith?
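A minimal Python sketch of the arithmetic behind these three questions may help. It treats each clinician's expected length of stay as the probability-weighted sum over strata, and a distribution switch as one clinician's stratum probabilities paired with the other clinician's length of stay. The names SMITH, JONES, expected_los, and switched_los are hypothetical. The blank Shock cell in stratum 1 is coded as None, which is an assumption about how to read the table. Strata with no direct match are reported rather than filled in, so that they can be estimated separately as described in the required book.

```python
# (previous_mi, chf, shock) -> (probability, length of stay), copied from the table.
# Assumption: the blank Shock cell in stratum 1 is coded as None.
SMITH = {
    (1, 1, None): (0.0375, 4.5),
    (1, 0, 1): (0.14875, 5),
    (1, 0, 0): (0.06375, 4),
    (0, 1, 1): (0.44625, 3),
    (0, 1, 0): (0.07875, 5),
    (0, 0, 1): (0.05625, 3),
    (0, 0, 0): (0.16875, 4),
}
JONES = {
    (1, 1, 1): (0.16575, 7),
    (1, 1, 0): (0.38675, 5),
    (1, 0, 1): (0.08925, 4),
    (1, 0, 0): (0.20825, 3),
    (0, 1, 1): (0.04725, 4),
    (0, 1, 0): (0.05775, 5),
    (0, 0, 1): (0.02925, 3),
    (0, 0, 0): (0.01575, 2),
}


def expected_los(strata):
    """Probability-weighted average length of stay for one clinician."""
    return sum(prob * los for prob, los in strata.values())


def switched_los(outcomes_of, case_mix_of):
    """Pair one clinician's stratum probabilities with another's length of stay.

    Returns the contribution of the matched strata and a list of the strata
    that the first clinician never sees (these need a separate estimate).
    """
    matched_sum = 0.0
    unmatched = []
    for key, (prob, _) in case_mix_of.items():
        if key in outcomes_of:
            matched_sum += prob * outcomes_of[key][1]
        else:
            unmatched.append(key)
    return matched_sum, unmatched


print("Smith, own patients:", expected_los(SMITH))
print("Jones, own patients:", expected_los(JONES))
print("Smith on Jones's case mix:", switched_los(SMITH, JONES))
print("Jones on Smith's case mix:", switched_los(JONES, SMITH))
```

The switched totals are incomplete until a length of stay is supplied for every unmatched stratum; the sketch deliberately leaves that step to the procedure in the reading.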

Question 2: The following data report patients with 10 Diagnostic Related Groups and 3 HCC indices cared for by a clinician and his peer group. Is the clinician more efficient than his peer group? Contrast the clinician's performance to what would have happened if the peer group had taken care of the clinician's patients.

Download Data►
SQL Code►
Joanne Min's Python Teach One Slides► YouTube►
Zhu's Python Teach One Code► YouTube►

Question 3: The following table shows the observed and expected length of stay for 30 patients. Use a paired comparison of means to test whether the expected and observed lengths of stay are the same. Assuming a normal distribution of length of stay, use a risk-adjusted control chart to plot the data. Make sure that control limits are derived from the expected values and that observations are contrasted to these limits. This analysis can be done using a Tukey or XmR chart, and you need to select the chart that produces tighter control limits. The conclusions you arrive at based on (a) the paired comparison of expected and observed length of stay and (b) the risk-adjusted control chart should be the same if in both situations the control limits are calculated from the same number of cases. Are they?

Data►
Answer►
Baafif's Excel Teach One YouTube►
Python Teach One for Tukey chart Slides► YouTube►
Rasib's XmR Python Teach One Slides► YouTube►
Akbar Soleimani Python Teach One Slides► YouTube►
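The two calculations named in Question 3 can be sketched in Python as follows. This is an illustration under stated assumptions, not the posted answer: the six observed and expected values are made up, the function names paired_t and xmr_limits are hypothetical, and the XmR limits use the conventional constant 2.66 times the average moving range, applied here to the expected values because the question asks for limits derived from them. The course materials may define the risk-adjusted limits differently.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(observed, expected):
    """Paired t statistic and degrees of freedom for observed minus expected."""
    diffs = [o - e for o, e in zip(observed, expected)]
    n = len(diffs)
    t_stat = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t_stat, n - 1


def xmr_limits(values):
    """Lower limit, center line, and upper limit of a conventional XmR chart."""
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    center = mean(values)
    half_width = 2.66 * mean(moving_ranges)
    return center - half_width, center, center + half_width


# Made-up data for illustration only; replace with the 30 patients in the question.
observed = [5, 4, 6, 5, 7, 4]
expected = [4, 4, 5, 5, 6, 5]

t_stat, dof = paired_t(observed, expected)
lower, center, upper = xmr_limits(expected)
outside = [o for o in observed if o < lower or o > upper]

print("t =", t_stat, "with", dof, "degrees of freedom")
print("limits from expected values:", lower, center, upper)
print("observations outside limits:", outside)
```

Compare the t statistic to a t table at the reported degrees of freedom, and compare that conclusion with whether any observation falls outside the limits.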

Question 4: Use the procedure described for outcomes in synthetic cases to estimate the mortality rate for 80-year-old residents with walking and toileting disabilities but no other disabilities. Note that we want to rely on at least 30 cases in making this estimate. The database does not contain 30 cases of 80-year-olds with these two disabilities. Therefore, we would like you to estimate the survival days using synthetic case outcomes. You can create a synthetic case from 80-year-olds who are unable to walk and residents who are unable to toilet. Alternatively, you can select a different set of residents, such as 80-year-olds who are unable to toilet and residents who are unable to walk.

Data Download►
Adel's Teach One YouTube►
Al Qheedan's Excel Teach One YouTube►
Answer Using Regression More►
Ya Ting's Python Teach One Slides► YouTube►
Joanne Min's Python Teach One Slides► YouTube►

Also note that the data do not have headers. Use the following dictionary of variables to create a header for the data:

Order	Variable	Description
1	ID	Resident's ID
2	Age	Age at first assessment
3	Sex	Gender of resident
4	tAssess	Number of assessments
5	Followed	Days resident followed
6	DaysFirst	Days from first assessment
7	DaysLast	Days to last assessment
8	uEat	Unable to eat
9	uSit	Unable to sit
10	uGroom	Unable to groom
11	uToilet	Unable to toilet
12	uBathe	Unable to bathe
13	uWalk	Unable to walk
14	uDress	Unable to dress
15	uBowel	Bowel incontinent
16	uUrine	Urine incontinent
17	EverDead	Patient dead at one point in time
18	AssessID	Assessment ID
19	Dead6Months	Dead within 6 months of assessment
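One way to attach the header in Python is sketched below. The variable names come from the dictionary above, in the listed order. The function name read_with_header is hypothetical, the comma delimiter is an assumption about the download file, and the sample row is made up solely to show the mapping.

```python
import csv
import io

# Column names in the order given by the data dictionary.
COLUMNS = [
    "ID", "Age", "Sex", "tAssess", "Followed", "DaysFirst", "DaysLast",
    "uEat", "uSit", "uGroom", "uToilet", "uBathe", "uWalk", "uDress",
    "uBowel", "uUrine", "EverDead", "AssessID", "Dead6Months",
]


def read_with_header(handle, delimiter=","):
    """Read headerless rows and label each field with its variable name.

    The delimiter is an assumption; change it if the file is tab-separated.
    All values are returned as strings.
    """
    reader = csv.DictReader(handle, fieldnames=COLUMNS, delimiter=delimiter)
    return list(reader)


# Made-up row for illustration only; it is not taken from the course data.
sample = io.StringIO("1,80,F,3,400,0,380,0,0,0,1,0,1,0,0,0,0,101,0\n")
rows = read_with_header(sample)
print(rows[0]["Age"], rows[0]["uToilet"], rows[0]["uWalk"])
```

With a real file, pass an open file handle instead of the StringIO object, for example open("data.csv", newline=""), where the file name is a placeholder.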

Question 5: The following data show the recovery from various disabilities in two nursing homes. Two sets of data are presented. The first set shows the disabilities of the patients at admission to the nursing home, using variables that start with "u", standing for "unable". The recovery from the disabilities is shown in variables that start with "r". Compare the performance of these two nursing homes using the distribution switch method. In particular, switch the distribution for age, gender, and the 9 disabilities on admission. The outcome of interest is the number of disabilities recovered from (variable shown as nRecovery). Use the synthetic method to estimate outcomes for cases not present in both nursing homes. Which nursing home has better outcomes for its own residents? If residents at nursing home A were cared for at nursing home B, which nursing home would have better outcomes? What happens in the reverse case?

Data►

SQL & Answer►

Question 6: The following data report length of stay (LOS) for 10 patients of Dr. Jones and 10 patients of Dr. Smith. What is the expected outcome (average outcome) for Dr. Smith? What is the expected outcome for Dr. Jones if he were seeing Dr. Smith's patients? To answer this question, replace each outcome of Dr. Jones with the average outcome of the same type of patient seen by Dr. Smith. Is Dr. Smith more efficient than Dr. Jones? Make sure that you submit an Excel sheet with formulas for all calculated values.

Data Dr. Jones's► Dr. Smith's► Type of Patients►
Answer►
Pierre-Louis Excel Teach One YouTube►
Maria B's Excel Teach One YouTube►
Amir Hinnaria's Python Teach One Slides► YouTube►

Dr. Smith
Patient	Previous MI	CHF	Shock	LOS
1	1	1	0	4
2	1	1	0	5
3	1	0	0	4
4	1	0	1	5
5	1	0	1	4
6	1	0	1	4
7	1	0	1	5
8	0	0	0	2
9	0	0	0	2
10	0	0	0	1

Dr. Jones
Patient	Previous MI	CHF	Shock	LOS
1	1	1	0	5
2	1	1	0	5
3	1	1	0	5
4	1	1	1	5
5	1	0	1	5
6	1	0	1	5
7	1	0	1	5
8	1	0	0	4
9	0	0	0	2
10	0	0	0	2
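The substitution step can be sketched in Python using the two tables above. The idea is to group one clinician's patients by type (previous MI, CHF, shock), average the LOS within each type, and then replace each of the other clinician's patients with that average. The names SMITH, JONES, mean_los_by_type, and substitute are hypothetical. A patient type that the substituting clinician never sees is returned as None rather than guessed.

```python
from collections import defaultdict
from statistics import mean

# (previous_mi, chf, shock, los) rows copied from the two tables above.
SMITH = [
    (1, 1, 0, 4), (1, 1, 0, 5), (1, 0, 0, 4), (1, 0, 1, 5), (1, 0, 1, 4),
    (1, 0, 1, 4), (1, 0, 1, 5), (0, 0, 0, 2), (0, 0, 0, 2), (0, 0, 0, 1),
]
JONES = [
    (1, 1, 0, 5), (1, 1, 0, 5), (1, 1, 0, 5), (1, 1, 1, 5), (1, 0, 1, 5),
    (1, 0, 1, 5), (1, 0, 1, 5), (1, 0, 0, 4), (0, 0, 0, 2), (0, 0, 0, 2),
]


def mean_los_by_type(patients):
    """Average LOS for each (previous_mi, chf, shock) patient type."""
    groups = defaultdict(list)
    for mi, chf, shock, los in patients:
        groups[(mi, chf, shock)].append(los)
    return {key: mean(values) for key, values in groups.items()}


def substitute(case_mix, provider):
    """Replace each LOS in case_mix with provider's average LOS for that type.

    Returns None for any patient type the provider never sees.
    """
    by_type = mean_los_by_type(provider)
    return [by_type.get((mi, chf, shock)) for mi, chf, shock, _ in case_mix]


smith_own = mean(los for *_, los in SMITH)
jones_on_smith_mix = substitute(SMITH, JONES)
smith_on_jones_mix = substitute(JONES, SMITH)

print("Smith, own patients:", smith_own)
print("Jones on Smith's case mix:", mean(jones_on_smith_mix))
print("Smith on Jones's case mix (None = no match):", smith_on_jones_mix)
```

Every one of Dr. Smith's patient types also appears among Dr. Jones's patients, so that direction can be averaged directly. The reverse direction contains one None, which cannot be averaged until an estimate is supplied for it.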

Question 7: In the data presented in Question 6, what is the expected outcome if Dr. Smith sees the patients of Dr. Jones? Note that Dr. Smith does not see any patient like patient 4 of Dr. Jones. We need to estimate a synthetic control for this patient. To do so, filter the data for patients of Dr. Smith (this is already done since the data of Dr. Smith are presented separately). Regress length of stay on previous MI, CHF, and shock. You learned about regression in the first part of this course. Evaluate the regression equation at values corresponding to the condition of patient 4 of Dr. Jones. Use the regression prediction of length of stay to create a synthetic patient for Dr. Smith and calculate the expected outcome for Dr. Smith seeing patients of Dr. Jones. Make sure that you submit an Excel sheet with formulas for all calculated values.

Answer►
SQL►
Saneela Lakkakula's Python Teach One Slides► YouTube►
Khanal's Python Teach One Slides► YouTube►
Sai Keerthi's Python Teach One Slides► YouTube►
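The regression step in Question 7 can be sketched in plain Python as follows. The sketch fits ordinary least squares on Dr. Smith's ten patients by solving the normal equations, then evaluates the fitted equation for a patient with previous MI, CHF, and shock all present, which is the type Dr. Smith never sees. The function names solve, ols, and predict are hypothetical, and a statistics package or Excel's regression tool would serve equally well.

```python
# Dr. Smith's patients from Question 6: (previous_mi, chf, shock) and LOS.
SMITH_X = [
    (1, 1, 0), (1, 1, 0), (1, 0, 0), (1, 0, 1), (1, 0, 1),
    (1, 0, 1), (1, 0, 1), (0, 0, 0), (0, 0, 0), (0, 0, 0),
]
SMITH_LOS = [4, 5, 4, 5, 4, 4, 5, 2, 2, 1]


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: move the largest remaining entry into the pivot row.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(rows, y):
    """Least-squares coefficients: intercept first, then one per predictor."""
    x = [[1.0] + list(r) for r in rows]
    k = len(x[0])
    xtx = [[sum(xi[i] * xi[j] for xi in x) for j in range(k)] for i in range(k)]
    xty = [sum(xi[i] * yi for xi, yi in zip(x, y)) for i in range(k)]
    return solve(xtx, xty)


def predict(coefs, row):
    """Evaluate the fitted equation for one patient type."""
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], row))


coefs = ols(SMITH_X, SMITH_LOS)
synthetic_los = predict(coefs, (1, 1, 1))

print("intercept, MI, CHF, shock:", coefs)
print("predicted LOS for MI=1, CHF=1, shock=1:", synthetic_los)
```

The predicted value plays the role of the synthetic patient: it stands in for the missing patient type when averaging Dr. Smith's outcomes over Dr. Jones's patients.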

More

Practice profiling PubMed►
Importance of risk adjustment in measuring performance in primary care PubMed►

Prepared by Farrokh Alemi, Ph.D. This page is part of the course on Statistical Process Improvement
