This week we discuss benchmarking and how you can compare apples to apples and oranges to oranges. Clinicians and their peer groups differ in the types of patients they see. We will show you how to use data balancing to compare them on the same types of patients. We motivate the concepts using decision trees but then move quickly to using Structured Query Language (SQL) for large data analysis.
This week is unusual for a course on process improvement. Statistical process control classes rarely talk about benchmarking, and when they do, they rarely talk about data balancing. This material is at the frontier of the field. Stay with me as we go through its theoretical basis. Concepts like distribution switches and synthetic cases may seem new and somewhat esoteric, but the end result is practical and useful.
Question 1: In the following question, use SQL to analyze the data. Assume that we have followed two clinicians, Smith and Jones, and constructed the decision trees in Figure 1. Data► Atoosa's Excel► Pooja's Video► SQL Code►
Question 2: The following data report patients in 10 Diagnosis Related Groups and 3 HCC indices cared for by a clinician and his peer group. Use SQL to determine whether the clinician is more efficient than his peer group. Data► SQL►
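To make the idea of data balancing concrete, here is a minimal sketch, not the course's solution. It assumes a hypothetical table `stays(provider, drg, los)` with made-up rows, and it uses length of stay as the efficiency measure, which is also an assumption. The SQL runs through Python's built-in `sqlite3` module. The query reweights the peer group's average within each DRG by the clinician's own case counts, so both are compared on the clinician's mix of patients.

```python
import sqlite3

# Hypothetical toy data, NOT the course data: (provider, drg, length of stay).
rows = [
    ("clinician", "DRG1", 4), ("clinician", "DRG1", 6),
    ("clinician", "DRG2", 10),
    ("peer", "DRG1", 5),
    ("peer", "DRG2", 12), ("peer", "DRG2", 14), ("peer", "DRG2", 16),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stays (provider TEXT, drg TEXT, los REAL)")
conn.executemany("INSERT INTO stays VALUES (?, ?, ?)", rows)

# Unadjusted peer average, for contrast with the balanced figure.
raw_peer_los = conn.execute(
    "SELECT AVG(los) FROM stays WHERE provider = 'peer'"
).fetchone()[0]

# Balance the peer group to the clinician's case mix:
# weight each peer stratum average by the clinician's count in that stratum.
# The inner join drops strata that only one side has; those need separate handling.
query = """
WITH clin AS (
    SELECT drg, COUNT(*) AS n, AVG(los) AS avg_los
    FROM stays WHERE provider = 'clinician' GROUP BY drg
),
peer AS (
    SELECT drg, AVG(los) AS avg_los
    FROM stays WHERE provider = 'peer' GROUP BY drg
)
SELECT
    SUM(clin.n * clin.avg_los) / SUM(clin.n) AS clinician_los,
    SUM(clin.n * peer.avg_los) / SUM(clin.n) AS peer_los_balanced
FROM clin JOIN peer ON clin.drg = peer.drg
"""
clinician_los, peer_los_balanced = conn.execute(query).fetchone()

print(f"clinician: {clinician_los:.2f}")
print(f"peer (unadjusted): {raw_peer_los:.2f}")
print(f"peer (balanced to clinician's mix): {peer_los_balanced:.2f}")
```

In this toy example the unadjusted peer average looks much worse than the balanced one because the peer group sees more DRG2 patients. The balanced comparison removes that difference in case mix. With the real data you would stratify on both DRG and HCC index.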
Question 3: The following table shows the observed and expected length of stay for 30 patients. Use a paired comparison of means to test whether the expected and observed lengths of stay are the same. Assuming a normal distribution for length of stay, use a risk-adjusted control chart to plot the data. Make sure that the control limits are derived from the expected values and that the observations are contrasted to these limits. This analysis can be done using a Tukey or an XmR chart; you need to select whichever chart produces tighter control limits. The conclusions you arrive at based on (a) the paired comparison of expected and observed length of stay and (b) the risk-adjusted control chart should be the same if in both situations we calculate the control limits from the same number of cases. Are they? Data► Answer►
Question 4: Use the procedure described for outcomes in synthetic cases to estimate the mortality rate for 80-year-old residents with walking and toileting disabilities but no other disabilities. Note that we want to rely on at least 30 cases in making this estimate. The database does not contain 30 cases with these two disabilities and 80 years of age. Therefore, we would like you to estimate the survival days using synthetic case outcomes. You can create a synthetic case from 80-year-olds who are unable to walk and residents who are unable to toilet. Alternatively, you can select a different set of residents, such as 80-year-olds who are unable to toilet and residents who are unable to walk. Also note that the data do not have headers. Use the following dictionary of variables to create a header for the data. Data► Adel's Teach One►
Question 5: The following data show the recovery from various disabilities in two nursing homes. Two sets of data are presented. The first set shows the disabilities of the patients at admission to the nursing home, using variables that start with "u", standing for "unable". The recovery from these disabilities is shown in variables that start with "r". Compare the performance of these two nursing homes using the distribution switch method. In particular, switch the distribution for age, gender, and 9 disabilities on admission. The outcome of interest is the number of disabilities recovered from (variable shown as nRecovery). Use the synthetic method to estimate the outcome for cases not present in both nursing homes. Which nursing home has better outcomes for its own residents? If residents of nursing home A were cared for at nursing home B, which nursing home would have better outcomes? What happens in the reverse case? Data►