Statistical Process Improvement Georgetown University

Probability Charts


In this section, we show you how to analyze data from improvement efforts that involve the frequency of events. We assume that you have collected data about a key indicator over several weeks and that you need to analyze it. We teach you how to analyze the data and provide templates that you can use to complete your tasks quickly.

The objectives of this session are:

  • Chart data using P-chart.
  • Chart data using risk-adjusted P-charts.
  • Decide if the changes you have introduced have led to real  improvements.

Why chart?

There are two reasons to construct a control chart.

  1. To discipline intuitions. Data on human judgment show that we, meaning all of us including you, have a tendency to attribute system improvements to our own effort and skill and system failures to chance events. In essence, we tend to fool ourselves. Control charts help us see through this self-deception. They help us see whether the improvement is real or we have just been lucky.

    An example may help you understand my point. In an air force, a program was put in place to motivate more accurate bombing of targets during practice runs. The pilot who most accurately bombed a target site was given a $2,000 bonus for that month. After a year of continuing the program, we found an unusual relationship. Every pilot who received a reward did worse the month after. How is this possible? Rewards are supposed to encourage and not discourage positive behavior. Why would pilots who did well before do worse now just because they received a reward? The explanation is simple. Each month, the pilot who won did so not because he/she was better than the others but because he/she was lucky. We were rewarding luck; thus, it was difficult to repeat the performance the next month. Control charts help us focus on patterns of change and go beyond a focus on luck.

    In a field like medicine, when a poor outcome occurs, the natural tendency is to think of it as poor care. But such rash judgments are misleading. In an uncertain field such as medicine, from time to time there will be unusual outcomes. If you focus on these occasional outcomes, you will be punishing good clinicians whose efforts have failed by chance. Instead, focus on patterns of good or bad outcomes. Then you know that the outcomes are associated with the skills of the people and the underlying features of the process and not just chance events. Control charts help you see if there is a pattern in the data and move you away from making judgments about quality through case by case review.

  2. To tell a story. P-charts display the change over time.  These charts tell how the system was performing before and after we changed  it. They are testimonials to the success of our improvement efforts. Telling  these types of stories helps the organization to:

    • celebrate small scale successes, an important step in  keeping the momentum for continuous improvement.
    • communicate to others not part of the cross functional  team. Such communications help pave the way for eventual organization  wide implementation of the change.

    You can of course present the data without plotting it. But without charts and plots, people may not be able to see the data. Numbers make people understand the data, but plots and charts, especially those drawn over time, make people connect a story with the data; they end up feeling what they have understood. For many people, seeing the data is believing it. When control charts are posted publicly in the organization, they prepare the organization for change. They transfer and explain the experience of one unit of the organization to other units.

Which chart is right?

Figure 1:  P-chart is best when there are multiple observations 
& outcomes are dichotomous and not rare

When tracking data over time, you have a number of options. You could use a P-chart, designed specifically to track mortality or adverse health events over time. You could use an X-bar chart, designed for tracking health status and satisfaction surveys of a group of patients over time. You could use a moving average chart to help you construct a control chart for an individual patient's data over time. This section helps you decide which of these charts is appropriate for your application. If you do not have a specific application in mind or if you wish to learn more about each of the different charts, skip this section. In the following, we ask you 4-7 questions and, based on your answers, advise you which chart is right for the application that you have in mind.

Have you collected observations over different time periods?


When analyzing adverse health outcomes, such as mortality, a useful method of analysis is the P-chart. In the P-chart we assume the following:

  • The outcome consists of dichotomous, mutually exclusive and exhaustive events. Dichotomous means that there are only two outcomes. Mutually exclusive means that these two outcomes cannot both occur. Exhaustive means that one of these two outcomes must happen. Thus a P-chart may be considered appropriate for analysis of mortality rates if we agree that there are only two outcomes of interest (alive and dead) and that it is not possible to be both alive and dead, or to be in a state other than alive or dead.
  • The outcome is measured over time to track improvements in the process.
  • The observations over time are independent, meaning that the probability of an adverse outcome for one patient does not affect the probability of an adverse outcome for another patient. This is not always true. In infectious diseases, one patient affects another. When an infection breaks out in a hospital ward, the use of a P-chart to analyze the outcomes of the process is inappropriate.
  • Risk adjustment of the P-chart requires the availability of accurate predictions of the risk faced by each patient in our sample. These predictions can be made based on a number of commercially available severity indices or based on the clinician's opinions. In either case, the quality of the analysis depends on the quality and the availability of these predictions.
  • The analysis assumes that the sample of patients examined represents the population of patients treated in the specific time period.

These assumptions are important and should be verified before  proceeding further with the use of risk adjusted P-charts. When these  assumptions are not met, alternative approaches such as bootstrapping  distributions should be used.


This section takes you through a step by step process of using a P-chart, a type of control chart, for the analysis of mortality data. We introduce the concepts behind P-charts in several steps. It is important that you take each step and complete the assignment in that step before proceeding to the next. To help you understand the concepts, the lecture focuses on data from one hospital's mortality over 8 consecutive months. Here are the data we need to analyze:

Time Period   Number of cases   Number dead
1             186               49
2             117               24
3             112               25
4             25                3
5             39                15
6             21                5
7             61                16
8             20                9
Table 1:  Mortality Data

Calculate Mortality Rates

The first step is to create an x-y plot, where the x axis is time and the y axis is the mortality rate. Calculate the mortality rate for each month by dividing the number dead by the number of cases in that month.

Time Period   Number of cases   Number dead   Observed mortality
1             186               49            0.26
2             117               24            0.21
3             112               25            0.22
4             25                3             0.12
5             39                15            0.38
6             21                5             0.24
7             61                16            0.26
8             20                9             0.45
Table 2:  Observed Mortality Rates
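As a minimal sketch, the division described above can be written in Python. The counts are copied from Table 1; the function name `observed_rates` is our own and is only illustrative.

```python
# Monthly counts copied from Table 1.
cases = [186, 117, 112, 25, 39, 21, 61, 20]
deaths = [49, 24, 25, 3, 15, 5, 16, 9]


def observed_rates(deaths, cases):
    """Return the observed mortality rate (deaths / cases) per time period."""
    return [d / n for d, n in zip(deaths, cases)]


rates = observed_rates(deaths, cases)
for period, rate in enumerate(rates, start=1):
    print(f"Period {period}: {rate:.2f}")
```

Rounded to two decimals, these rates agree with the last column of Table 2.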

Numbers are deceiving. They hide much. To understand numbers you must see them by plotting them. Figure 2 shows the data plotted against time.

Figure 2:  Observed Mortality in Eight Time Periods

What does the plot in Figure 2 tell you about unusual time periods? There are wide variations in the data. It is difficult to be sure if the apparent improvements are due to chance. To understand if these variations could be due to chance, we can plot on the same chart two limits set in such a manner that 95% or 99% of points would by mere chance fall between the lower and upper limits.

Setting limits

Figure 3 shows the steps in calculating control limits.

Figure 3:  Steps in Calculating Control Limits in P-charts

In step one, the grand average p is calculated by dividing the total number of adverse events by the total number of cases. Calculate the total number of cases and the total number of deaths across all time periods; the ratio of these two numbers is the grand average p, the average mortality rate. Note that averaging the rates of the individual time periods will not yield the same result, because the time periods have different numbers of cases. Next, calculate the standard deviation for each time period. In a binomial distribution, the standard deviation is the square root of the quantity grand average p multiplied by one minus grand average p, divided by the number of cases in that time period. For example, if the grand average p is .25 and the number of cases in the time period is 186, then the standard deviation is the square root of (.25)*(.75)/(186). Table 3 shows the calculated standard deviations for each time period:

Time Period   Number of cases   Number dead   Observed mortality   Standard deviation
1             186               49            0.26                 0.03
2             117               24            0.21                 0.05
3             112               25            0.22                 0.05
4             25                3             0.12                 0.10
5             39                15            0.38                 0.08
6             21                5             0.24                 0.11
7             61                16            0.26                 0.06
8             20                9             0.45                 0.11
Grand average p = 0.25
Table 3:  Standard Deviations for Each Time Period
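The following Python sketch illustrates the two calculations described above: the grand average p and the binomial standard deviation for a time period. The counts come from Table 1; the names `grand_p`, `mean_of_rates` and `binomial_sd` are ours. The printed values follow the formula exactly as stated in the text and need not match every rounded entry of the table.

```python
import math

# Monthly counts copied from Table 1.
cases = [186, 117, 112, 25, 39, 21, 61, 20]
deaths = [49, 24, 25, 3, 15, 5, 16, 9]

# Grand average p: total deaths divided by total cases.
grand_p = sum(deaths) / sum(cases)

# Unweighted mean of the period rates, computed only to show that it differs.
mean_of_rates = sum(d / n for d, n in zip(deaths, cases)) / len(cases)


def binomial_sd(p, n):
    """Standard deviation of a proportion: sqrt(p * (1 - p) / n)."""
    return math.sqrt(p * (1 - p) / n)


print(f"Grand average p: {grand_p:.4f}")
print(f"Mean of period rates: {mean_of_rates:.4f}")
print(f"Worked example, p = .25 and n = 186: {binomial_sd(0.25, 186):.4f}")
```

The grand average weights each time period by its number of cases, which is why it differs from the simple mean of the eight rates.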

Calculate the upper and lower control limits for each time period as the grand average p plus and minus 3 times the standard deviation. This means that you are setting the control limits so that about 99% of the data should fall within the limits. If you want limits for 90% or 95% of the data, you can use constants other than 3. The constant you use depends on the number of observations in the time period and can be read from a table of t-values for different sample sizes. Table 4 shows the lower and upper control limits:

Time Period   Number of cases   Number dead   Observed mortality   Standard deviation   LCL    UCL
1             186               49            0.26                 0.03                 0.16   0.35
2             117               24            0.21                 0.05                 0.11   0.39
3             112               25            0.22                 0.05                 0.11   0.39
4             25                3             0.12                 0.10                 0.00   0.55
5             39                15            0.38                 0.08                 0.01   0.49
6             21                5             0.24                 0.11                 0.00   0.58
7             61                16            0.26                 0.06                 0.06   0.44
8             20                9             0.45                 0.11                 0.00   0.59
Grand average p = 0.25

Table 4:  Lower (LCL) and Upper (UCL) Control Limits

Please note that negative lower control limits in time periods 4, 6 and 8 are set to zero because it is not possible to have a negative mortality rate. Also note that the upper and lower control limits change in each time period. This reflects the fact that we have a different number of cases in each time period. When we have many observations, we have more precision in our estimates and the limits become tighter and closer to the average p. When we have few observations, the limits move further away from each other.
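A short Python sketch of the limit calculation, assuming the plus and minus 3 standard deviation rule and the floor at zero described above. The function name `p_chart_limits` and the parameter `z` are our own labels, and the counts are those of Table 1. Because the sketch applies the formula without intermediate rounding, individual limits may differ somewhat from the tabled values.

```python
import math

# Monthly counts copied from Table 1.
cases = [186, 117, 112, 25, 39, 21, 61, 20]
deaths = [49, 24, 25, 3, 15, 5, 16, 9]


def p_chart_limits(deaths, cases, z=3.0):
    """Return the grand average p and a list of (LCL, UCL) pairs per period."""
    grand_p = sum(deaths) / sum(cases)
    limits = []
    for n in cases:
        sd = math.sqrt(grand_p * (1 - grand_p) / n)
        lcl = max(0.0, grand_p - z * sd)  # a negative rate is impossible
        ucl = grand_p + z * sd
        limits.append((lcl, ucl))
    return grand_p, limits


grand_p, limits = p_chart_limits(deaths, cases)
for period, (lcl, ucl) in enumerate(limits, start=1):
    print(f"Period {period}: LCL = {lcl:.2f}, UCL = {ucl:.2f}")
```

The periods with the fewest cases get the widest limits, and their lower limits are the ones reset to zero.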

Remember that we are trying to answer the question of whether there have been improvements in the process. The control limits help answer this question. If during a time period we have more mortality than can be expected from chance, then the process has deteriorated during that period. Any point above the UCL indicates a potential change for the worse in the process. Any point below the LCL indicates that mortality is lower than can be expected from chance. It suggests that the process has improved. Figure 4 shows the resulting plot:

Figure 4:  P-chart for Data in Table 1

Notice the peculiar construction of the plot, designed to attract the viewer's attention to the observed rates. The observed rates are shown as single markers connected with a line. Any marker that falls outside the limits is circled and highlighted. The control limits are shown as lines without markers. In the plot in Figure 4, all the data points are within the limits.
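The comparison of each observed rate with its limits can be sketched as follows. The values are the rounded entries of Table 4; the function name `classify` and its three labels are hypothetical choices made for this illustration.

```python
# Rounded values copied from Table 4.
observed = [0.26, 0.21, 0.22, 0.12, 0.38, 0.24, 0.26, 0.45]
lcl = [0.16, 0.11, 0.11, 0.00, 0.01, 0.00, 0.06, 0.00]
ucl = [0.35, 0.39, 0.39, 0.55, 0.49, 0.58, 0.44, 0.59]


def classify(rate, lower, upper):
    """Label one observed rate relative to its control limits."""
    if rate > upper:
        return "above UCL"  # worse than expected from chance
    if rate < lower:
        return "below LCL"  # better than expected from chance
    return "within limits"


labels = [classify(r, lo, hi) for r, lo, hi in zip(observed, lcl, ucl)]
for period, label in enumerate(labels, start=1):
    print(f"Period {period}: {label}")
```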

Risk Adjusted P-chart


P-charts were designed for monitoring the performance of  manufacturing firms. These charts assume that the input to the system is the  same at each time period. In manufacturing this makes sense. The metal needed  for making a car changes little over time. But in health care this makes no  sense. People are different. People are different in their severity of illness,  in their ability and will to recover from their illness and in their attitudes  toward heroic interventions to save their lives. These differences affect the  outcomes of care. If these differences are not accounted for, we may mistakenly  blame the process when poor outcomes were inevitable and praise the process when  good outcomes were due to the type of patients arriving at our unit.

Some institutions receive many severely ill patients. These  institutions would be unfairly judged if their outcomes are not adjusted for  their case mix before comparing them to other institutions. Similarly, in some  months of the year, there are many more severely ill patients. For example,  seasonal variations affect the severity of asthma. Holidays affect both the  frequency and the severity of trauma cases.

But an even more significant source of change in the severity of illness of our patients is our own actions. Many process changes lead to changes in the kinds of patients attracted to our unit. Consider, for example, what happens if we aggressively try to educate patients about the need to avoid C-sections. We may get a reputation for normal birth delivery and we may attract patients who have fewer pregnancy complications and wish for normal birth delivery. In the end, we have not really reduced C-sections in our unit; all we have done is attract a new kind of patient who does not need cesarean births. Nothing fundamental has changed in our processes, except for the input to the process.

Risk adjustment of control charts is one method of making sure that the observed improvements in the process are not due to changes in the kind of patients that we are attracting to our unit. To help you understand this method of analysis, suppose we have collected the data in Table 5 over 8 time periods. This table shows the patients' severity of illness (risk of mortality). If you have forgotten what severity is and how we measure it, return to the previous lecture on this issue.

Observed Mortality
           Time 1   Time 2   Time 3   Time 4   Time 5   Time 6   Time 7   Time 8
# deaths   2        3        1        3        2        2        4        2
# cases    8        9        7        7        7        7        7        8

Mortality Risks for Individual Patients, Estimated from Clinicians' Consensus
           Time 1   Time 2   Time 3   Time 4   Time 5   Time 6   Time 7   Time 8
Case 1     0.18     0.97     0.85     0.27     0.12     0.07     0.96     0.05
Case 2     0.88     0.88     0.61     0.71     0.44     0.05     0.05     0.96
Case 3     0.33     0.04     0.27     0.07     0.18     0.93     0.75     0.96
Case 4     0.29     0.29     0.28     0.74     0.67     0.24     0.04     0.14
Case 5     0.14     0.03     0.80     0.08     0.51     0.14     0.96     0.05
Case 6     0.24     0.19     0.71     0.04     0.62     0.58     0.71     0.58
Case 7     0.15     0.14     0.85     0.76     0.67     0.05     0.15     0.07
Case 8     0.04     0.74     -        -        -        -        -        0.16
Case 9     -        0.07     -        -        -        -        -        -
Table 5:  Mortality Risks of Individual Patients

The question we want to answer is whether the observed mortality rate should have been expected from the patients' severity of illness (each individual patient's risk of mortality). To answer this question, we need to calculate control limits. Risk adjusted control limits for probability charts are calculated using the steps in Figure 5.

Figure 5:  Steps in Calculation of Risk Adjusted  Control Limits for Probability Charts

The upper and lower control limits are calculated from the expected risk, Ei, the expected deviation, Di, and the Student t-distribution constant. Each of these is further defined and explained below.

Expected Mortality

The expected mortality rate for each time period is calculated  as the average of the risks of mortality of all the patients in that time  period. These calculations are shown in Table 6:

Time period 1 2 3 4 5 6 7 8
Observed mortality 0.25 0.33 0.14 0.43 0.29 0.29 0.57 0.25
Expected mortality 0.28 0.37 0.62 0.38 0.46 0.29 0.52 0.37
Table 6:  Expected  Mortality Rates
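A minimal Python sketch of this averaging is given below. The risks are transcribed column by column from Table 5, with the blank cells of that table omitted; the dictionary name `risks` and the function name `expected_mortality` are our own.

```python
# Individual mortality risks from Table 5, keyed by time period.
risks = {
    1: [0.18, 0.88, 0.33, 0.29, 0.14, 0.24, 0.15, 0.04],
    2: [0.97, 0.88, 0.04, 0.29, 0.03, 0.19, 0.14, 0.74, 0.07],
    3: [0.85, 0.61, 0.27, 0.28, 0.80, 0.71, 0.85],
    4: [0.27, 0.71, 0.07, 0.74, 0.08, 0.04, 0.76],
    5: [0.12, 0.44, 0.18, 0.67, 0.51, 0.62, 0.67],
    6: [0.07, 0.05, 0.93, 0.24, 0.14, 0.58, 0.05],
    7: [0.96, 0.05, 0.75, 0.04, 0.96, 0.71, 0.15],
    8: [0.05, 0.96, 0.96, 0.14, 0.05, 0.58, 0.07, 0.16],
}
deaths = [2, 3, 1, 3, 2, 2, 4, 2]


def expected_mortality(period_risks):
    """Average of the individual risks in one time period."""
    return sum(period_risks) / len(period_risks)


expected = [expected_mortality(risks[t]) for t in sorted(risks)]
observed = [d / len(risks[t]) for d, t in zip(deaths, sorted(risks))]
for t, (obs, exp) in enumerate(zip(observed, expected), start=1):
    print(f"Time {t}: observed = {obs:.2f}, expected = {exp:.2f}")
```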


Expected Deviation

Before we construct control limits for the expected mortality,  we need to measure the variation in these values.  The variation is  measured by a statistic that we call expected deviation.  It is calculated  in four steps:

  1. The risk of each patient is multiplied by one minus the risk of the same patient.
  2. The multiplied numbers are added for all patients in the same time period.
  3. The square root of the sum is taken.
  4. The expected deviation is the square root of the sum divided by the number of cases.

Figure 6:  Calculation of Expected Deviation

Figure 6 shows the calculation of the expected deviation for the first time period. The same calculation should be carried out for each time period, resulting in the data in Table 7:

Time period 1 2 3 4 5 6 7 8
Observed mortality 0.25 0.33 0.14 0.43 0.29 0.29 0.57 0.25
Expected mortality 0.28 0.37 0.62 0.38 0.46 0.29 0.52 0.37
Expected deviation 0.13 0.11 0.16 0.14 0.17 0.13 0.12 0.11

Table 7: Expected  Deviation for All Time Periods
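The four steps for the expected deviation can be sketched in Python as shown here. The risks are those of time periods 1 and 3 in Table 5; the function name `expected_deviation` is ours and the code is only an illustration of the steps as listed.

```python
import math

# Individual risks for time periods 1 and 3, copied from Table 5.
risks_time_1 = [0.18, 0.88, 0.33, 0.29, 0.14, 0.24, 0.15, 0.04]
risks_time_3 = [0.85, 0.61, 0.27, 0.28, 0.80, 0.71, 0.85]


def expected_deviation(period_risks):
    """Expected deviation for one time period, following the four steps."""
    # Steps 1 and 2: multiply each risk by one minus itself, then add up.
    total = sum(p * (1 - p) for p in period_risks)
    # Steps 3 and 4: take the square root and divide by the number of cases.
    return math.sqrt(total) / len(period_risks)


d1 = expected_deviation(risks_time_1)
d3 = expected_deviation(risks_time_3)
print(f"Time 1: {d1:.2f}")
print(f"Time 3: {d3:.2f}")
```

Applied to every column of Table 5, the same function gives the expected deviation row of Table 7.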


To calculate the control limits, we need the t-value that would make sure that 95% or 99% of the data fall within the control limits. T-values depend on the sample size and can be read from a table of t-values for different sample sizes.

Table 8 summarizes the estimated t-values for all time periods:

Time period 1 2 3 4 5 6 7 8
Number of cases 8 9 7 7 7 7 7 8
Observed mortality 0.25 0.33 0.14 0.43 0.29 0.29 0.57 0.25
Expected mortality 0.28 0.37 0.62 0.38 0.46 0.29 0.52 0.37
Expected deviation 0.13 0.11 0.16 0.14 0.17 0.13 0.12 0.11
T-value 2.37 2.31 2.45 2.45 2.45 2.45 2.45 2.37

Table 8:  Estimation of Student t-Values
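The lookup of t-values can be sketched as a small table in Python. This sketch assumes, as our own reading of Table 8, that the degrees of freedom equal the number of cases minus one and that the values are two-sided 95% critical values; the constants are standard three-decimal entries from a Student t table, and the names `T_95` and `t_value` are hypothetical.

```python
# Two-sided 95% critical values of Student's t, keyed by degrees of freedom.
# Only the degrees of freedom needed for this example are listed.
T_95 = {6: 2.447, 7: 2.365, 8: 2.306}

# Number of cases per time period, copied from Table 8.
cases = [8, 9, 7, 7, 7, 7, 7, 8]


def t_value(n_cases):
    """Look up the t-value, assuming degrees of freedom = n_cases - 1."""
    return T_95[n_cases - 1]


t_values = [t_value(n) for n in cases]
for period, t in enumerate(t_values, start=1):
    print(f"Time {period}: t = {t:.3f}")
```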


Plotting a Risk  Adjusted P-Chart

We are now ready to calculate the control limits and plot the chart. The upper and lower control limits are calculated from the expected mortality and expected deviation so that 95% of the data would fall within these limits (i.e. we use a t-value appropriate for 95% confidence intervals):

Time period 1 2 3 4 5 6 7 8
Number of cases 8 9 7 7 7 7 7 8
Observed mortality 0.25 0.33 0.14 0.43 0.29 0.29 0.57 0.25
Expected mortality 0.28 0.37 0.62 0.38 0.46 0.29 0.52 0.37
Expected deviation 0.13 0.11 0.16 0.14 0.17 0.13 0.12 0.11
T-value 2.37 2.31 2.45 2.45 2.45 2.45 2.45 2.37
LCL -0.03 0.12 0.23 0.04 0.04 -0.03 0.23 0.11
UCL 0.59 0.62 1.01 0.72 0.88 0.61 0.81 0.63

Table 9:  Calculation of Upper and Lower Control Limits

Since negative probabilities do not make sense, we re-set the  negative numbers to 0.  In a risk adjusted p-chart, we plot the observed  rate against the control limits derived from expected values.  Figure 7  shows the resulting chart:

Figure 7:  Risk Adjusted P-Chart for Data in Table 5

One of the data points in Figure 7 falls outside the control limits. We have drawn a circle around this data point to attract attention to it. Points above the upper control limit show time periods when outcomes have been worse than expected from the patients' risks. Points below the lower control limit show time periods when outcomes have been better than expected. In time period 3, the mortality rate was lower than expected.
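As a rough sketch, the risk adjusted limits can be reproduced from the rounded entries of Table 9: the limit is the expected mortality plus or minus the t-value times the expected deviation, with negative lower limits reset to zero. The variable and function names below are our own.

```python
# Rounded values copied from Table 9.
observed = [0.25, 0.33, 0.14, 0.43, 0.29, 0.29, 0.57, 0.25]
expected = [0.28, 0.37, 0.62, 0.38, 0.46, 0.29, 0.52, 0.37]
deviation = [0.13, 0.11, 0.16, 0.14, 0.17, 0.13, 0.12, 0.11]
t_values = [2.37, 2.31, 2.45, 2.45, 2.45, 2.45, 2.45, 2.37]


def risk_adjusted_limits(expected, deviation, t_values):
    """Return lists of lower and upper control limits, one per time period."""
    lcl = [max(0.0, e - t * d) for e, d, t in zip(expected, deviation, t_values)]
    ucl = [e + t * d for e, d, t in zip(expected, deviation, t_values)]
    return lcl, ucl


lcl, ucl = risk_adjusted_limits(expected, deviation, t_values)

# Collect the time periods whose observed rate falls outside the limits.
flagged = []
for period, (obs, lo, hi) in enumerate(zip(observed, lcl, ucl), start=1):
    if obs > hi:
        flagged.append((period, "above UCL"))
    elif obs < lo:
        flagged.append((period, "below LCL"))

print(flagged)
```

With these inputs only time period 3 is flagged, and it falls below the lower control limit.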





Analyze Data



Advanced learners like you often need different ways of understanding a topic. Reading is just one way of understanding. Another way is through analyzing data. The following questions are designed to get you to think more about the concepts taught in this session.

1.  Assume that the following data were obtained about the number of falls in a nursing home facility.


[Table: time period and number of falls; cell values missing]

Fall risks in each time period for each case
Case   Time 1   Time 2   Time 3   Time 4   Time 5   Time 6   Time 7   Time 8   Time 9
1      0.25     0.55     0.40     0.15     0.55     0.75     0.20     0.35     0.40
2      0.40     0.25     0.70     0.45     0.60     0.45     0.15     0.80     0.50
3      0.70     0.40     0.60     0.70     0.45     0.05     0.10     0.50     0.25
4      0.40     0.45     0.55     0.80     0.50     0.90     0.25     0.55     0.70
5      0.15     0.20     0.70     0.45     0.65     0.50     0.60     0.75     0.40
6      0.20     0.65     0.60     0.60     0.65     0.60     0.70     0.35     0.55
7      0.50     0.10     0.55     0.25     0.25     0.70     0.40     0.60     0.30
8      0.50     0.50     0.30     0.10     0.35     0.35     0.35     0.45     0.75
9      0.30     0.75     0.65     0.80     0.60     0.65     0.50     0.30     0.20
10     0.20     0.35     0.60     0.40     0.40     0.40     0.75     0.65     0.60
11     0.40     0.65     0.05     0.25     0.35     0.60     0.65     0.75     0.55
12     0.30     0.20     0.25     0.65     0.10     0.25     0.70     0.40     0.60
13     0.45     0.65     0.45     0.80     0.40     0.75     0.55     0.45     0.65
14     0.25     0.30     0.65     0.25     0.50     0.30     0.65     0.55     0.75
15     0.25     0.25     0.70     0.60     0.25     0.25     0.70     0.35     0.60
16     0.40     0.45     0.60     0.80     0.65     0.40     0.35     0.75     0.75
17     0.45     0.30     0.25     0.85     0.25     0.75     0.65     0.60     0.45
18     0.35     0.50     0.75     0.45     0.45     0.75     0.40     0.25     0.45
19     0.25     0.75     -        0.50     0.70     0.55     0.70     0.50     -
20     0.10     0.60     -        0.20     0.60     0.70     -        0.65     -
21     -        -        -        0.45     -        -        -        -        -

Produce a control chart.  Make sure that your control chart does not have any of the following typical errors:

  1. The chart includes un-named labels such as "Series 1" and "Series 2."
  2. The markers in the control line were not removed.
  3. The X-axis does not have a title
  4. The Y-axis does not have a title
  5. Colors used in the chart and in the cell values do not help in understanding the work.
  6. Except for the data, all cell values should be calculated as a formula.  If this is not the case, it is very important that you point this out.  You should be able to change a data value and all calculations should change automatically.

Email your instructor. Attach your Excel file. In the subject line include the course number and your name. For example, the subject line could be: "Joe Smith from HAP 586 analysis of data in lecture on P-chart". Please submit one file. Please note that all cell values must be calculated using a formula from the data. Do not enter values in any calculated cells. Calculate each cell using Excel formulas. Make sure that the legend, the X-axis and the Y-axis are appropriately labeled in the chart. Keep a copy of all assignments until the end of the semester.



This page is part of the course on quality, the lecture on probability charts.  This presentation was based on Alemi F, Rom W,  Eisenstein E. Risk adjusted control charts for health care assessment.  Annals of Operations Research, 1996. Created on Tuesday, September 17, 1996.  Most recent revision 01/15/17.