Lecture: Remove Confounding through Propensity Scoring  


Assigned Reading


  • Propensity Scoring
    • Read chapter 13 Statistical Analysis of Electronic Health Records in Big Data in Healthcare, pages 327 to 344   
    • Slides►
    • YouTube►
  • Propensity score quintile matching
    • Read chapter 13 Statistical Analysis of Electronic Health Records in Big Data in Healthcare, pages 332 to 337
  • Propensity Score with Inverse Probability Matching
  • Measuring treatment effects Read►
  • Matching on propensity scores Read►
  • Propensity scores and time to events Read►
  • Propensity scoring of cost data


Assignments should be submitted in Blackboard.  Make the first page a summary page.  In the summary page, write statements comparing your work to the answers given or to the videos.  For example, "I got the same answers as the Teach One video for question 1."

Question 1: The following data were collected for residents in the Medical Foster Home program and in Nursing Homes.  The data are organized by quintiles of severity of illness.  Participation in the Medical Foster Home program increases with each quintile, indicating that sicker patients are more likely to participate.  We want to remove the effect of differences in the likelihood of participation from our estimate of cost differences.  Report whether residents in the Medical Foster Home have lower cost than residents in Nursing Homes with a similar likelihood of participating in the Medical Foster Home program.

Severity of Illness   Number of Residents                  Cost of Care per Day
Quintile              Medical Foster Home   Nursing Home   Medical Foster Home   Nursing Home
1                     45                    3,201          $2.71                 $87.98
2                     89                    3,156          $4.44                 $77.80
3                     168                   3,077          $11.31                $78.58
4                     514                   2,731          $31.82                $72.77
5                     1,775                 1,470          $109.62               $40.54

  • Answer in chapter 13 Statistical Analysis of Electronic Health Records in Big Data in Healthcare, page 338.  Don't do this in R; it is a lot more work than needed.  Note that the data here and the data in the book on page 335 differ in a significant way.  The data here are in quintiles of severity of illness, while the data in the book are in quintiles of propensity score.  The correct way to analyze these data is to estimate propensity weights and multiply costs by these weights to calculate the average treatment effect.
  • Answer in Excel Image►

Question 2: Using the data below, what is the inverse propensity weight (i.e., one over the conditional probability of participating in the Medical Foster Home) for the 45 patients who fall in quintile 1?

Severity of Illness   Number of Residents                  Cost of Care per Day
Quintile              Medical Foster Home   Nursing Home   Medical Foster Home   Nursing Home
1                     45                    3,201          $2.71                 $87.98
2                     89                    3,156          $4.44                 $77.80
3                     168                   3,077          $11.31                $78.58
4                     514                   2,731          $31.82                $72.77
5                     1,775                 1,470          $109.62               $40.54
  1. 0.014 
  2. 0.986
  3. 71.43
  4.  Cannot be determined
  5.  1.01
  6.  None of these  

Answer in chapter 13 Statistical Analysis of Electronic Health Records in Big Data in Healthcare, pages 337 to 338
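
As a rough sketch of the calculation the question describes, the propensity within a stratum can be estimated as the share of that stratum's residents who are in the Medical Foster Home, and the inverse propensity weight is its reciprocal.  The function name inverse_propensity_weight is an illustrative assumption, not something taken from the chapter.  The example uses the quintile 5 row of the table rather than the row the question asks about.

```python
def inverse_propensity_weight(n_treated: int, n_untreated: int) -> float:
    """Return 1 / p(treated | stratum), estimating p from the stratum counts."""
    propensity = n_treated / (n_treated + n_untreated)
    return 1.0 / propensity


# Quintile 5 of the table: 1,775 Medical Foster Home and 1,470 Nursing Home residents.
weight_q5 = inverse_propensity_weight(1775, 1470)
print(round(weight_q5, 3))  # prints 1.828
```

If the propensity is rounded before it is inverted, the weight can shift noticeably when the propensity is small, so the rounding step matters when comparing against a list of answer choices.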

Question 3: The following data provide the survival among cancer patients.  The data include 35 common comorbidities for patients who have or do not have stomach cancer.  Use both logistic and ordinary regression to analyze these data and report the differences in the findings.  In particular:

  1. Using logistic regression, calculate the propensity to have cancer. 
  2. Group the diagnoses using SQL.  Within the naturally occurring groups of diagnoses, calculate probability of cancer.  Calculate the logit of the probability.  Regress the logit function on the diagnoses using ordinary regression. SQL►
  3. Report the coefficients for the comorbidities of stomach cancer.  How do these coefficients change across the two methods?
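
For reference, the ordinary regression in step 2 can be written as follows.  The notation is an assumption made here for illustration: g indexes a group of patients who share the same combination of diagnoses, p_g is the observed proportion with cancer in that group, and x_gj indicates whether comorbidity j is present in group g.

```latex
\operatorname{logit}(p_g) = \ln\!\left(\frac{p_g}{1 - p_g}\right)
  = \beta_0 + \sum_{j=1}^{35} \beta_j \, x_{gj} + \varepsilon_g
```

The logit is undefined when p_g equals 0 or 1, so groups in which all or none of the patients have cancer need to be handled before the regression is run.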

Question 4:  The objective of this analysis is to estimate response to antidepressants.  You can select one of the antidepressants.  These data come from the STAR*D experiment conducted by the National Institute of Mental Health.  The data are reported bi-weekly or monthly.  There are 22,254 records for about 4,000 patients.  Organize the data so there is one row for each patient.

Focus: The enclosed data report on bupropion.  Please focus the analysis on only one of the antidepressants or a combination of two antidepressants taken simultaneously.    For the time being ignore the dose of the medication. 

Exclusions: Patients who did not receive bupropion are assumed to have received the alternative antidepressants.  The unit of the analysis is the antidepressant trial and not necessarily the unique person.  So the ID that should be used is the combination of patient ID and Concat_Levels.

Treatment: If the patient has taken the antidepressant at any time during the study period, then mark it as 1, otherwise 0. Notice that some patients have taken the medication and others have not.  Within the combination of ID and Concat_levels look for any occasion of use of bupropion.   

Covariates: For the covariates, include gender, risk of suicide, heart, vascular, haematopoietic, eyes ears nose throat larynx, gastrointestinal, renal, genitourinary, musculoskeletal/integument, neurological, psychiatric illness, respiratory, liver, endocrine, alcohol, amphetamine, cannabis use, opioid use, panic, specific phobia, social phobia, OCD, PTSD, anxiety, borderline personality, dependent personality, antisocial personality, paranoid personality, personality disorder, anorexia, bulimia, and cocaine use.  If the covariate is ever present, assume that it is present.  Exclude covariates that are not present for any of the patients.

Outcome: The medication is considered to have caused remission if, while on the medication, the patient is discharged to the follow-up portion of the study; in that case "Treatment_plan_equal_3" is set to 1.  Use "Treatment_Plan_Equal_3" and not the "Remission" variable as the indication of effectiveness of the antidepressant, since the remission variable does not indicate that the clinician agreed that the patient's symptoms were well managed.

Balance the data to remove the effects of covariates.  Show visually that you have successfully balanced the data.  Use the following steps to accomplish this:

  1. Calculate Propensity Score: Calculate the propensity of taking the antidepressant.  Regress taking of the antidepressant on the covariates. 
  2. Weights: Calculate inverse propensity weights
  3. Verify Balance: Verify that weighted regression removes the effects of all covariates.  Regress the antidepressants on the covariates and verify that none have a statistically significant effect on selection of the antidepressant.  Visually show that the data have been balanced. 
  4. Estimate Impact on Response: Regress response to the antidepressant on the covariates and taking the antidepressant.   Describe how well the model was balanced and how well the impact of antidepressant was estimated.
  • Data (Use instructor's last name as password) Download►
  • See also pages 338 through 342 in Chapter 13, Propensity Scoring, for example R code
  • NIMH Sequenced Treatment Alternatives to Relieve Depression (STAR*D) Study Questions and Answers►
  • Teach one by Sankeerthi Mummidsetty Read► SQL code►
  • Solutions can be obtained using different software. Answer►
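
The four steps above can be sketched in plain Python on a toy data set.  Everything in this sketch is an assumption made for illustration: the records are hypothetical (not STAR*D data), there is a single binary covariate, the propensity is estimated as the treated share within each covariate level (which, for a single binary covariate, gives the same fitted probabilities as a logistic regression on that covariate), and a weighted difference in means stands in for the weighted regressions the assignment asks for.

```python
from collections import defaultdict

# Hypothetical toy data, NOT STAR*D: (covariate x, treated t, remission y).
records = (
    [(1, 1, 1)] * 30 + [(1, 1, 0)] * 30 + [(1, 0, 1)] * 8 + [(1, 0, 0)] * 12
    + [(0, 1, 1)] * 6 + [(0, 1, 0)] * 14 + [(0, 0, 1)] * 20 + [(0, 0, 0)] * 60
)

# Step 1: propensity p(t = 1 | x), estimated as the treated share within each x.
counts = defaultdict(lambda: [0, 0])  # x -> [number treated, number total]
for x, t, _ in records:
    counts[x][0] += t
    counts[x][1] += 1
propensity = {x: treated / total for x, (treated, total) in counts.items()}


# Step 2: inverse propensity weights, 1/p for treated and 1/(1 - p) for untreated.
def weight(x, t):
    p = propensity[x]
    return 1 / p if t == 1 else 1 / (1 - p)


def group_mean(arm, column, use_weights):
    """Mean of a column (0 = x, 2 = y) in one treatment arm, optionally weighted."""
    pairs = [
        (r[column], weight(r[0], r[1]) if use_weights else 1.0)
        for r in records if r[1] == arm
    ]
    return sum(v * w for v, w in pairs) / sum(w for _, w in pairs)


# Step 3: balance check, comparing the covariate mean across the two arms.
raw_gap = group_mean(1, 0, False) - group_mean(0, 0, False)
weighted_gap = group_mean(1, 0, True) - group_mean(0, 0, True)

# Step 4: weighted difference in remission rates between the two arms.
effect = group_mean(1, 2, True) - group_mean(0, 2, True)

print(round(raw_gap, 3), round(abs(weighted_gap), 3), round(effect, 3))
```

In this toy example the covariate gap between arms shrinks to essentially zero after weighting, which is the pattern a balance plot should show for each covariate in the real data.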

Question 5: The following table provides the joint distribution of treatment and case mix severity, i.e., p(treatment, case mix severity), for the patients in a hypothetical hospital.  Calculate the propensity of participating in treatment, given that the case mix severity is low.  This is the conditional probability of treatment given low severity, i.e., p(treated | low severity).

Case Mix     Joint Probability of Events       Probability of Treatment
Severity     Untreated   Treated   Unknown     Given Case Mix
Low          0.36        0.08      0.44
Med          0.12        0.12      0.24
High         0.12        0.2       0.32
All Strata   0.6         0.4       1

For patients in low severity, calculate the inverse propensity for treatment.  Which of the following is correct?

  1. 5.5
  2. 0.08
  3. 1.69
  4. 0.44
  5. None of the above
  • Calculation of conditional probabilities from a joint distribution is explained in "Chapter 3, Introduction to Probability and Relationships" in Statistical Analysis of Electronic Health Records in Big Data in Healthcare, pages 58 to 62 and pages 66 to 68.
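
As a reminder of the relationship involved, the conditional probability of treatment in stratum s is the joint probability divided by the stratum's marginal probability, and the inverse propensity is its reciprocal.  The symbols s and w_s are notation assumed here for illustration.

```latex
p(\text{treated} \mid s)
  = \frac{p(\text{treated}, s)}{p(\text{treated}, s) + p(\text{untreated}, s)},
\qquad
w_s = \frac{1}{p(\text{treated} \mid s)}
```

For the medium stratum in the table, for instance, p(treated | med) = 0.12 / (0.12 + 0.12) = 0.5, so the inverse propensity is 2.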

Question 6: The following problem was first created by Morgan and Harding, and we have adjusted it to fit health care.  In this example, the outcome is length of stay in the hospital, the treatment is the clinician/his peer group, and the strata are a mix of medical history and demographic variables that account for the pattern of self-selection into treatment.  This mix has been divided into 3 strata: low, medium, and high risk.  Notice that treated patients are more likely to fall in the high stratum, while untreated patients are more likely to fall in the low stratum.  What is the impact of the clinician on length of stay after removing the confounding associated with this selection bias?

             Probability of Events                                            Length of Stay
Strata       Untreated   Treated   Both     Strata                         Untreated   Treated   Net Impact
Low          0.36        0.08      0.44     Low                            2           4         2
Med          0.12        0.12      0.24     Med                            6           8         2
High         0.12        0.2       0.32     High                           10          14        4
All Strata   0.6         0.4       1        Average                        6.00        8.67      2.67
                                            Probability weighted average   4.40        10.20

  1. The answer is 2.67, calculated from the difference of the average impact, (4+8+14)/3 - (2+6+10)/3 = 2.67
  2. The answer is 5.8, calculated from the difference of the expected values for treated and untreated.  The expected value for treated is 4*(0.08/0.4) + 8*(0.12/0.4) + 14*(0.20/0.4) = 10.20 and the expected value for untreated is 2*(0.36/0.6) + 6*(0.12/0.6) + 10*(0.12/0.6) = 4.40
  3. The answer is 2.64, calculated as the treatment difference in each stratum weighted by the frequency of the stratum: 2*(0.44) + 2*(0.24) + 4*(0.32) = 2.64
  4. Insufficient information is available to answer the question
  • What does ChatGPT say is the correct answer?  Phrase your question so that ChatGPT can answer it.  We want to know whether we should take the weighted average of the differences or the difference of the weighted averages.  Refine your question until you get an answer that is specific to this problem.  Provide your question and ChatGPT's answer in your submission.
  • For Morgan and Harding's discussion of this problem in a different context, see pages 15 and 16 Read►
  • See chapter 13 Statistical Analysis of Electronic Health Records in Big Data in Healthcare, formula for average treatment effect, page 338.
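
The arithmetic behind the first three answer choices can be reproduced with a short script.  This is only a sketch that recomputes the three candidate numbers from the table; it does not say which of them is the right estimate of the clinician's impact.  The variable names are illustrative assumptions.

```python
# Stratum data copied from the Question 6 table:
# p(untreated, stratum), p(treated, stratum), length of stay untreated, treated.
strata = {
    "Low": (0.36, 0.08, 2, 4),
    "Med": (0.12, 0.12, 6, 8),
    "High": (0.12, 0.20, 10, 14),
}
rows = list(strata.values())

p_untreated = sum(r[0] for r in rows)  # marginal probability of no treatment
p_treated = sum(r[1] for r in rows)    # marginal probability of treatment

# Choice 1: difference of the unweighted stratum averages.
simple_diff = sum(r[3] for r in rows) / len(rows) - sum(r[2] for r in rows) / len(rows)

# Choice 2: difference of the expected values within each treatment group.
expected_treated = sum(r[3] * r[1] / p_treated for r in rows)
expected_untreated = sum(r[2] * r[0] / p_untreated for r in rows)
group_diff = expected_treated - expected_untreated

# Choice 3: within-stratum differences weighted by the stratum frequency.
stratum_weighted_diff = sum((r[3] - r[2]) * (r[0] + r[1]) for r in rows)

print(round(simple_diff, 2), round(group_diff, 2), round(stratum_weighted_diff, 2))
# prints 2.67 5.8 2.64
```

The second calculation compares groups with different stratum mixes, while the third compares treated and untreated patients within the same stratum before averaging; that distinction is what the question asks you to reason about.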

Question 7:  The following data have been taken from nurses rounding in a facility.  The time they spent with patients has been recorded.  In addition, several characteristics of the patients have also been recorded and standardized.  Do any of the nurses have a significant impact on overall satisfaction in the unit? 

Question 8:  In a nursing home, data were collected on residents' survival and disabilities.  The data are listed in the following order: ID, age, gender (M for male, F for Female), number of assessments completed on the person, number of days followed, days since first assessment, days to last assessment, unable to eat, unable to transfer, unable to groom, unable to toilet, unable to bathe, unable to walk, unable to dress, unable to bowel, unable to urine, dead (1) or alive (0), and assessment number.  Predict from the patient's assessments (i.e. their age and disabilities at time of assessment) if the patient is likely to die and should be admitted to the hospice program. 

Question 9: What is the overlap between cases and controls and how does it affect study findings?

  1. When there is a low overlap between matched cases and controls, then study findings do not generalize to many situations.
  2. A case with extremely low weight may count for too many controls, so findings are sensitive to changes in a single case.
  3. Too much of an overlap between cases and controls is a waste of data, as it reinforces the obvious.

For more, see chapter 13 Statistical Analysis of Electronic Health Records in Big Data in Healthcare, page 343.


For additional information (not part of the required reading), please see the following links:

  1. A practical guide to propensity scoring using R Read►
  2. Guide to propensity scoring Read►

This page is part of the course on Comparative Effectiveness by Farrokh Alemi PhD Home►  Email►