# Supplement to Chapter on Propensity Scoring

## Assignment

Question 1: The following data provide the length of stay of patients seen by Dr. Smith (Variable Dr Smith=1) and his peer group (variable Dr. Smith = 0).  Answer following questions:

1. Balance the data by propensity to seek care from Dr. Smith.  Graphically show that the weighting procedure results in same number of patients treated by Dr. Smith or his peer.
2. Report the unconfounded impact of Dr. Smith on length of stay.  Data► Kanfer's Teach One► Solution►  Kanfer and Lavanya's Answer►

Question 2: The following data provide the survival among cancer patients.  The data provides 35 common comorbidities for patients who have or don't have stomach cancer. Use both logistic and ordinary regression to analyze these data and report the difference of the findings, in particular:

1. Using logistic regression, calculate the propensity to have cancer.
2. Group the diagnoses using SQL.  Within the naturally occurring groups of diagnoses, calculate probability of cancer.  Calculate the logit of the probability.  Regress the logit function on the diagnoses using ordinary regression. SQL►

Report how the coefficients for the comorbidities of stomach cancer.  How do these coefficients change across the two methods?   Data► Answer by Shukri►

Question 3:  The objective of this analysis is to find response to antidepressants.  You can select one of the antidepressants.

1. These data come from STAR*D experiment conducted by National Institute of Medicine. Read about the study protocol. Protocol►
3. The data are report bi-weekly or monthly.  There are 22,254 records for about 4,000 patients. Organize the data so there is one row for each patient.    SQL►
• Focus: The enclosed data report on citalopram, bupropion, mirzapine, buspirone, lithium, nortriptyline, sertraline, thyroid, tranylclypromine, and venlafaxine.  Please focus the analysis on only one of the antidepressants or a combination of two antidepressants taken simultaneously.    For the time being ignore the dose of the medication.
• Exclusions: Patients who did not receive bupropion are assumed to have received the alternative antidepressant.  The unit of the analysis is antidepressant trials and not necessary unique person.  So the ID that should be used is the combination of patient ID and Concat_Levels.
• Treatment: If the patient has taken the antidepressant at any time during the study period, then mark it as 1, otherwise 0. Notice that some patients have taken the medication and others have not.  Within the combination of ID and Concat_levels look for any occasion of use of bupropion.
• Covariates: For the covariates, include gender, risk of suicide, heart, vascular, haematopoietic, eyes ears nose throat larynx, gastrointestinal, renal, genitourinary, musculoskeletal Integument, neurological, psychiatric illness, respiratory, liver, endocrine, alcohol, amphetamine, cannibis use, opioid use, panic, specific phobia, social phobia, OCD, PTSD, anxiety, borderline personality, dependent personality, antisocial personality, paranoid personality, personality disorder, anorexia, bulimia, and cocaine use.  If the covariate is ever present assume that it is present. Exclude covariates that are not present for any of the patients.  Combine covariates that occur occasionally.
• Outcome: The medication is considered to have caused the remission, if while on the medication, the patient is discharged to follow-up portion of the study, then "Treatment_plan_equal_3" is set to 1.  Use "Treatment_Plan_Equal_3" and not "Remission" variable as an indication of effectiveness of the antidepressant, since the remission variable does not indicate that the clinician was in agreement that the patients symptoms are well managed.
4. Balance the data to remove the effects of covariates.  Show visually that you have successfully balanced the data.  Use the following steps to accomplish this:
• Calculate Propensity Score: Calculate the propensity of taking the antidepressant.  Regress taking of the antidepressant on the covariates.
• Weights: Calculate inverse propensity weights
• Verify Balance: Verify that weighted regression removes the effects of all covariates.  Regress the antidepressants on the covariates and verify that none have a statistically significant effect on selection of the antidepressant.  Visually show that the data have been balanced.
• Estimate Impact on Response: Regress response to the antidepressant on the covariates and taking the antidepressant.
5. Describe how well the model was balanced and how well the impact of antidepressant was estimated.

Solutions can be obtained using different software.  Answer►

Question 4: The following problem was first created by Morgan and Harding and we have adjusted it to fit within health care. In this example, the outcome are length of stay in the hospital, the treatment is the clinician/his peer group and the strata are a mix of medical history and demographic variables that account for the pattern of self-selection into treatment.  This mix have been divided into 3 strata: low, medium and high risk. What is the impact of clinician on length of stay, after removing confounding associated with severity of the patients' illness?

 Strata Probability Total Strata Length of Stay Net Impact Untreated Treated Untreated Treated Low 0.36 0.08 0.44 Low 2 4 2 Med 0.12 0.12 0.24 Med 6 8 2 High 0.12 0.20 0.32 High 10 14 4 Total 0.60 0.40 1