Lecture: Mediation Analysis  



  • Create a counterfactual model of the data
  • Estimate the mediation effect in the real and counterfactual models of the universe

Assigned Reading

  • Tutorial on using regression for network construction Read► (Use instructor's last name as password)
  • Read about "Graphical Representation of Counterfactuals" in "Causal Inference in Statistics" pages 101-107
  • Overview of calculation of mediated effect of Fever through Chills on diagnosis of COVID-19 Read►
  • Path analysis of mediation coefficient Excel►
  • Mediation Analysis in 4 Steps:
    1. Learn the order of the variables. In this case, the variables occur in the following order:
      1. Covariate C1,
      2. Covariate C2,
      3. Exposure X,
      4. Mediator M, and
      5. Outcome Y.
    2. Learn network through chain of regressions Slides►
    3. Estimate un-confounded impact of variable on outcome Slides►
    4. Re-estimate the un-confounded impact of variable on outcome.  Estimate mediation impact. Slides►
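
The chain of regressions in step 2 can be sketched as follows. This is a minimal illustration, not the instructor's code: the helper names `chain_of_regressions` and `prune_arcs` are hypothetical, the R-square and coefficient values are invented, and the thresholds are the ones stated in Question 1 below. The sketch assumes the regressions have already been fitted elsewhere and only shows how the ordering and the thresholds determine which arcs remain.

```python
# Hypothetical helpers; the fitted values below are invented for illustration
# and are not results from the course data.

def chain_of_regressions(order):
    """Pair each variable with all variables that precede it in the order."""
    return [(order[i], order[:i]) for i in range(1, len(order))]


def prune_arcs(fits, min_r2=0.10, min_abs_coef=0.05):
    """Keep arcs from regressions that pass both thresholds.

    fits maps response -> (r_square, {predictor: coefficient}).
    Returns a sorted list of (predictor, response) arcs.
    """
    arcs = []
    for response, (r2, coefs) in fits.items():
        if r2 < min_r2:
            continue  # drop the whole equation: it explains too little variation
        for predictor, coef in coefs.items():
            if abs(coef) > min_abs_coef:  # drop coefficients with |coef| <= 0.05
                arcs.append((predictor, response))
    return sorted(arcs)


# Order from the four-step outline: covariates, exposure, mediator, outcome.
order = ["C1", "C2", "X", "M", "Y"]
plan = chain_of_regressions(order)

# Assumed regression results: response -> (R-square, coefficients).
fits = {
    "X": (0.04, {"C1": 0.30, "C2": 0.20}),
    "M": (0.25, {"C1": 0.02, "C2": 0.00, "X": 0.80}),
    "Y": (0.40, {"C1": 0.10, "C2": -0.03, "X": 0.50, "M": 0.60}),
}
arcs = prune_arcs(fits)
print(arcs)
```

Because every response is regressed only on variables that occur before it, every surviving arc points forward in the order, so the resulting network cannot contain a cycle.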


Question 1: Use LASSO regressions to create the network of symptoms and COVID-19 diagnosis. Remove equations that explain less than 10% of the variation in the response variable. Remove coefficients whose absolute value is less than or equal to 0.05. Remove cycles; none should exist if you always regress each response variable on independent variables that occur prior to it. From the network, calculate the following:

  1. What is the order of occurrences of the symptoms, age, gender, and results of COVID-19 laboratory tests? 
  2. What are the direct predictors of COVID-19 laboratory test results?  Assume the following order for the variables: D1: Age, D2: Female, X1: Shivering, X2: Fatigue, X3: Loss of taste, X4: Fever, X5: Headaches, X6: Loss of smell, X7: Chills, X8: Muscle aches, X9: Diarrhea, X10: Cough, X11: Shortness of breath, X12: Runny nose, X13: Sore throat, X14: Loss of balance, X15: Vomiting, X16: Joint pain, X17: Loss of appetite, X18: Wheezing, X19: Difficulty breathing, X20: Excessive sweating, Y: COVID-19 Test Results.
  3. What is the best network that fits the data? Establish the structure of the network, ignoring regressions that explain less than 10% of the variation in the response variable and ignoring variables whose coefficients have an absolute value less than or equal to 0.05.
  4. Estimate the parameters of the network from repeated LASSO regressions.  Report the joint probability of COVID-19 positive test results, if we do not know which symptoms were present.  
  5. What are the parents in the Markov blanket of Fever?
    • Use regressions to identify these parents in Markov Blanket of Fever
    • Use the network to read parents in Markov Blanket of Fever
  6. What is the un-confounded effect of Fever on the probability of a positive COVID-19 diagnosis?
    • Use inverse propensity weights to remove confounding
    • Switch the distribution of direct predictors of Fever so that patients with and without Fever have the same distribution of direct predictors
  7. What are the parents in the Markov blanket of Chills?
    • Use the network to identify the parents in the Markov blanket of Chills
    • Use regressions to identify parents in Markov blanket of Chills
  8. LASSO regress Chills on its direct predictors, not including Fever.  Report intercept, coefficients, and McFadden R-square. 
  9. Revise the network to create a counterfactual network in which Fever is not mediated by Chills (no arc from Fever to Chills)
  10. What is the mediated impact of Fever on COVID-19 diagnosis through Chills?
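
The inverse propensity weighting asked for in item 6 can be illustrated with a small sketch. It assumes, for simplicity, that Fever has a single binary direct predictor (called `parent` here, a hypothetical stand-in) and that the propensity of Fever is estimated from stratum frequencies rather than from a regression. All counts are invented for illustration; the function names are not from the course materials.

```python
# Each row: (parent, fever, number of patients, number testing positive).
# "parent" stands for a single hypothetical direct predictor of Fever.
# All counts are invented for illustration.
rows = [
    (0, 1, 20, 10),
    (0, 0, 80, 16),
    (1, 1, 60, 48),
    (1, 0, 40, 20),
]


def propensity_by_stratum(rows):
    """P(Fever = 1 | parent), estimated from stratum frequencies."""
    total, exposed = {}, {}
    for parent, fever, n, _ in rows:
        total[parent] = total.get(parent, 0) + n
        exposed[parent] = exposed.get(parent, 0) + (n if fever == 1 else 0)
    return {p: exposed[p] / total[p] for p in total}


def ipw_risk(rows, fever_value):
    """Inverse-propensity-weighted probability of a positive test."""
    prop = propensity_by_stratum(rows)
    weighted_n = weighted_pos = 0.0
    for parent, fever, n, pos in rows:
        if fever != fever_value:
            continue
        # Probability of the Fever status actually observed in this stratum.
        p = prop[parent] if fever == 1 else 1.0 - prop[parent]
        weighted_n += n / p
        weighted_pos += pos / p
    return weighted_pos / weighted_n


# Unweighted difference in positive rates between Fever and no Fever.
crude_effect = (10 + 48) / (20 + 60) - (16 + 20) / (80 + 40)
# Weighted difference, after balancing the distribution of the parent.
ipw_effect = ipw_risk(rows, 1) - ipw_risk(rows, 0)
print(round(crude_effect, 3), round(ipw_effect, 3))
```

In this toy data the crude difference is about 0.425, while the weighted difference is about 0.30. After weighting, patients with and without Fever have the same distribution of the parent, which is the balancing described in the second bullet of item 6.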

Resources for Question 1:

  1. Data Download►
  2. Rachael King's Teach One YouTube►
  3. Yatisha Rajanala's Teach One Answers► Real Network► Counterfactual Network► Code► Netica Tables►
  4. Answers and overview of calculations Read►
  5. Analysis in the observed, "real," network, which includes a link from Fever to Chills

    Impact of Symptoms on COVID-19
    1. This network was drawn from repeated LASSO regressions of the data, ignoring regressions with R-square < 0.10 and coefficients with absolute value ≤ 0.05. Read►
    2. Netica network with associated tables Zip►
  6. Analysis in the Counterfactual network, which excludes the link between Fever and Chills:

    Counterfactual Network without Fever link to Chills
    1. Regression of Chills on prior predictors, excluding Fever as a predictor Read►
    2. Revised table for predicting Chills CSV file►
    3. Netica network with associated tables Zip►
  7. Percent of effect of Fever mediated through Chills Excel►
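
One common way to express the percent of effect mediated is sketched below. It assumes that the effect of Fever estimated in the real network is treated as the total effect, and that the effect estimated in the counterfactual network (with the arc from Fever to Chills removed) is the part that does not pass through Chills. The linked Excel file may organize the calculation differently. The function name `percent_mediated` and the two effect values are hypothetical.

```python
def percent_mediated(total_effect, direct_effect):
    """Share of the total effect that passes through the mediator, in percent."""
    if total_effect == 0:
        raise ValueError("total effect is zero; percent mediated is undefined")
    return 100.0 * (total_effect - direct_effect) / total_effect


# Invented values for illustration, not results from the course data.
real_effect = 0.30            # effect of Fever in the real network
counterfactual_effect = 0.21  # effect of Fever with no arc from Fever to Chills

share = percent_mediated(real_effect, counterfactual_effect)
print(round(share, 1))
```

With these assumed inputs, about 30% of the effect of Fever would be attributed to the path through Chills.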


For additional information (not part of the required reading), please see the following links:

  1. Pearl's direct and indirect effects Read► Web Appendix►
  2. Saeed's lecture Video►
  3. Mediation analysis allowing for exposure-mediator interactions Read►
  4. Mediation analysis through stable weights Read►
  5. Practical guide to mediation analysis through inverse odds ratio Read► Slides►
  6. Mediation analysis revisited Read►

This page is part of the course on Comparative Effectiveness by Farrokh Alemi, PhD Home► Email►