HAP 786: Workshop in Health Informatics

Lecture: Connecting Language & Statistical Models

 
 

Analyze antidepressant data
Generated by ChatGPT

Overview

Objectives

  1. Analyze data using one of the methods taught in the health informatics program
  2. Review work done in prior classes

Assigned Reading

The reading material for this section of the course comes from your previous courses and depends on the method you choose to use.  In analyzing response to antidepressants, we use regression models:

  1. Make the AI advice more robust by including models for imputing missing values Slides► Narrated slides► Video► YouTube►
  2. Predicting response to antidepressants in the general population PubMed►
  3. Source Code Part 2: Adding AI Predictors, Generating Prediction, and Executing Regressions PDF►
  4. Reference Data Mapping File for AI Predictors  CSV►
  5. Divya Bhavanam's Teach One on predicting from the AI system and All of Us data YouTube►
  6. Reduce computational and memory problems in All of Us data analysis YouTube►

Assignment

Instructions for Submission of Assignments: Submissions should follow these rules:

  1. Include a statement by the project manager that the submission is correct and the assignment was done on time
  2. Submit your answers in a Jupyter Notebook Download► YouTube► Slides►
  3. Submit your answers in Canvas.

Task 1: Using your All of Us database, predict response to antidepressants among African American participants. Conduct the analysis in three steps:

  1. Describe the Population:  In this step you create Table 1 of your eventual report.  This table should describe the population.  For examples of Table 1 see PubMed.  Provide a summary of your data that includes the number of antidepressants examined, the number of individuals involved, the number of antidepressants discontinued, the number of days individuals were followed, the number of days antidepressants were continued, the number of medical conditions at baseline (the start of antidepressant use), the number of antidepressants used prior to baseline, and experience with previous antidepressants.

  2. Fit a Regression Model to the Response to One Antidepressant:  Create a model of the direct predictors of response to the antidepressant.  Include pairwise interactions of the factors that predict response.  This may result in too many independent variables.  To reduce the number of independent variables, use the SAFE procedure, where strong rules are used to exclude some independent variables.  There may be up to 100 predictors of response to the antidepressant.  Report the intercept, the predictors' coefficients, the McFadden R-square, and any interaction terms you have explored.

  3. Predict the Predictors of Response to Antidepressant: Regress each predictor of response to antidepressants on all prior variables in the All of Us conditions data.  Create a new database for this step, and make sure that the regression response variable is measured after all independent variables.  Use LASSO regression.  Adjust the hyperparameter so that about 10 predictors of the regression response variable remain.  Report the intercept.  Report the unstandardized regression coefficients.  Report the McFadden R-square and discard regressions with a low R-square.  In this task you may have to run many different regressions, so please allocate sufficient time to complete it.  This is best done by using SQL to drop irrelevant variables with the SAFE rule.  Slides► Video►
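Steps 2 and 3 both ask for a McFadden R-square and for screening out weak independent variables before fitting. The sketch below is one possible illustration, not the course's own code. It assumes a 0/1 response, predicted probabilities that come from whatever regression you fit, and predictors that have already been standardized. The screening function uses the basic strong rule from the LASSO literature (discard a predictor when its absolute correlation-type score falls below 2*lambda minus lambda_max); the SAFE procedure taught in the course slides may differ in its details. All function names, variable names, and the toy data are hypothetical.

```python
import math


def log_likelihood(y, p, eps=1e-12):
    """Bernoulli log-likelihood of 0/1 outcomes y given predicted probabilities p."""
    total = 0.0
    for yi, pi in zip(y, p):
        pi = min(max(pi, eps), 1.0 - eps)  # clip to avoid log(0)
        total += yi * math.log(pi) + (1 - yi) * math.log(1.0 - pi)
    return total


def mcfadden_r2(y, p):
    """McFadden R-square: 1 - LL(fitted model) / LL(intercept-only model)."""
    base_rate = sum(y) / len(y)
    if base_rate in (0.0, 1.0):
        raise ValueError("Response has no variation; McFadden R-square is undefined.")
    ll_null = log_likelihood(y, [base_rate] * len(y))
    ll_model = log_likelihood(y, p)
    return 1.0 - ll_model / ll_null


def strong_rule_keep(columns, y, lam):
    """Return names of predictors that survive the basic strong rule.

    columns: dict mapping a predictor name to its list of standardized values.
    y:       list of 0/1 responses.
    lam:     LASSO penalty on the scale where the loss is averaged over n rows.
    A predictor is kept when |x_j . (y - mean(y))| / n >= 2*lam - lam_max.
    """
    n = len(y)
    mean_y = sum(y) / n
    residual = [yi - mean_y for yi in y]  # residual of the intercept-only model
    score = {
        name: abs(sum(x * r for x, r in zip(col, residual))) / n
        for name, col in columns.items()
    }
    lam_max = max(score.values())  # smallest penalty that zeroes every coefficient
    return [name for name, s in score.items() if s >= 2 * lam - lam_max]


# Toy data for illustration only (not All of Us data).
response = [1, 1, 0, 0, 1, 0]
predicted = [0.9, 0.9, 0.1, 0.1, 0.9, 0.1]
candidates = {
    "prior_antidepressant": [1, 1, -1, -1, 1, -1],
    "unrelated_condition": [1, -1, 1, -1, 1, -1],
}
r2 = mcfadden_r2(response, predicted)
kept = strong_rule_keep(candidates, response, lam=0.4)
```

In a workflow like the one described in step 3, a regression whose `mcfadden_r2` is low would be discarded, and `strong_rule_keep` (or an equivalent SQL filter) would be run before each LASSO fit to shrink the set of independent variables. The strong rule is a heuristic and can occasionally drop a variable that the full LASSO would have kept, so it is worth checking the final model when the screened set is very small.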

Task 2: Write the method section of your final project report.  Write the result section of your final project report.

 

 

Farrokh Alemi, Ph.D. Most recent revision 12/06/2024.  This page is part of the course on Workshop in Health Informatics