Lecture: LASSO Regression  

 

Assigned Reading

  • Tutorial on LASSO regression: Python► R► YouTube►
  • Clusters of COVID-19 symptoms: Application of LASSO regression  Read► (use instructor's last name as password)

Assignment

Question 1: LASSO regress COVID-19 test results on COVID-19 symptoms. Identify the relative weight of each symptom, each pair of symptoms, and each triplet of symptoms. State whether clusters of symptoms are more accurate than symptoms by themselves.

  • Create 30 pairs of training and testing data.
  • Main effect model: LASSO regress COVID-19 test results on the symptoms. Run the model 30 times, once for each of the 30 pairs, and report the average AUC. Then run the model one more time, using all data as training data, to report the selected symptoms.
  • Symptom cluster model: LASSO regress COVID-19 test results on the symptoms, pairs of symptoms, and triplets of symptoms. Run the model 30 times, once for each of the 30 pairs, and report the average AUC. Then run the model one more time, using all the preprocessed data as training data, to report the selected symptoms and symptom clusters.
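
The evaluation loop in Question 1 can be sketched as follows. This is a minimal illustration, not the course's own code. It assumes that "LASSO regress" on a binary test result means L1-penalized logistic regression, it uses simulated symptoms in place of the course data, and it draws 30 random splits instead of reading the 30 provided ID subsets. The variable names, the penalty strength C, and the simulated effect sizes are all hypothetical.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Simulated stand-in for the course data: 600 patients, 8 binary symptoms.
n_patients, n_symptoms = 600, 8
X = rng.integers(0, 2, size=(n_patients, n_symptoms))
# Hypothetical truth: only the first three symptoms carry signal.
logit = -1.5 + 1.2 * X[:, 0] + 0.9 * X[:, 1] + 0.7 * X[:, 2]
y = rng.binomial(1, 1 / (1 + np.exp(-logit)))


def lasso_logistic(C=0.5):
    """L1-penalized logistic regression; smaller C means a stronger penalty."""
    return LogisticRegression(penalty="l1", solver="liblinear", C=C)


# Fit on each training set, score on the matching test set, collect the AUCs.
aucs = []
for seed in range(30):
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.3, random_state=seed, stratify=y
    )
    model = lasso_logistic().fit(X_tr, y_tr)
    aucs.append(roc_auc_score(y_te, model.predict_proba(X_te)[:, 1]))

mean_auc = float(np.mean(aucs))
print(f"Average AUC over {len(aucs)} splits: {mean_auc:.3f}")

# Final run on all data: symptoms with nonzero coefficients are "selected".
final = lasso_logistic().fit(X, y)
selected = [j for j, w in enumerate(final.coef_[0]) if w != 0]
print("Selected symptom columns:", selected)
```

The same loop applies to the symptom cluster model once the pair and triplet columns have been added to X. With the course data, the random splits would be replaced by the 30 ID subsets.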

Resources for Question 1:

  • Prepare the data by coding each symptom as 1 when present and 0 when absent. When the COVID-19 test result is missing, drop the case from the analysis. When a symptom is missing, replace it with its mode, which is almost always 0. Data►
  • 30 subsets of IDs Data Subset►
  • Functions used in Python code Python►
  • Data pre-processing Python►
  • Main effect model Python►
  • Symptom cluster model Python►
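
The preparation rules above, together with the pair and triplet columns needed for the symptom cluster model, might look like the sketch below. It is an illustration under assumptions and not the linked course code. The column names, the toy data frame, and the helper names preprocess and add_clusters are hypothetical. A cluster is coded as 1 only when every symptom in it is present.

```python
from itertools import combinations

import numpy as np
import pandas as pd


def preprocess(df, outcome="test_result"):
    """Drop cases with a missing outcome; fill missing symptoms with the column mode."""
    out = df.dropna(subset=[outcome]).copy()
    symptoms = [c for c in out.columns if c != outcome]
    for col in symptoms:
        mode = out[col].mode(dropna=True)
        fill = mode.iloc[0] if not mode.empty else 0
        out[col] = out[col].fillna(fill).astype(int)
    out[outcome] = out[outcome].astype(int)
    return out


def add_clusters(df, symptoms, max_size=3):
    """Add one 0/1 column per pair and triplet: 1 only when all members are present."""
    out = df.copy()
    for size in range(2, max_size + 1):
        for combo in combinations(symptoms, size):
            out["_x_".join(combo)] = out[list(combo)].prod(axis=1)
    return out


# Toy data: NaN marks a missing value.
raw = pd.DataFrame({
    "test_result": [1, 0, np.nan, 1, 0],
    "cough": [1, 0, 1, np.nan, 0],
    "fever": [1, 0, 0, 1, 0],
    "fatigue": [1, np.nan, 0, 0, 0],
})

clean = preprocess(raw)
wide = add_clusters(clean, ["cough", "fever", "fatigue"])
print(wide)
```

With k symptoms this adds k(k-1)/2 pair columns and k(k-1)(k-2)/6 triplet columns, so the design matrix grows quickly. That growth is one reason a penalized method such as LASSO suits the cluster model.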

Question 2: Using graphical (that is, repeated) LASSO regressions, identify the relative weight of each symptom in the diagnosis of COVID-19. First, LASSO regress COVID-19 test results on the COVID-19 symptoms. Then, repeatedly LASSO regress each COVID-19 symptom on the symptoms that occurred earlier in the same patient.

Resources for Question 2:

  • COVID-19 test results and symptoms Data►
  • Count of the number of times symptoms occur together in the same person Data►
  • Percent of times the symptom listed in the row occurs before the symptom listed in the column Data►
  • Python code for repeated LASSO regressions, in order of variables' time of occurrence Python►
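
One way to organize the repeated regressions in Question 2 is sketched below. It is a hedged illustration, not the linked course code. It assumes the temporal order of symptoms is already known (in the assignment it would be derived from the table of how often one symptom precedes another), it uses L1-penalized logistic regression for each binary response, and it runs on simulated data. The symptom names, the order, the penalty strength C, and the helper lasso_parents are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 800

# Hypothetical temporal order, earliest symptom first.
order = ["fever", "cough", "fatigue", "loss_of_smell"]

# Simulated data in which later symptoms depend on some earlier ones.
fever = rng.binomial(1, 0.3, n)
cough = rng.binomial(1, 0.15 + 0.5 * fever)
fatigue = rng.binomial(1, 0.2, n)
loss_of_smell = rng.binomial(1, 0.05 + 0.4 * cough)
test_result = rng.binomial(1, 0.1 + 0.6 * loss_of_smell + 0.2 * fever)
data = {
    "fever": fever,
    "cough": cough,
    "fatigue": fatigue,
    "loss_of_smell": loss_of_smell,
    "test_result": test_result,
}


def lasso_parents(target, candidates, data, C=0.2):
    """Regress target on candidates with an L1 penalty; keep nonzero coefficients."""
    X = np.column_stack([data[c] for c in candidates])
    model = LogisticRegression(penalty="l1", solver="liblinear", C=C)
    model.fit(X, data[target])
    return {c: float(w) for c, w in zip(candidates, model.coef_[0]) if w != 0}


# Step 1: the test result regressed on all symptoms.
edges = {"test_result": lasso_parents("test_result", order, data)}

# Step 2: each symptom regressed only on the symptoms that precede it.
for i in range(1, len(order)):
    edges[order[i]] = lasso_parents(order[i], order[:i], data)

for target, parents in edges.items():
    print(target, "<-", parents)
```

The earliest symptom has no predecessors, so it has no regression of its own. Restricting each regression to earlier symptoms ensures that every retained edge points forward in time.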


This page is part of the course on Comparative Effectiveness by Farrokh Alemi, PhD Home► Email►