COVID-19 Symptom Screening

# Lecture: At Home Diagnosis of COVID-19

## Assignment

For this assignment you can use any statistical software. The data in this assignment has changed from the Teach One documents previously posted. In addition, you are asked to report specific parts of your findings.

Question 1:  Classify COVID-19 based on its symptoms.

• Describe the order of occurrence of the variables
• Assume that age and gender occur at birth. Assume that vaccination information occurs before the onset of symptoms. Assume that home tests occur after the onset of symptoms. Assume that the laboratory PCR test occurs after the home test.
• Establish the order in which symptoms occur
• For each pair of symptoms, count the number of times one symptom occurs before the other. Use column AK in the database to identify whether one symptom has occurred before another.
• Use the pairwise count of one symptom occurring before another to establish a sequence of occurrence of symptoms.
• Create a table reporting, for each variable, which other variables precede it.

Table 1: Portion of the table showing the number of times the row variable occurs before the column variable (in parentheses: the number of times the pair of symptoms occurs)
(Gray cells indicate factors that do not occur before the column variable for the majority of patients)

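| | Swelling | Loss of Appetite | Chest Pain | Chills | Cough | … |
|---|---|---|---|---|---|---|
| Swelling | NaN | 0 (3) | 1 (3) | 0 (4) | 0 (7) | … |
| Loss of Appetite | 0 (3) | NaN | 2 (8) | 1 (14) | 1 (21) | … |
| Chest Pain | 1 (3) | 1 (8) | NaN | 2 (8) | 2 (10) | … |
| Chills | 0 (4) | 2 (14) | 4 (8) | NaN | 5 (28) | … |
| Cough | 2 (7) | 5 (21) | 3 (10) | 3 (28) | NaN | … |
| … | … | … | … | … | … | … |
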
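The pairwise counts reported in Table 1 could be produced along the following lines. This is only a sketch: the layout of column AK is not described here, so the code assumes a hypothetical format in which each patient is a dictionary mapping a symptom name to its onset day. The rule in `precedes` (one symptom leads the other in more patients than the reverse) is likewise an assumed way of turning counts into an order, and the four patients are made-up illustrations, not course data.

```python
from collections import defaultdict
from itertools import permutations


def pairwise_order_counts(patients):
    """Count, for every ordered pair (a, b), how often a starts before b.

    patients: list of dicts mapping symptom name -> onset day
    (hypothetical format; adapt to the actual database layout).
    """
    before = defaultdict(int)    # (a, b) -> times a started strictly before b
    together = defaultdict(int)  # (a, b) -> times a and b were both reported
    for onset in patients:
        for a, b in permutations(onset, 2):
            together[(a, b)] += 1
            if onset[a] < onset[b]:
                before[(a, b)] += 1
    return before, together


def precedes(a, b, before):
    """Assumed rule: a is placed ahead of b if it leads more often than b does."""
    return before[(a, b)] > before[(b, a)]


# Made-up example records, for illustration only.
patients = [
    {"Cough": 1, "Chills": 2, "Chest Pain": 4},
    {"Chills": 1, "Cough": 1},          # same day: counted as a pair, no order
    {"Cough": 2, "Chills": 3},
    {"Chest Pain": 1, "Chills": 2},
]
before, together = pairwise_order_counts(patients)

# Same "count (pairs)" layout as a cell of Table 1.
print(f'{before[("Cough", "Chills")]} ({together[("Cough", "Chills")]})')  # prints: 2 (3)
```

Symptoms that start on the same day add to the pair count but not to either order count, which is one way the number in parentheses can exceed the sum of the two directional counts.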
• Create a Causal Network for clusters of symptoms of COVID-19
• Create the structure of the network:
• Using logistic LASSO, regress the PCR test results on all variables, and on pairwise or triple clusters of variables, that precede it.
• List the variables that are direct predictors of PCR test results. This list should include the non-zero logistic regression coefficients, including the coefficients for pairs or triples of variables.
• Report the percent of variation explained by the LASSO regression of PCR tests on independent variables. Calculate and report the McFadden Pseudo R-Square.
• Using LASSO, regress each variable that is a direct predictor of PCR test results on all preceding variables. In this regression, the statistically significant variables are parents in the Markov blanket of the regression response variable.
• For each regression, report the independent variables that are significant (non-zero) predictors of the response variable (the response variables are the direct predictors of PCR tests).
• For each regression, report the percent of variation explained by the regression
• Draw the network using Netica.
• Provide an image of the structure of the network, organized so that nodes that occur later are placed to the right of nodes that occur earlier. Please note that if you do not have a Netica license, you can build the network and take a screenshot before saving it, since saving the network is the step that requires a license.
• Estimate the parameters of the network
• Using the LASSO regression, calculate the predicted value for all combinations of the parents in the Markov blanket of the regression's response variable. Enter this information into the Netica tables.
• In Excel or in Netica, provide tables predicting the probability of each node in the network. Provide the table for predicting fever in a Word document as well.
• What is the probability of COVID for a female patient younger than 30, with runny nose and muscle aches, and with unknown fever status? What is the same probability if we knew that the patient does not have a fever?
• Report the two probabilities
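Two of the steps above lend themselves to a short illustration whichever statistical package does the fitting: building the pairwise and triple cluster terms, and computing the McFadden pseudo R-square as 1 minus the ratio of the model log-likelihood to the log-likelihood of an intercept-only model. The sketch below uses only the Python standard library. The function names, the 0/1 coding of variables, and the example values are assumptions made for illustration, and the predicted probabilities are taken to come from whatever LASSO fit you ran.

```python
import math
from itertools import combinations


def add_interactions(row, max_order=3):
    """Return a copy of a 0/1 feature dict with pair and triple product terms.

    A cluster term equals 1 only when every variable in the cluster is 1.
    """
    out = dict(row)
    names = sorted(row)
    for k in range(2, max_order + 1):
        for combo in combinations(names, k):
            out[" & ".join(combo)] = int(all(row[n] for n in combo))
    return out


def log_likelihood(y, p):
    """Bernoulli log-likelihood of 0/1 outcomes y under predicted probabilities p."""
    eps = 1e-12  # guards against log(0)
    return sum(
        yi * math.log(max(pi, eps)) + (1 - yi) * math.log(max(1 - pi, eps))
        for yi, pi in zip(y, p)
    )


def mcfadden_r2(y, p):
    """McFadden pseudo R-square: 1 - LL(model) / LL(intercept-only model)."""
    base_rate = sum(y) / len(y)
    ll_null = log_likelihood(y, [base_rate] * len(y))
    return 1 - log_likelihood(y, p) / ll_null


# Illustrative values only, not course data.
features = add_interactions({"cough": 1, "fever": 1, "chills": 0})
pcr = [1, 0, 1, 0]
predicted = [0.8, 0.2, 0.8, 0.2]
print(sorted(features))
print(round(mcfadden_r2(pcr, predicted), 3))
```

Only cluster terms built from variables that precede the response should be offered to the regression, so in practice `add_interactions` would be applied to the preceding variables alone.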

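When a parent of the COVID node, such as fever, is unobserved, the network averages over its states, weighting each state by its probability given what is known. Netica does this automatically. The sketch below shows the same arithmetic by hand for a two-node fragment. Every coefficient in it is a hypothetical placeholder, not an estimate from the course data, and the sketch assumes that all observed variables precede fever, so that the probability of fever depends only on its own parents.

```python
import math


def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))


def predicted_probability(model, values):
    """Probability from a logistic model; variables missing from values count as 0."""
    z = model["intercept"] + sum(
        coef * values.get(name, 0) for name, coef in model["coefs"].items()
    )
    return logistic(z)


# Hypothetical placeholder coefficients; substitute your own LASSO estimates.
covid_model = {
    "intercept": -1.0,
    "coefs": {"fever": 1.2, "runny_nose": -0.4, "muscle_aches": 0.8},
}
fever_model = {"intercept": -0.5, "coefs": {"muscle_aches": 0.9}}


def prob_covid(evidence):
    """P(COVID | evidence), summing over fever when its status is unknown."""
    if "fever" in evidence:
        return predicted_probability(covid_model, evidence)
    p_fever = predicted_probability(fever_model, evidence)
    p_if_fever = predicted_probability(covid_model, {**evidence, "fever": 1})
    p_if_no_fever = predicted_probability(covid_model, {**evidence, "fever": 0})
    return p_fever * p_if_fever + (1 - p_fever) * p_if_no_fever


evidence = {"female": 1, "under_30": 1, "runny_nose": 1, "muscle_aches": 1}
p_unknown = prob_covid(evidence)                    # fever status unknown
p_no_fever = prob_covid({**evidence, "fever": 0})   # fever known to be absent
print(p_unknown, p_no_fever)
```

With a positive fever coefficient, the probability under unknown fever status always falls between the probabilities for known absence and known presence of fever, which is a useful sanity check on the numbers Netica reports.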

This page is part of the course on Comparative Effectiveness by Farrokh Alemi, Ph.D.