HAP 719 Advanced Statistics

HAP 719: Advanced Statistics I

Exam 2024  

 

Submit your Jupyter notebook as a PDF.  Include both your code and your findings.

Question 1: Use the following corpus of training data to classify the sentiment of this sentence: "The doctor was terrible but the nurses were terrific."

  1. Regress the classification labels in the training set on the words in the target sentence: "doctor", "was", "terrible", "but", "nurses", "terrific".
  2. Regress the classification labels in the training set on the words, pairs of consecutive words, and triplets of consecutive words in the target sentence.
  3. Which coefficients for combinations of words are statistically significant?
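
As a starting point for parts 1 and 2, the predictors can be built as 0/1 indicators that mark whether each word, consecutive pair, or consecutive triplet from the target sentence appears in a training comment. The sketch below is one possible way to do this, not a required method. The two rows in `toy_corpus` and the helper names `tokenize`, `ngrams`, `target_features`, and `indicator_row` are hypothetical placeholders; replace the toy rows with the actual training corpus. The sketch tokenizes the full target sentence, so it also produces "the" and "were", which the word list in part 1 omits.

```python
import re

TARGET = "The doctor was terrible but the nurses were terrific."

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def ngrams(tokens, n):
    """Return all runs of n consecutive tokens, joined by spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def target_features(max_n=3):
    """Unique words, pairs, and triplets of consecutive words in TARGET."""
    tokens = tokenize(TARGET)
    feats = []
    for n in range(1, max_n + 1):
        for gram in ngrams(tokens, n):
            if gram not in feats:
                feats.append(gram)
    return feats

def indicator_row(comment, features):
    """1 if the feature occurs in the comment, else 0, for each feature."""
    tokens = tokenize(comment)
    present = set()
    for n in (1, 2, 3):
        present.update(ngrams(tokens, n))
    return [1 if f in present else 0 for f in features]

# Hypothetical placeholder rows (text, label); NOT the exam's corpus.
toy_corpus = [
    ("The doctor was terrible", 0),
    ("The nurses were terrific", 1),
]

features = target_features()
X = [indicator_row(text, features) for text, _ in toy_corpus]
y = [label for _, label in toy_corpus]
```

The resulting `X` and `y` can then be passed to a regression routine that reports a p-value for each coefficient, for example the `Logit` model in statsmodels. With many sparse indicators and few comments, some combinations may never occur in the training set; their columns are all zero and should be dropped before fitting.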

Question 2: Identify the factors that affect the cost of healthcare services.  The factors to consider are (a) above 65 years old, (b) diabetes, (c) kidney problems, and (d) depression.

  1. Regress cost on each independent variable.
  2. Regress cost on each independent variable and on each pair and triplet of variables.
  3. Which combinations of variables have a statistically significant relationship to cost?
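
For part 2, each pair or triplet of variables can be represented as an interaction term: the product of the corresponding 0/1 indicators, which equals 1 only when all of the conditions are present. The sketch below shows one way to enumerate these terms and build a design-matrix row. The column names in `FACTORS`, the helper names, and the example patient are hypothetical; the actual data set may use different names and coding.

```python
from itertools import combinations
from math import prod

# Hypothetical column names for the four binary factors in the question.
FACTORS = ["over65", "diabetes", "kidney", "depression"]

def interaction_terms(factors, max_order=3):
    """All single variables, pairs, and triplets, as tuples of names."""
    terms = []
    for k in range(1, max_order + 1):
        terms.extend(combinations(factors, k))
    return terms

def design_row(patient, terms):
    """Intercept followed by the product of indicators for each term."""
    return [1] + [prod(patient[name] for name in term) for term in terms]

terms = interaction_terms(FACTORS)

# Hypothetical patient record, for illustration only.
example = {"over65": 1, "diabetes": 1, "kidney": 0, "depression": 1}
row = design_row(example, terms)
```

Stacking one such row per patient gives the predictor matrix for an ordinary least squares fit of cost, for example with the `OLS` model in statsmodels, whose summary lists a p-value for each main effect and interaction.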

This page is part of the HAP 719 course on Advanced Statistics I by Farrokh Alemi, PhD.