Learning Objectives
After completing the activities in this module, you should be able to:
- Clean data
- Estimate missing values
- Fit different types of models to data
- Interpret the statistical output produced by a regression model-building technique
Lecture
- See previous lectures on introduction, assumptions, missing values,
and model building using ordinary regression
More►
- For a general introduction read Chapter 11 in Statistical Analysis of Electronic
Health Records by Farrokh Alemi, 2020
Slides►
YouTube►
Video►
Assignments
Assignments should be submitted in Blackboard. The submission must
have a summary statement, with one statement per question. All
assignments should be done in R if possible.
Question 1: The attached data show the percentage of diabetes in 2,228 counties within the United States in 2010, 2011,
and 2012. We want to understand whether access to food stores affects diabetes. Regress the incidence of diabetes in 2012 on the 2011 and 2010
variables.
- Test assumptions of regression
- Print out the coefficients of the regression.
- What percent of variation of diabetes is explained?
- List variables that take at least 2 years before they have a
significant effect on diabetes
- List variables that can affect diabetes in 1 year
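The steps in Question 1 can be sketched in R roughly as follows. This is only an illustrative sketch, not the expected solution: the data are simulated, and every column name (`diabetes_2012`, `stores_2010`, and so on) is a hypothetical stand-in for whatever the attached file actually contains. It also assumes one possible reading of the lag questions, namely that a significant 2011 predictor suggests an effect within 1 year and a significant 2010 predictor suggests an effect that takes 2 years.

```r
# Simulated stand-in for the county data; all names and values are hypothetical.
set.seed(819)
n <- 200
counties <- data.frame(
  stores_2010 = rnorm(n, mean = 5, sd = 1),  # hypothetical food-store access measure
  stores_2011 = rnorm(n, mean = 5, sd = 1)
)
counties$diabetes_2010 <- 9 + rnorm(n)
counties$diabetes_2011 <- 0.8 * counties$diabetes_2010 -
  0.3 * counties$stores_2010 + rnorm(n, sd = 0.5)
counties$diabetes_2012 <- 0.7 * counties$diabetes_2011 -
  0.2 * counties$stores_2010 + rnorm(n, sd = 0.5)

# Regress the 2012 outcome on the 2011 and 2010 variables.
fit <- lm(diabetes_2012 ~ diabetes_2011 + diabetes_2010 +
            stores_2011 + stores_2010, data = counties)

# One assumption check: normality of residuals (Shapiro-Wilk test).
# Interactively, plot(fit) shows the standard diagnostic plots as well.
normality <- shapiro.test(residuals(fit))

# Coefficient table and share of variation explained.
coef_table <- coef(summary(fit))
print(coef_table)
r_squared <- summary(fit)$r.squared
cat("Percent of variation explained:", round(100 * r_squared, 1), "\n")

# Predictors significant at alpha = 0.05, split by lag (intercept dropped).
p_values <- coef_table[-1, "Pr(>|t|)"]
significant <- names(p_values)[p_values < 0.05]
one_year_lag <- grep("_2011$", significant, value = TRUE)
two_year_lag <- grep("_2010$", significant, value = TRUE)
```

With the real data, the formula would list the actual 2011 and 2010 columns, and the split by lag would depend on how those columns are named.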
Resources for Question 1:
Question 2: A cross-sectional study was conducted to examine depression in a sample of elderly male subjects residing in a senior living
community. Depression was measured by a depression score. Several independent variables were included in the analysis, such as
sociodemographic, psychological, and physical health-related measures, as well as social support. All of the variables (dependent and
independent) are continuous. Using the results of this analysis, answer the following:
- State a possible research question (hypothesis and null hypothesis) based on the results shown.
- What statistical method was used to analyze the data based on the results shown and why?
- State any statistical assumptions associated with the statistical method used in the analysis.
- What can you conclude about the overall significance of the model?
- Provide an interpretation of the adjusted R-squared value.
- Provide an interpretation of the statistically significant findings (at alpha = 0.05).
- Describe potential study implications for elderly individuals who reside in senior living communities based on the study results.
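For readers who want to see where the overall significance and adjusted R-squared values in a regression output come from, the sketch below may help. It assumes the output in Question 2 resembles that of a multiple linear regression fitted with `lm()`. The data are simulated, and the variable names (`social_support`, `physical_health`, `depression`) are hypothetical, not taken from the study.

```r
# Simulated data; all names and values are hypothetical.
set.seed(42)
n <- 120
elders <- data.frame(
  age = rnorm(n, mean = 75, sd = 6),
  social_support = rnorm(n, mean = 50, sd = 10),
  physical_health = rnorm(n, mean = 60, sd = 12)
)
elders$depression <- 30 - 0.25 * elders$social_support -
  0.10 * elders$physical_health + rnorm(n, sd = 4)

fit <- lm(depression ~ age + social_support + physical_health, data = elders)
s <- summary(fit)

# Overall model significance: p-value of the F test that all slopes are zero.
f <- s$fstatistic
overall_p <- pf(f["value"], f["numdf"], f["dendf"], lower.tail = FALSE)

# Adjusted R-squared penalizes R-squared for the number of predictors k:
# 1 - (1 - R^2) * (n - 1) / (n - k - 1)
k <- length(coef(fit)) - 1
adj_manual <- 1 - (1 - s$r.squared) * (n - 1) / (n - k - 1)

cat("Overall F-test p-value:", format.pval(overall_p), "\n")
cat("Adjusted R-squared:", round(adj_manual, 3), "\n")
```

The hand-computed `adj_manual` matches `s$adj.r.squared`, which is the value R prints in the model summary.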
Resources for Question 2:
More
For additional information (not part of the required reading), please see the following links:
- Introduction to regression by others
YouTube►
Slides►
- Regression using R Read►
- Statistical learning with R
Read►
- Open introduction to statistics
Read►
This page is part of the HAP 819 course on Advanced Statistics and was
organized by Farrokh Alemi, PhD Home►
Email►