Basic data  

HAP 719: Advanced Statistics I

 

Distributions

Figure: Two-dimensional normal distribution (image made by ChatGPT)

Overview

This module provides an understanding of normal distributions, enabling students to describe their key characteristics and estimate the probability of events using the standard normal curve. Additionally, the module covers techniques for transforming non-normal distributions into normal distributions to facilitate statistical analysis.

Learning Objectives

  • Describe the normal distribution
  • Estimate the probability of events from a distribution
  • Convert non-normal distributions to a normal distribution
  • Describe data using descriptive statistics

Lecture

The "AI assisted" label marks AI-assisted material.

  • Read Chapter 4, Distributions in Big Data in Healthcare: Statistical Analysis of the Electronic Health Records, Health Administration Press, 2020.
  • Read Chapter 6, pages 135-152 in Big Data in Healthcare: Statistical Analysis of the Electronic Health Records, Health Administration Press, 2020.
  • Normal distribution  AI assisted Slides► AI assisted Video►  YouTube►
  • Standard Normal distribution & calculation of probabilities of events AI assisted Slides► AI assisted Video► Calculator►
  • Yili Lin on how to download R AI assisted Slides► AI assisted Video►
  • Yili Lin on how to read data in R AI assisted Slides► AI assisted Video►
  • Yili Lin on introduction to descriptive statistics in R AI assisted Slides► AI assisted Video►
  • Yili Lin on Data Types in R AI assisted Slides► AI assisted Video►
  • The following are non-normal distributions that are being sampled. As the sample size increases, the distribution of the sample average becomes closer to normal.
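
The sampling behavior described in the last item can be illustrated with a short simulation. This is a minimal sketch, not course material: the exponential distribution, the seed, the sample sizes, and the helper names `skewness` and `sample_means` are all assumptions chosen for illustration. Skewness is used as a rough indicator of normality, since a normal distribution has a skewness of 0.

```python
import random
import statistics


def skewness(values):
    """Sample skewness; close to 0 for a symmetric (e.g., normal) distribution."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)


def sample_means(n, replications, rng):
    """Averages of `replications` samples, each of size n, from an exponential distribution."""
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(replications)
    ]


rng = random.Random(719)  # fixed seed (an arbitrary choice) so the run is repeatable
results = {n: skewness(sample_means(n, 5000, rng)) for n in (1, 5, 30, 100)}

for n, skew in results.items():
    print(f"sample size {n:>3}: skewness of the sample averages = {skew:.2f}")
```

The printed skewness should shrink toward 0 as the sample size grows, which is the pattern described above: the average of a skewed variable looks more and more normal.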
   

Assignments

Assignments are submitted on Canvas and are graded pass/fail. Include a one-page Word document summarizing your work; in the summary, state whether you were able to get the same answers as those provided. Submit your R, STATA, or Python code in separate files. No late assignments are accepted. You may help each other with the assignments, but you may not copy and paste the work of others. You may use ChatGPT or other large language models to generate the R code, but you must be transparent about it and report its use. Prompts to use with ChatGPT are also provided.

Question 1: A particular health-related test has a mean score of 500 and a standard deviation of 100. In a sample of 30 students, the mean test score was 525 and the standard deviation was 75. (a) Test whether the sample comes from the population. (b) Draw the two distributions. (c) Provide a confidence interval for the mean of the sample.

Resources for Question 1:

  • Answer on page 147 to 150 in the required textbook
  • AI-guided walk through the R solution AI assisted Prompt►
  • AI-guided walk through the Python solution AI assisted Prompt►
  • Yili Lin's Answer► R Code►
  • Sai Naga Akshar Gollapudi, Bessy Nicole Lovos Davila, and Maria Kurian Teach One►
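
As a rough check on parts (a) and (c), the arithmetic can be sketched in Python using only the standard library. This sketch makes its own assumptions and may differ from the textbook's approach: part (a) is treated as a two-sided one-sample z-test using the population standard deviation, and part (c) uses a normal-approximation 95% interval built from the sample standard deviation. A t-based interval with 29 degrees of freedom would be slightly wider. Part (b), the plot, is not covered.

```python
from math import sqrt
from statistics import NormalDist

# Values from Question 1
pop_mean, pop_sd = 500.0, 100.0
n, sample_mean, sample_sd = 30, 525.0, 75.0

# (a) One-sample z-test (assumed approach): is the sample mean consistent with the population?
se_pop = pop_sd / sqrt(n)                      # standard error under the population SD
z = (sample_mean - pop_mean) / se_pop
p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))

# (c) 95% confidence interval for the mean, normal approximation (assumed approach)
z_crit = NormalDist().inv_cdf(0.975)           # about 1.96
se_sample = sample_sd / sqrt(n)                # standard error under the sample SD
ci_low = sample_mean - z_crit * se_sample
ci_high = sample_mean + z_crit * se_sample

print(f"z = {z:.3f}, two-sided p = {p_two_sided:.3f}")
print(f"95% CI for the mean: ({ci_low:.2f}, {ci_high:.2f})")
```

Compare the output with the answer on pages 147 to 150 of the textbook; if the numbers differ, the choice of test or standard deviation is the first thing to check.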

Question 2: Assume that the length of stay for individuals having cardiac by-pass surgery is normally distributed with a mean of 9 days and a standard deviation of 1.25 days. What is the probability that a by-pass patient will have a length of stay of 8 days or less?

Resources for Question 2:
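
A minimal sketch of the calculation in Python, using the standard library's `NormalDist`; the variable names are illustrative only. The probability is the normal cumulative distribution function evaluated at 8 days, which is the same as looking up the z-score (8 − 9) / 1.25 on the standard normal curve.

```python
from statistics import NormalDist

# Length of stay: normal with mean 9 days and standard deviation 1.25 days
length_of_stay = NormalDist(mu=9, sigma=1.25)

z = (8 - 9) / 1.25                    # z-score for 8 days
p_direct = length_of_stay.cdf(8)      # P(X <= 8) from the original distribution
p_standard = NormalDist().cdf(z)      # the same probability from the standard normal curve

print(f"z = {z:.2f}, P(X <= 8) = {p_direct:.4f}")
```

Both routes give the same probability; the second mirrors the table lookup used with the standard normal curve.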

Question 3: Calculate the probability of the following events for a distribution with mean 10 and standard deviation 5.

  1. Probability that the variable will have a value less than or equal to 12, P(X ≤ 12)
  2. Probability that the variable will have a value greater than or equal to 22, P(X ≥ 22)
  3. Probability that the variable will have a value greater than or equal to 2 and less than or equal to 12, P(2 ≤ X ≤ 12)

Resources for Question 3:
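
The three probabilities can be sketched in Python as follows. The question does not name the distribution, so this sketch assumes it is normal, in keeping with the topic of the module; the variable names are illustrative only.

```python
from statistics import NormalDist

# Assumption: X is normally distributed with mean 10 and standard deviation 5
x = NormalDist(mu=10, sigma=5)

p_le_12 = x.cdf(12)                # 1. P(X <= 12)
p_ge_22 = 1 - x.cdf(22)            # 2. P(X >= 22), the upper tail
p_2_to_12 = x.cdf(12) - x.cdf(2)   # 3. P(2 <= X <= 12), difference of two cumulative probabilities

print(f"P(X <= 12)      = {p_le_12:.4f}")
print(f"P(X >= 22)      = {p_ge_22:.4f}")
print(f"P(2 <= X <= 12) = {p_2_to_12:.4f}")
```

For a continuous distribution, P(X ≥ 22) and P(X > 22) are equal, so the upper tail is simply one minus the cumulative probability.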

More