Assigned Reading
- Effectiveness of antidepressants
PubMed►
- Proxy measure for remission of depression symptoms
PubMed►
Assignment
The semester-long project in this course is to assess the
effectiveness of an existing guide to depression medications in minority
populations.
(A) Register for All of Us. This step was assigned
prior to the start of the course.
If you have not completed every registration step, including training,
complete the remaining steps immediately. Registration may take
several weeks for students who do not have a state ID. Otherwise, it
should take about 90 minutes. Several accounts are set up in this
process, so make sure that you remember each password. Write down the
password for each sign-in separately on a piece of paper, as it is easy
to confuse which password is needed when.
- Register for an account on @researchallofus.org
- Change the temporary password to a new password and record the new password on paper.
- Turn on Google 2-Step Verification
- Verify your identity with Login.gov. This step requires a state ID or driver's license and a phone that can receive text messages.
- Keep track of several separate passwords: your GMU password, your All of Us Researcher Workbench password, your computer password, and your Google password. Keep these accounts separate and read each prompt carefully to see which password is needed.
- Complete All of Us Registered Tier Training
- You do not need data access beyond the Registered Tier. George Mason University does not allow access to the Controlled Tier.
- Sign the Data User Code of Conduct.
When you have registered completely, you should see something like this
page:
(B) Create Cohort and Related Data Sets. Note that a
cohort and data sets are different concepts.
- Create your cohort in All of Us.
- Limit the cohort by African American race.
- Create the concept for major depression. Review in PubMed how
investigators have defined major depression in EHRs. Alternatively, use conditions defined within All of Us to select
the right definition of major depression.
- Create the concept of patient's survival.
- The unit of analysis is the medication, not the individual. An
individual can have multiple medications. Structure the data so
that there is one entry for each antidepressant a person used.
- Create your data sets for your cohort. Do not include non-EHR data or surveys.
Note that creating the antidepressant data set requires creating concepts that capture the
antidepressants in the data. In your cohort, select demographics (age, gender)
and all conditions as independent variables of interest. No survey responses are needed for independent variables;
rely on EHR data only. Include the date of occurrence of every event. You also need the date of
first use (purchase) of the antidepressant. The date of occurrence of the
response variable is the first time the variable/condition occurred. Here are the
data points that you need to include in your data sets:
- ID of antidepressant
- ID of person
- Age at first intake of antidepressant
- Sex at birth
- Gender
- Survival
- 590 Diseases among the Conditions.
Here are more detailed steps in getting ready for analysis:
- Get the dataset for patient demographics to include date of birth,
race, and ethnicity.
- Select African Americans.
- Create the base of df_analysis from this.
- Get date_of_death from the dataset containing deceased persons,
then left join to df_analysis.
- Get date_of_first_antidepressant from the dataset containing all of
your antidepressants, then left join to df_analysis.
- Get date of every antidepressant purchase
- Process disease dataset
- Get list of all of your antidepressants codes. The data set should
not be limited to the antidepressant you selected and should include
all antidepressants.
- Create a new column for the start date of antidepressant you
selected.
- Create a new column for the end date of antidepressant you
selected.
- Create a new column for duration of any antidepressant used prior
to the antidepressant you selected.
- Create a new column disease_group
- Use the df_disease_grouped.csv to fill the disease_group
column based on standard_concept_code
- Set missing values of disease_group to 0 (the catch-all
disease grouping). This assumes that unreported diseases are
absent.
- Select all diseases that occur prior to date of the
antidepressants
- Calculate number of days of use of antidepressants and score if
antidepressant was prematurely abandoned.
- Binarize the disease_group column. There is no need to drop any
column: the groups are not mutually exclusive (a person can have
many disease groups), so the dummy variable trap does not arise.
- Aggregate by antidepressant_id so that the dataset has only one row per
antidepressant per person_id, and the binarized disease group columns
indicate all the disease groups that the person has.
- Drop all other columns except antidepressant_id, person_id, days
of antidepressant use, and the binarized
columns.
- Left join the binarized columns to df_analysis
- You are now ready to start description of the data
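The preparation steps above can be sketched in pandas as follows. This is a minimal illustration on made-up toy data, not the course's reference solution. The names df_analysis, date_of_death, disease_group, standard_concept_code, person_id, and antidepressant_id come from the steps above; the other column names (start_date, end_date, condition_date, days_of_use, the dg_ prefix) and the inline stand-in for df_disease_grouped.csv are assumptions made for the example.

```python
import pandas as pd

# Toy stand-ins for the exported All of Us data sets; all values are made up.
df_analysis = pd.DataFrame({
    "person_id": [1, 1, 2],
    "antidepressant_id": [101, 102, 101],
    "start_date": pd.to_datetime(["2020-01-10", "2020-06-01", "2021-03-15"]),
    "end_date": pd.to_datetime(["2020-03-10", "2020-06-20", "2021-09-15"]),
})
df_death = pd.DataFrame({
    "person_id": [2],
    "date_of_death": pd.to_datetime(["2022-01-01"]),
})
df_conditions = pd.DataFrame({
    "person_id": [1, 1, 2, 2],
    "standard_concept_code": ["E11", "I10", "I10", "Z99"],
    "condition_date": pd.to_datetime(
        ["2019-05-01", "2020-04-01", "2021-01-01", "2020-12-01"]),
})
# Inline stand-in for df_disease_grouped.csv (hypothetical group numbers).
df_disease_grouped = pd.DataFrame({
    "standard_concept_code": ["E11", "I10"],
    "disease_group": [3, 7],
})

# Left join date of death; persons with no death record get NaT.
df_analysis = df_analysis.merge(df_death, on="person_id", how="left")

# Days of use for each antidepressant.
df_analysis["days_of_use"] = (
    df_analysis["end_date"] - df_analysis["start_date"]).dt.days

# Map condition codes to disease groups; unmatched codes go to group 0.
df_conditions = df_conditions.merge(
    df_disease_grouped, on="standard_concept_code", how="left")
df_conditions["disease_group"] = (
    df_conditions["disease_group"].fillna(0).astype(int))

# Pair each antidepressant with the person's conditions and keep only
# conditions that occurred before the antidepressant start date.
pairs = df_analysis[["person_id", "antidepressant_id", "start_date"]].merge(
    df_conditions, on="person_id", how="left")
pairs = pairs[pairs["condition_date"] < pairs["start_date"]]

# Binarize disease_group, then aggregate to one row per person and drug.
dummies = pd.get_dummies(pairs["disease_group"], prefix="dg", dtype=int)
binarized = (
    pd.concat([pairs[["person_id", "antidepressant_id"]], dummies], axis=1)
    .groupby(["person_id", "antidepressant_id"], as_index=False)
    .max()
)

# Left join the binarized columns; no prior condition means 0 everywhere.
df_analysis = df_analysis.merge(
    binarized, on=["person_id", "antidepressant_id"], how="left")
dg_cols = [c for c in df_analysis.columns if c.startswith("dg_")]
df_analysis[dg_cols] = df_analysis[dg_cols].fillna(0).astype(int)

print(df_analysis[["person_id", "antidepressant_id", "days_of_use"] + dg_cols])
```

Taking the maximum within each group works because the indicator columns hold only 0 or 1, so the maximum is 1 whenever any prior condition fell in that disease group.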
The following resources may be of use in this task:
- Organizing antidepressants csv►
- Organizing conditions csv►
- Creating survival variable
More►
(C) Describe the Population. In this step you create Table 1 of your eventual report. This table should
describe the population. For examples of Table 1, see PubMed. Provide a summary of your data that includes the number of
antidepressants examined, number of individuals involved, number of
antidepressants discontinued, number of days individuals were followed, number
of days antidepressants were continued, number of medical conditions at
baseline (the start of antidepressant use), number of antidepressants used prior
to baseline, and experience with previous antidepressants.
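A few of these summary counts can be computed from the analysis table as sketched below. The data are made up, and the column names (discontinued, n_conditions_at_baseline, n_prior_antidepressants) are assumptions for illustration; substitute whatever names your own df_analysis uses.

```python
import pandas as pd

# Toy analysis table: one row per antidepressant per person (made-up values).
df = pd.DataFrame({
    "person_id": [1, 1, 2, 3],
    "antidepressant_id": [101, 102, 101, 103],
    "days_of_use": [60, 19, 184, 45],
    "discontinued": [0, 1, 0, 1],
    "n_conditions_at_baseline": [1, 2, 2, 0],
    "n_prior_antidepressants": [0, 1, 0, 0],
})

# Summary rows for a draft of Table 1.
table1 = pd.Series({
    "Antidepressant trials examined": len(df),
    "Individuals": df["person_id"].nunique(),
    "Trials discontinued": int(df["discontinued"].sum()),
    "Mean days of use": df["days_of_use"].mean(),
    "Mean conditions at baseline": df["n_conditions_at_baseline"].mean(),
    "Mean prior antidepressants": df["n_prior_antidepressants"].mean(),
})
print(table1.to_string())
```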
(D) Calculate the AI's Recommendation: Using
published data, score the probability of remission for each of the 15
antidepressants and select recommended antidepressants (up to 3
antidepressants within 0.05 points of the highest score).
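The selection rule (up to 3 antidepressants within 0.05 points of the highest score) can be sketched as below. The drug names and scores are placeholders, not published values, and treating "within 0.05" as inclusive of the boundary is an assumption.

```python
def recommend(scores, margin=0.05, max_n=3):
    """Return up to max_n drugs whose score is within margin of the best.

    scores maps drug name -> predicted probability of remission.
    """
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    best = ranked[0][1]
    within = [name for name, score in ranked if best - score <= margin]
    return within[:max_n]


# Placeholder scores for one hypothetical patient (not real data).
scores = {"drug_a": 0.42, "drug_b": 0.40, "drug_c": 0.31,
          "drug_d": 0.39, "drug_e": 0.38}
print(recommend(scores))
```

Here four drugs fall within 0.05 of the top score, and the list is cut to the three highest.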
(E) Calculate observed remission:
For each antidepressant, calculate the rate of premature discontinuation
as a surrogate measure of remission.
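A minimal sketch of the rate calculation follows. The 70-day cut-off for "premature" is a placeholder assumption, not the course's definition; take the actual threshold from the assigned reading on the proxy measure. The trial records are made up.

```python
from collections import defaultdict

MIN_DAYS = 70  # hypothetical cut-off; replace with the definition you cite

# (antidepressant, days of use) for each trial; made-up values.
trials = [("drug_a", 120), ("drug_a", 30),
          ("drug_b", 15), ("drug_b", 200), ("drug_b", 90),
          ("drug_c", 10)]


def discontinuation_rates(trials, min_days=MIN_DAYS):
    """Share of trials per drug that ended before min_days of use."""
    counts = defaultdict(lambda: [0, 0])  # drug -> [discontinued, total]
    for drug, days in trials:
        counts[drug][0] += days < min_days
        counts[drug][1] += 1
    return {drug: stopped / total for drug, (stopped, total) in counts.items()}


rates = discontinuation_rates(trials)
print(rates)
```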
(F) Check AI's Accuracy: Compare observed rates to
predicted rates.
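One simple way to make this comparison is a per-drug difference plus the mean absolute error, sketched below. The assignment does not prescribe a metric, so this choice is an assumption, and the rates shown are placeholders rather than results.

```python
def compare(predicted, observed):
    """Per-drug difference (observed - predicted) and mean absolute error."""
    shared = sorted(predicted.keys() & observed.keys())
    diffs = {drug: observed[drug] - predicted[drug] for drug in shared}
    mae = sum(abs(d) for d in diffs.values()) / len(diffs)
    return diffs, mae


# Placeholder remission rates (not real data).
predicted = {"drug_a": 0.40, "drug_b": 0.30, "drug_c": 0.20}
observed = {"drug_a": 0.35, "drug_b": 0.40, "drug_c": 0.20}

diffs, mae = compare(predicted, observed)
for drug, d in diffs.items():
    print(f"{drug}: observed - predicted = {d:+.2f}")
print(f"mean absolute error = {mae:.3f}")
```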
(G) Report Your Findings: This report should include the
following sections, provided at the approximate times indicated by email to
the instructor:
- Abstract. Include a structured abstract using objective of
the study, method, results, and main conclusion. The abstract
should be written after you complete other sections. The
abstract must not exceed 500 words and should report the number of
words used in the abstract.
- Background literature review should not exceed 1 page. Your one-page
literature review should assume a reader familiar with the
literature and should not exceed three paragraphs. The first paragraph should
address the significance of the area you are addressing, including the
prevalence of depression and the importance of selection of
antidepressants. The second paragraph should describe the failure of
clinicians in selecting the right antidepressant for African
Americans, as reported in the literature. This paragraph should not exceed two or three
sentences but can have numerous references. The last paragraph should
discuss how your analysis can help selection of antidepressants for
African Americans. The Background
section should be a brief synthesis of existing research findings related to the problem being addressed in the study.
Every sentence should have a reference. We are not
interested in unsupported claims.
- Method section should be a complete description of the
methods. There is no page limit, but brevity is appreciated. It should include a paragraph or a sentence on the source of data.
It should describe the inclusion and exclusion criteria for the
creation of the cohort and compare these criteria to what has been
done in the literature. It should have a sentence or a paragraph, with
citations, on the definition of remission. It
should have a sentence or a paragraph on the number of, and definition of,
independent variables. These statements should clarify how missing values were
treated and explain what steps were taken to ensure that independent
variables occur prior to the response/dependent variable. There
should be a paragraph on the analytical methods used.
- Results section should describe the findings, and there is no page
limit. Table 1 should be a description of the
population studied. Figures and additional tables should
summarize the statistical findings. These should include the parameters of
your model and the fit between the guide and the experience of African
Americans. There should not be any discussion of findings in the
Results section.
- Discussion section should include 4 distinct sections, and there is
no page limit. The first section should be a summary of the key findings.
The second section should be a review of support for the findings in
the literature. The third section should summarize study limitations.
The last section should conclude with policy implications.
Example Completed Assignments: The following projects use patients' medical histories to screen for
the indicated disease.
- Redd's lung cancer paper
Read► (Use instructor's last name for password)
- All of Us breast cancer study
YouTube►
This page is part of the HAP 819 course organized by Farrokh Alemi, Ph.D.