library(ggplot2)
library(dplyr)
Homework 1
Before you start the assignment
- Update the header: put your name in the author argument and today’s date in the date argument.
- Click the “Render” button in RStudio and then open the rendered 1-hw1.html page.
- Then go back and try changing the theme argument in the header to something else. You can see other available themes here. Notice the difference when you render now!
Overview of HW1
In this assignment, you will explore the midwest dataset using ggplot2 to uncover patterns in population, education, and poverty across counties in the U.S. Midwest.
Skills practiced:
- Exploratory data visualization
- Mapping variables to aesthetics (aes)
- Practice different components of ggplot2, including geom_*(), scale_*(), and facet_wrap()
- Draw public health insights from data
Load Required Packages
Load the Data
data(midwest)
# tip: type ?midwest in your console to see the data dictionary
Q1: Explore the dataset
Use glimpse() or summary() to explore the structure of the dataset. What are the variables in the dataset? What interesting public health questions can you ask based on these variables? Write down at least 3 questions:
- Research question 1:
- Research question 2:
- Research question 3:
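Hint: if you have not used these functions before, the sketch below shows what they print. It uses ggplot2's built-in mpg dataset as a stand-in (an assumption for illustration only, it is not the assignment data), so you will still need to run the same calls on midwest yourself.

```r
library(ggplot2)  # provides the mpg example data
library(dplyr)    # provides glimpse()

# mpg is a stand-in dataset for illustration, not the assignment data
glimpse(mpg)   # one line per variable: name, type, and the first few values
summary(mpg)   # per-variable summaries (quantiles for numeric columns)
```

glimpse() is usually the quicker way to see variable names and types, while summary() is more useful for spotting ranges and missing values.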
#your code here
Q2: Population vs. poverty
Create a scatterplot with:
- poptotal on the x-axis
- percbelowpoverty on the y-axis
- Color the points by state
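Hint: the general pattern for mapping a third variable to color looks roughly like the sketch below. It assumes ggplot2's built-in mpg dataset, so displ, hwy, and class are mpg variables chosen for illustration; swap in midwest and the variables listed above.

```r
library(ggplot2)

# Illustration on mpg (a stand-in dataset, not the assignment data):
# x, y, and color are all mapped inside aes()
p_color <- ggplot(mpg, aes(x = displ, y = hwy, color = class)) +
  geom_point()

p_color
```

Because color is set inside aes(), ggplot2 builds the legend for you. Setting color outside aes() would instead apply one fixed color to every point.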
#your code here
What do you notice about the relationship between county population size and poverty? Do some states stand out?
Type your response here.
Q3: Education vs. poverty
- Make a scatterplot of percollege (x) vs percbelowpoverty (y).
- Add a smoother using geom_smooth().
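Hint: a smoother is just another layer added on top of the points. The sketch below is one possible way to do it, again assuming the built-in mpg dataset rather than midwest; the choice of method = "loess" is also an assumption, and other methods such as "lm" may suit your data better.

```r
library(ggplot2)

# Illustration on mpg (a stand-in dataset, not the assignment data):
# points first, then a smoothed trend line drawn over them
p_smooth <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point() +
  geom_smooth(method = "loess", formula = y ~ x)

p_smooth
```

The shaded band around the line is the confidence interval, which can be turned off with se = FALSE.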
# your code here
What does this tell you about the relationship between education and poverty?
Type your response here.
Q4: Facet by metro/non-metro
- Recreate the scatterplot of percollege (x) vs percbelowpoverty (y).
- Subset the data by metro and non-metro counties using facet_wrap().
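Hint: faceting splits one plot into a panel per group. A minimal sketch, assuming the built-in mpg dataset and its drv variable as the grouping column (both are stand-ins; you will need to find the matching variable in midwest):

```r
library(ggplot2)

# Illustration on mpg (a stand-in dataset, not the assignment data):
# facet_wrap() draws one panel for each value of drv
p_facet <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point() +
  facet_wrap(~ drv)

p_facet
```

By default all panels share the same axes, which makes it easier to compare groups side by side.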
# your code here
How does the relationship between education and poverty differ between metro and non-metro counties?
Type your response here.
Q5: Compare poverty rates by state
# your code here
What differences do you observe in the poverty rate distributions across states? Which states appear to have more consistent or more variable poverty rates?
Type your response here.
Q6: Make a visualization to answer one of the research questions you wrote down for Q1.
Type your research question here.
# your code here
Type your answer to your research question based on the visualization here.
BONUS: Improve your visualization and record the evolution of your work
- Read the blogpost by Cedric Scherer here
- Take your visualization from Q6 and improve it. You can change the geometric object, color scheme, add labels, or any other changes you think will make it better.
- Record the evolution of your work using the {camcorder} package and save a gif called my-ggplot-evol.gif.
#your code here
Insert the gif here: