library(ggplot2)
library(dplyr)
Homework 1
Before you start the assignment
- Update the header - put your name in the author argument and put today’s date in the date argument.
- Click the “Render” button in RStudio and then open the rendered 1-hw1.html page.
- Then go back and try changing the theme argument in the header to something else - you can see other available themes here. Notice the difference when you render now!
Overview of HW1
In this assignment, you will explore the midwest dataset using ggplot2 to uncover patterns in population, education, and poverty across counties in the U.S. Midwest.
Skills practiced:
- Exploratory data visualization
- Mapping variables to aesthetics (aes)
- Practice different components of ggplot2 including geom_*(), scale_*(), facet_wrap()
- Draw public health insights from data
Load Required Packages
Load the Data
data(midwest)
# tip: type ?midwest in your console to see the data dictionary
Q1: Explore the dataset
Use glimpse() or summary() to explore the structure of the dataset. What variables does it contain? What interesting public health questions could you ask based on them? Write down at least 3 research questions:
- Research question 1:
- Research question 2:
- Research question 3:
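As a minimal sketch of the two functions named above, the block below runs them on ggplot2's built-in mpg data rather than on midwest (using mpg is an assumption made purely for illustration). glimpse() prints one line per column with its type and first few values, while summary() prints per-column summary statistics.

```r
library(ggplot2)  # ships the mpg example data
library(dplyr)    # provides glimpse()

# Illustration on mpg, not midwest: one line per column, with type and first values
glimpse(mpg)

# Per-column summaries (min, quartiles, mean, max for numeric columns)
summary(mpg)

# The variable names are the raw material for research questions
names(mpg)
```

The same three calls should work unchanged on any data frame, including midwest once it is loaded.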
#your code here
Q2: Population vs. poverty
Create a scatterplot with:
- poptotal on the x-axis
- percbelowpoverty on the y-axis
- Color the points by state
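The pattern described above (one variable on x, one on y, a third mapped to color) can be sketched on a different dataset. The example assumes ggplot2's built-in mpg data and the hypothetical object name p_scatter; it is an illustration of the mapping, not the homework plot itself.

```r
library(ggplot2)

# Illustration on mpg, not midwest: x, y, and color are all mapped inside aes()
p_scatter <- ggplot(mpg, aes(x = displ, y = hwy, color = drv)) +
  geom_point(alpha = 0.7) +  # slight transparency helps with overlapping points
  labs(
    x = "Engine displacement (L)",
    y = "Highway miles per gallon",
    color = "Drive train"
  )

print(p_scatter)
```

Because color is set inside aes(), ggplot2 picks one color per category and draws the legend automatically.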
#your code here
What do you notice about the relationship between county population size and poverty? Do some states stand out?
Type your response here.
Q3: Education vs. poverty
- Make a scatterplot of percollege (x) vs percbelowpoverty (y).
- Add a smoother using geom_smooth().
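A smoother is simply another layer added on top of the points. The sketch below shows this on ggplot2's built-in mpg data (an assumption for illustration, as is the object name p_smooth); method = "lm" is one possible choice, not a requirement of the assignment.

```r
library(ggplot2)

# Illustration on mpg, not midwest: points first, then a smoother layered on top
p_smooth <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point(alpha = 0.5) +
  # method = "lm" fits a straight line; leave it out to get the default
  # smoother, which is loess for a dataset of this size
  geom_smooth(method = "lm", se = TRUE)  # se = TRUE draws the confidence band

print(p_smooth)
```

Comparing the straight-line fit with the default smoother is a quick way to see whether a relationship is roughly linear.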
# your code here
What does this tell you about the relationship between education and poverty?
Type your response here.
Q4: Facet by metro/non-metro
- Recreate the scatterplot of percollege (x) vs percbelowpoverty (y).
- Subset the data by metro and non-metro counties using facet_wrap().
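Faceting splits one plot into a panel per group. As a hedged illustration, the block below facets ggplot2's built-in mpg data by drive train; the dataset, the grouping variable drv, and the object name p_facet are assumptions chosen for the example.

```r
library(ggplot2)

# Illustration on mpg, not midwest: same scatterplot, one panel per value of drv
p_facet <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point(alpha = 0.5) +
  geom_smooth(method = "lm", se = FALSE) +  # a separate line is fit in each panel
  facet_wrap(~ drv)

print(p_facet)
```

By default all panels share the same axes, which makes slopes and spreads directly comparable across groups.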
# your code here
How does the relationship between education and poverty differ between metro and non-metro counties?
Type your response here.
Q5: Compare poverty rates by state
# your code here
What differences do you observe in the poverty rate distributions across states? Which states appear to have more consistent or more variable poverty rates?
Type your response here.
Q6: Make a visualization to answer one of the research questions you wrote down for Q1.
Type your research question here.
# your code here
Type your answer to your research question based on the visualization here.
BONUS: Improve your visualization and record the evolution of your work
- Read the blogpost by Cedric Scherer here
- Take your visualization from Q6 and improve it. You can change the geometric object, color scheme, add labels, or any other changes you think will make it better.
- Record the evolution of your work using the {camcorder} package and save a gif called my-ggplot-evol.gif.
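A rough sketch of a recording workflow follows, assuming the camcorder and gifski packages are installed. The recording folder, the plot sizes, the frame durations, and the use of the mpg data are all placeholder assumptions; check ?gg_record and ?gg_playback for the arguments available in your installed version.

```r
library(ggplot2)
library(camcorder)  # assumed installed; GIF output also needs gifski

# Start recording: each plot printed from here on is saved as an image in `dir`
gg_record(
  dir = file.path(tempdir(), "recording"),  # hypothetical folder for the frames
  device = "png",
  width = 6, height = 4, units = "in", dpi = 150
)

# Version 1: a bare scatterplot (mpg used only for illustration)
print(ggplot(mpg, aes(displ, hwy)) + geom_point())

# Version 2: add color and readable labels
print(
  ggplot(mpg, aes(displ, hwy, color = drv)) +
    geom_point() +
    labs(x = "Engine displacement (L)", y = "Highway miles per gallon")
)

# Stitch the recorded frames into a GIF with the file name the assignment asks for
gg_playback(
  name = "my-ggplot-evol.gif",
  first_image_duration = 4,
  last_image_duration = 8,
  frame_duration = 1
)
```

Call gg_record() before you start iterating on the Q6 plot so that every intermediate version ends up in the GIF.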
#your code here
Insert the gif here:
