Lecture 5. Maps

PUBH 6199: Visualizing Data with R, Summer 2025

Xindi (Cindy) Hu, ScD

2025-06-17

Outline for today

  • Types of maps for spatial data
  • Map design considerations (color, scale, projection, legend)
  • R packages for mapping
  • Static maps with tmap
  • Interactive maps with leaflet
  • Accessing spatial data with tigris and tidycensus

Types of Maps for Spatial Data

1. Choropleth Map

  • Displays the spatial distribution of a variable across predefined geographic areas (e.g., countries, states, counties)
  • Best for normalized data (e.g., rates)
  • Variable encoding: color (intensity or hue)
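
Why normalize? Raw counts mostly track how many people live in an area. A minimal sketch, assuming the `nc` sample shapefile bundled with {sf} (the column name `sid_rate_74` is introduced here only for illustration), turns counts into a rate before mapping:

```r
library(sf)

# North Carolina counties, shipped with the sf package
nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

# SID74 = sudden infant death counts, BIR74 = live births
# A rate per 1,000 births is comparable across large and small counties
nc$sid_rate_74 <- 1000 * nc$SID74 / nc$BIR74

# The county with the most cases is not necessarily the one with the highest rate
nc$NAME[which.max(nc$SID74)]
nc$NAME[which.max(nc$sid_rate_74)]
```

Mapping `sid_rate_74` rather than `SID74` keeps populous counties from dominating the choropleth simply because they are populous.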

Example: Choropleth

library(tmap)
data("World")
tm_shape(World) + 
  tm_polygons(fill = "HPI",
              fill.scale = tm_scale(values = "viridis"),
              fill.legend = tm_legend(title = "Happy Planet Index"))

2. Point Map

  • Each point represents a location
  • Can show locations of events, facilities, cases
  • Variable encoding: color, shape, size

Example: Point Map

library(tidyverse)
library(spData)
library(tmap)
data(urban_agglomerations)
urb_2030 <- urban_agglomerations |> filter(year == 2030) 
tm_shape(World) +
  tm_polygons() +
  tm_shape(urb_2030) +
  tm_symbols(fill = "black", col = "white", size = "population_millions",
             size.legend = tm_legend(title = "Urban Population in 2030\n(millions)"))

3. Heat Map (Density)

  • Visualizes concentration of events
  • Uses kernel density estimation or hex bins
  • Variable encoding: color (intensity)
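
To make "kernel density estimation" concrete, here is a small sketch on simulated points. The coordinates are made up, and `MASS::kde2d()` stands in for whatever estimator a mapping package uses internally:

```r
library(MASS)  # kde2d(); MASS is a recommended package bundled with most R installations

set.seed(42)
# Simulated event locations (hypothetical data): one cluster plus background noise
lon <- c(rnorm(200, mean = -87.63, sd = 0.02), runif(50, -87.75, -87.55))
lat <- c(rnorm(200, mean = 41.88, sd = 0.02), runif(50, 41.80, 41.95))

# Kernel density estimate evaluated on a 50 x 50 grid
dens <- kde2d(lon, lat, n = 50)

# Color encodes intensity: darker cells = more events nearby
image(dens, col = hcl.colors(20, "YlOrRd", rev = TRUE),
      xlab = "Longitude", ylab = "Latitude")
points(lon, lat, pch = ".", col = "grey30")
```

The individual points are still drawn on top so you can check that the smoothed surface matches the raw data.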

Example: Heat map with leaflet.extras

library(readr)   # read_csv()
library(sf)      # st_as_sf()
library(leaflet)
library(leaflet.extras)
thefts_coords <- read_csv("data/thefts_coords.csv")
thefts_sf <- st_as_sf(thefts_coords, coords = c("lon", "lat"), crs = 4326)
leaflet(thefts_sf) %>%
  addProviderTiles("OpenStreetMap") %>%
  addHeatmap(blur = 15, max = 0.05, radius = 10) %>%
  setView(lng = -87.6298, lat = 41.8781, zoom = 11)

4. Faceted Map

  • Create small multiples to compare across categories
  • Great for time series or group comparisons

Example: Faceted Map

library(tidyverse)
library(spData)
library(tmap)
data(urban_agglomerations)
urb_1970_2030 <- urban_agglomerations |> filter(year %in% c(1970, 1990, 2010, 2030))
tm_shape(World) +
  tm_polygons() +
  tm_shape(urb_1970_2030) +
  tm_symbols(fill = "black", col = "white", size = "population_millions",
             size.legend = tm_legend(title = "Urban Population\n(millions)")) +
  tm_facets(by = "year", ncol = 2)

5. Interactive Map

  • Enable zooming, hovering, filtering
  • Useful for dashboards, web apps

Example: Interactive Map

Color Choices for Maps

  • Sequential palettes for ordered values
  • Diverging palettes for values above/below a meaningful midpoint (e.g., the mean)
  • Qualitative palettes for categories

Source: Which color scale to use when visualizing data, by Lisa Charlotte Muth.

Categorical scales

  • Use distinct hues for different categories
  • Limit to no more than 7 hues

Source: Analyzing US Census Data, by Kyle Walker

Sequential Scales

  • Map value to color on a continuum, based on both intensity and hue
  • Use for ordered data (e.g., population, income)

Source: Analyzing US Census Data, by Kyle Walker

Diverging Scales

  • Use for data with a meaningful midpoint (e.g., above/below average)
  • Two contrasting colors with a neutral midpoint (e.g., white/light gray)

Source: 2020 U.S. Election Mapped, by Vivid Maps

Avoid Misleading Colors

  • Don’t use rainbow: not perceptually uniform
  • Consider accessibility (color-blind safe palettes)
  • Avoid encoding meaning with non-intuitive colors
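
The three palette types can be previewed with base R alone. This is a sketch; the specific palette names are just one reasonable choice, not a recommendation from the sources above:

```r
# Base R (>= 4.0) only; no extra packages needed
seq_pal  <- hcl.colors(5, palette = "Blues 3", rev = TRUE)  # sequential: light to dark
div_pal  <- hcl.colors(7, palette = "Blue-Red")             # diverging: neutral middle
qual_pal <- palette.colors(6, palette = "Okabe-Ito")        # qualitative, color-blind friendly

# Preview each palette as a strip of swatches
op <- par(mfrow = c(3, 1), mar = c(1, 1, 2, 1))
barplot(rep(1, 5), col = seq_pal,  axes = FALSE, border = NA, main = "Sequential")
barplot(rep(1, 7), col = div_pal,  axes = FALSE, border = NA, main = "Diverging")
barplot(rep(1, 6), col = qual_pal, axes = FALSE, border = NA, main = "Qualitative")
par(op)
```

`hcl.pals()` lists the other built-in HCL palettes if you want to compare alternatives.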

Map Projections

  • Every projection distorts at least one of shape, area, distance, or direction
  • Use equal-area projections for choropleths

Common Projections in R

Use case                 Recommended projection       CRS code
Equal-area choropleths   Albers Equal Area (U.S.)     EPSG:5070
Interactive maps         Web Mercator                 EPSG:3857
Global perspective       Robinson or Winkel Tripel    ESRI:54030 / ESRI:54042
Local detail (U.S.)      NAD83 / State Plane          varies

Use st_transform() to convert:

my_data <- sf::st_transform(my_data, crs = 5070)

In tmap:

  • In static mode: all layers are reprojected to match the first layer.
  • In interactive mode: all layers are projected to EPSG:3857.
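
To see why equal-area projections matter for choropleths, this sketch (assuming the `nc` sample data bundled with {sf}) compares the total area of North Carolina under two projections:

```r
library(sf)

nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)
st_crs(nc)$input          # the CRS the file was stored in
st_is_longlat(nc)         # TRUE: coordinates are longitude/latitude

nc_albers   <- st_transform(nc, crs = 5070)  # equal-area
nc_mercator <- st_transform(nc, crs = 3857)  # Web Mercator

# Total planar area in km^2 under each projection
area_albers   <- sum(as.numeric(st_area(nc_albers)))   / 1e6
area_mercator <- sum(as.numeric(st_area(nc_mercator))) / 1e6

# Web Mercator inflates areas away from the equator
area_mercator / area_albers
```

A ratio above 1 means the same counties look larger on a Web Mercator map than they really are, which is one reason not to compute areas or densities in EPSG:3857.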

R Ecosystem for Mapping

  • Data handling: sf, sp
  • Thematic mapping: tmap, ggplot2, cartography
  • Basemaps & interactivity: leaflet, mapview, ggmap
  • Shapefiles: sf (st_read()), rmapshaper; rgdal was retired from CRAN in 2023
  • Data access: tigris, tidycensus

{sf}: simple features

The {sf} package is the standard way to work with vector spatial data in R. It replaces older tools like {sp} with a simple, tidy-friendly interface.

Key Features of {sf}

  • Stores geometry + attributes in a single data.frame-like object
  • Built on simple features standard (ISO 19125-1)
  • Fully compatible with dplyr, ggplot2, tmap
  • Uses sfc column to store spatial information (e.g., points, polygons)
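
A short sketch of these features, using made-up clinic locations (the names, visit counts, and coordinates are hypothetical):

```r
library(sf)
library(dplyr)

# Hypothetical clinic locations (illustrative coordinates)
clinics <- data.frame(
  name = c("Clinic A", "Clinic B", "Clinic C"),
  visits = c(120, 45, 300),
  lon = c(-77.05, -76.61, -77.40),
  lat = c(38.90, 39.29, 39.41)
)

# Turn an ordinary data frame into an sf object
clinics_sf <- st_as_sf(clinics, coords = c("lon", "lat"), crs = 4326)

class(clinics_sf)               # "sf" and "data.frame"
class(st_geometry(clinics_sf))  # the geometry is stored as an sfc list-column

# dplyr verbs work as usual; the geometry column is kept automatically
busy <- clinics_sf |> filter(visits > 100) |> select(name)
```

Note that `select(name)` still returns the geometry: in {sf} the geometry column is "sticky" unless you drop it explicitly with `st_drop_geometry()`.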

Static Mapping with tmap

  • Similar to ggplot2, based on “the grammar of graphics”
  • Supports both static and interactive modes
  • Excellent for quick, polished maps with sensible defaults

library(sf)
library(tmap)
tmap_mode("plot")
nc <- st_read("data/nc.shp", quiet = TRUE) # nc is an `sf` object
tm_shape(nc) + # defines input data
  tm_polygons(fill = "BIR74", fill.scale = tm_scale(values = "brewer.greens")) + # maps data to aesthetics
  tm_title("Births in NC, 1974")

How does {tmap} work?

{tmap} adopts an intuitive approach to map-making: tm_shape() defines the input data, and the addition operator + adds each new layer, created with one of the tm_*() functions:

  • tm_fill(): shaded areas for (multi)polygons
  • tm_borders(): border outlines for (multi)polygons
  • tm_polygons(): both, shaded areas and border outlines for (multi)polygons
  • tm_lines(): lines for (multi)linestrings
  • tm_symbols(): symbols for (multi)points, (multi)linestrings, and (multi)polygons
  • tm_raster(): colored cells of raster data (there is also tm_rgb() for rasters with three layers)
  • tm_text(): text information for (multi)points, (multi)linestrings, and (multi)polygons
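
As a sketch of how these layers stack, the map below builds a labeled outline map from three separate layers. It assumes the `nc` sample data bundled with {sf}; the layer order is the drawing order:

```r
library(sf)
library(tmap)

nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

m <- tm_shape(nc) +            # input data
  tm_fill("grey90") +          # layer 1: constant fill color
  tm_borders() +               # layer 2: county outlines
  tm_text("NAME", size = 0.4)  # layer 3: county names as labels

# Print `m` (or call print(m)) to draw the map
```

Replacing `tm_fill() + tm_borders()` with a single `tm_polygons()` call should give the same polygons with less typing.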

Adding layers in {tmap}

  • tm_polygons(): for choropleth maps
  • tm_symbols(): for point data; size and color can represent different variables

# Create the map: choropleth + bubbles
tm_shape(nc) +
  tm_polygons(fill = "BIR74", fill.scale = tm_scale(values = "brewer.blues"),
              fill.legend = tm_legend(title = "Births in 1974")) +
  tm_symbols(size = "SID74", fill = "red", fill_alpha = 0.5, col = "white",
             size.legend = tm_legend(title = "SID Cases (1974)"))

Scale

Scales control how the values are represented on the map and in the legend, and can have a major impact on how spatial variability is portrayed.

# nz is an `sf` object of New Zealand regions from {spData}
tm_shape(nz) + tm_polygons(fill = "Median_income")
tm_shape(nz) + tm_polygons(fill = "Median_income",
                           fill.scale = tm_scale(breaks = c(0, 30000, 40000, 50000)))
tm_shape(nz) + tm_polygons(fill = "Median_income",
                           fill.scale = tm_scale(n = 10))
tm_shape(nz) + tm_polygons(fill = "Median_income",
                           fill.scale = tm_scale(values = "BuGn"))

Style options for classifying map data

tm_scale_intervals(style = "pretty"):

  • “pretty”: Rounded, evenly spaced breaks (default).
  • “equal”: Equal-width bins; a poor fit for skewed data because most areas can land in one or two bins.
  • “quantile”: Equal count per bin; bins can span very wide value ranges, so check the breaks.
  • “jenks”: Optimizes natural groupings; can be slow with large datasets.
  • “log10_pretty”: Log-scaled breaks; only appropriate for right-skewed, positive values.
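
To see how much the style matters, this sketch compares "equal" and "quantile" breaks on a simulated right-skewed variable. It assumes the {classInt} package, which implements classification styles with these names; the data are made up:

```r
library(classInt)

set.seed(1)
# Hypothetical right-skewed variable (e.g., a rate per county)
x <- rlnorm(100, meanlog = 3, sdlog = 1)

brks_equal    <- classIntervals(x, n = 5, style = "equal")$brks
brks_quantile <- classIntervals(x, n = 5, style = "quantile")$brks

# Observations per bin under each scheme
table(cut(x, brks_equal, include.lowest = TRUE))
table(cut(x, brks_quantile, include.lowest = TRUE))
```

With skewed data, the equal-width table typically piles most observations into the first bin, while the quantile table spreads them evenly at the cost of a very wide top bin.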

Switch to interactive {tmap}

A unique feature of {tmap} is its ability to create static and interactive maps using the same code. Maps can be viewed interactively at any point by switching to view mode with the command tmap_mode("view").

tmap_mode("view")
tm_shape(nc) + 
  tm_polygons(fill = "BIR74", fill.scale = tm_scale(values = "brewer.reds"))

Interactive Mapping with {leaflet} in R

  • {leaflet} is the most widely used interactive mapping package in R.
  • It provides a relatively low-level interface to the Leaflet.js JavaScript library (leafletjs.com).
  • Maps start with leaflet() and use pipeable layers like addTiles(), addCircles(), and addPolygons().
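
The smallest useful version of that pattern might look like the sketch below. The site names and coordinates are hypothetical:

```r
library(leaflet)

# Hypothetical points of interest in Washington, DC (illustrative coordinates)
sites <- data.frame(
  name = c("Site 1", "Site 2"),
  lng = c(-77.0503, -77.0365),
  lat = c(38.8993, 38.8977)
)

m <- leaflet(sites) |>
  addTiles() |>                                  # default OpenStreetMap basemap
  addCircleMarkers(lng = ~lng, lat = ~lat,       # one marker per row
                   popup = ~name, radius = 6) |>
  setView(lng = -77.04, lat = 38.90, zoom = 14)

# Print `m` in RStudio or a rendered HTML document to see the interactive map
```

The `~` formulas refer to columns of the data passed to `leaflet()`, much like `aes()` in ggplot2.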

Example leaflet Map

library(leaflet)
library(spData) # provides cycle_hire (London bike hire points) and lnd (London boroughs)
# cycle_hire is an `sf` object with columns including name, nbikes, geometry
pal <- colorNumeric("RdYlBu", domain = cycle_hire$nbikes)
leaflet(cycle_hire) |>
  addProviderTiles(providers$CartoDB.Positron) |>
  addCircles(color = ~pal(nbikes), opacity = 0.9) |>
  addPolygons(data = lnd, fill = FALSE) |> # London borough outlines
  addLegend(pal = pal, values = ~nbikes) |>
  setView(lng = -0.1, lat = 51.5, zoom = 12) |>
  addMiniMap()
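
The object returned by `colorNumeric()` is itself a function from values to colors, which is why it can be used inside `~pal(nbikes)`. A small sketch with made-up bike counts:

```r
library(leaflet)

# Hypothetical bike counts; `domain` fixes the range the palette covers
nbikes <- c(0, 5, 12, 20, 40)
pal <- colorNumeric(palette = "RdYlBu", domain = range(nbikes))

cols <- pal(nbikes)  # one hex color string per input value
cols
```

Passing the same `pal` to `addLegend()` keeps the legend and the symbols in sync.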

Getting Data with tigris

The {tigris} package provides access to U.S. Census Bureau geographic data. Shapefiles downloaded using {tigris} will be loaded as a simple features (sf) object with geometries.

  • A shapefile is a vector data file format commonly used for geospatial analysis.

  • Shapefiles contain information for spatially describing features (e.g. points, lines, polygons), as well as any associated attribute information.

  • You can find / download shapefiles online (e.g. from the US Census Bureau), or depending on the tools available, access them via packages (like we’re doing today).
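
A "shapefile" is really several files that share a base name. This sketch, assuming the `nc` sample data bundled with {sf}, writes one to a temporary folder and reads it back:

```r
library(sf)

nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

# Write a shapefile to a temporary folder and look at what was created
out_dir <- file.path(tempdir(), "nc_shapefile")
dir.create(out_dir, showWarnings = FALSE)
st_write(nc, file.path(out_dir, "nc.shp"), quiet = TRUE, append = FALSE)

# Sidecar files sharing one base name:
# .shp (geometry), .shx (index), .dbf (attribute table), .prj (CRS)
list.files(out_dir)

# Reading it back returns an sf object with geometry + attributes
nc_again <- st_read(file.path(out_dir, "nc.shp"), quiet = TRUE)
```

When sharing a shapefile by hand, send all of these files together; the `.shp` alone is not enough.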

Getting U.S. County Shapefiles

Entire US

library(tigris)
library(sf)
library(dplyr) # glimpse()
# Use `cb = TRUE` for simplified (cartographic boundary) geometries
counties_us <- counties(state = NULL, cb = TRUE, progress_bar = FALSE)
glimpse(counties_us)
Rows: 3,235
Columns: 13
$ STATEFP    <chr> "01", "01", "01", "10", "01", "01", "04", "05", "05", "05",…
$ COUNTYFP   <chr> "069", "023", "113", "005", "071", "089", "015", "017", "12…
$ COUNTYNS   <chr> "00161560", "00161537", "00161583", "00217269", "00161561",…
$ GEOIDFQ    <chr> "0500000US01069", "0500000US01023", "0500000US01113", "0500…
$ GEOID      <chr> "01069", "01023", "01113", "10005", "01071", "01089", "0401…
$ NAME       <chr> "Houston", "Choctaw", "Russell", "Sussex", "Jackson", "Madi…
$ NAMELSAD   <chr> "Houston County", "Choctaw County", "Russell County", "Suss…
$ STUSPS     <chr> "AL", "AL", "AL", "DE", "AL", "AL", "AZ", "AR", "AR", "AR",…
$ STATE_NAME <chr> "Alabama", "Alabama", "Alabama", "Delaware", "Alabama", "Al…
$ LSAD       <chr> "06", "06", "06", "06", "06", "06", "06", "06", "06", "06",…
$ ALAND      <dbl> 1501742250, 2365900084, 1660653961, 2424590442, 2792044612,…
$ AWATER     <dbl> 4795418, 19114321, 15562947, 674129051, 126334711, 28756353…
$ geometry   <MULTIPOLYGON [°]> MULTIPOLYGON (((-85.71209 3..., MULTIPOLYGON (…

One state

library(tigris)
library(sf)
library(dplyr) # glimpse()
# Use `cb = TRUE` for simplified (cartographic boundary) geometries
counties_md <- counties(state = "Maryland", cb = TRUE, progress_bar = FALSE)
glimpse(counties_md)
Rows: 24
Columns: 13
$ STATEFP    <chr> "24", "24", "24", "24", "24", "24", "24", "24", "24", "24",…
$ COUNTYFP   <chr> "005", "019", "017", "015", "041", "037", "039", "011", "03…
$ COUNTYNS   <chr> "01695314", "00596495", "01676992", "00596115", "00592947",…
$ GEOIDFQ    <chr> "0500000US24005", "0500000US24019", "0500000US24017", "0500…
$ GEOID      <chr> "24005", "24019", "24017", "24015", "24041", "24037", "2403…
$ NAME       <chr> "Baltimore", "Dorchester", "Charles", "Cecil", "Talbot", "S…
$ NAMELSAD   <chr> "Baltimore County", "Dorchester County", "Charles County", …
$ STUSPS     <chr> "MD", "MD", "MD", "MD", "MD", "MD", "MD", "MD", "MD", "MD",…
$ STATE_NAME <chr> "Maryland", "Maryland", "Maryland", "Maryland", "Maryland",…
$ LSAD       <chr> "06", "06", "06", "06", "06", "06", "06", "06", "06", "06",…
$ ALAND      <dbl> 1549740649, 1400573750, 1185745253, 896912542, 695556093, 9…
$ AWATER     <dbl> 215957832, 1145353067, 479451413, 185281256, 539369000, 105…
$ geometry   <MULTIPOLYGON [°]> MULTIPOLYGON (((-76.3257 39..., MULTIPOLYGON (…

Getting Census Data with tidycensus

library(tidycensus)
invisible(
  census_api_key(Sys.getenv("CENSUS_API_KEY"), install = TRUE, overwrite = TRUE)
)
options(tigris_use_cache = TRUE)
income_md <- get_acs(geography = "county", 
  state = "MD",
  variables = "B19013_001", 
  geometry = FALSE,
  show_progress = FALSE)

Lab 5 has a tutorial on getting started with the {tidycensus} package. Follow it carefully. Remember to add your .Renviron to .gitignore so that you do not share your API key.
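
Before calling the Census API, it can help to check that the key is actually visible to R. A minimal sketch, where `has_census_key()` is a hypothetical helper name used only here:

```r
# Returns TRUE when a Census API key is available in the environment.
# `has_census_key` is a hypothetical helper, not part of {tidycensus}.
has_census_key <- function() {
  nzchar(Sys.getenv("CENSUS_API_KEY"))
}

if (!has_census_key()) {
  message("No CENSUS_API_KEY found. Add a line like CENSUS_API_KEY=yourkey ",
          "to your .Renviron file, then restart R.")
}
```

Reading the key from an environment variable keeps it out of your scripts, so the code can be committed safely.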

Plotting Census Data

library(dplyr) # left_join()
tmap_mode("plot")
income_md <- counties_md |> 
  left_join(income_md, by = "GEOID") # join the income data to the county geometries
tm_shape(income_md) + 
  tm_polygons(fill = "estimate", fill.scale = tm_scale(values = "brewer.blues"),
              fill.legend = tm_legend(title = "Median Income"))
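
Joins on GEOID fail silently when keys do not match, for example if one table stores GEOID as a number and loses leading zeros. The sketch below uses toy tables with made-up income values to show how to check for unmatched rows after a join:

```r
library(dplyr)

# Toy stand-ins for the county geometries and the ACS table (hypothetical values)
shapes <- tibble(GEOID = c("24001", "24003", "24005"),
                 NAME  = c("Allegany", "Anne Arundel", "Baltimore"))
acs    <- tibble(GEOID = c("24001", "24003", "24510"),
                 estimate = c(55000, 116000, 58000))

joined <- left_join(shapes, acs, by = "GEOID")

# Counties that found no match end up with NA and would be drawn as "missing"
unmatched <- filter(joined, is.na(estimate))
unmatched

# anti_join() lists the unmatched keys directly
anti_join(shapes, acs, by = "GEOID")
```

Running a check like this before plotting makes it clear whether grey "missing" polygons reflect the data or a join problem.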



Your turn in HW 5:

  • Choose a U.S. state

  • Download county shapefiles with tigris or tidycensus

  • Plot a choropleth using tmap

  • Add labels and legends


Summary

  • Choose the right map for the data and audience
  • Make thoughtful color and projection choices
  • Use tmap for quick static/interactive maps
  • Use leaflet for rich interactivity
  • Access geographic data via tigris and tidycensus

End-of-Class Survey




Fill out the end-of-class survey

~ This is the end of Lecture 5 ~
