Mapping in R

morals

Map making — the art of cartography — is an ancient skill that involves communication, intuition, and an element of creativity. —Robin Lovelace, Jakub Nowosad, Jannes Muenchow, Geocomputation with R

Without Geography, You’re Nowhere

getting data

getting your data

To create maps in R, you’ll need to have geographic data to plot.

That data could come from shapefiles.

A shapefile typically arrives as a zip archive containing several files that share a base name, including at least the .shp, .shx, and .dbf files.
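To see that a shapefile really is a bundle of files, here is a minimal sketch that writes a tiny, made-up two-point dataset to a temporary folder and lists what lands on disk. The points, the folder, and the file name `pts.shp` are illustrative assumptions, not data from these slides.

```r
library(sf)

# Hypothetical two-point dataset, used only to produce a shapefile on disk
pts <- st_as_sf(
  data.frame(name = c("a", "b"), lon = c(-71.06, -74.01), lat = c(42.36, 40.71)),
  coords = c("lon", "lat"),
  crs = 4326
)

# A fresh temporary folder, so nothing on disk is overwritten
shp_dir <- tempfile("example_shapefile_")
dir.create(shp_dir)
st_write(pts, file.path(shp_dir, "pts.shp"), quiet = TRUE)

# One "shapefile" is several sidecar files that share a base name
shp_files <- list.files(shp_dir)
print(shp_files)

# read_sf() is pointed at the .shp; the sidecar files are read along with it
pts_back <- read_sf(file.path(shp_dir, "pts.shp"))
```

If any of the .shx or .dbf files is missing after unzipping a download, reading the .shp will usually fail or lose its attributes, so it is worth keeping the files together in one folder.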

Geographic data may also come to you as GeoJSON, which looks something like this:

{
  "type": "Feature",
  "geometry": {
    "type": "Point",
    "coordinates": [125.6, 10.1]
  },
  "properties": {
    "name": "Dinagat Islands"
  }
}
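As a rough sketch of how a GeoJSON feature like the one above ends up in R, the snippet below saves the same text to a temporary file and reads it with sf. The temporary file and the object name `island` are assumptions made for the example.

```r
library(sf)

# The same feature as above, stored as a character string
geojson_txt <- '{
  "type": "Feature",
  "geometry": {"type": "Point", "coordinates": [125.6, 10.1]},
  "properties": {"name": "Dinagat Islands"}
}'

# Write it to a temporary .geojson file and read it back as an sf object
geojson_path <- tempfile(fileext = ".geojson")
writeLines(geojson_txt, geojson_path)
island <- read_sf(geojson_path)

island$name              # the "properties" become ordinary columns
st_coordinates(island)   # the "geometry" becomes the geometry column
```

The result is a regular sf data frame, so it can go straight into geom_sf() like any other layer.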

shapefiles from administrative websites

Often you can find shapefiles on administrative websites. For example, the Massachusetts government has MassGIS: https://www.mass.gov/info-details/massgis-data-layers

Other sources could include:

  • the EPA
  • the Census
  • USGS
  • NASA
  • ad-hoc sources

tigris

tigris is an R package that lets you download TIGER/Line shapefiles from the US Census Bureau and use them directly in R.

install.packages('tigris')

library(tigris)
library(ggplot2)

# download all roads in New York County (Manhattan), New York State
manhattan_roads <- roads("NY", "New York")

ggplot(manhattan_roads) + 
  geom_sf() + 
  theme_void()

[Figure: map of Manhattan roads]

sf

library(tidyverse)
library(sf)
library(here) # provides here(), which builds paths relative to the project root

# you might have some data in a shapefile -- 
# data/shapefile_unzipped/
#  -- cb_2013_us_county_20m.dbf
#  -- cb_2013_us_county_20m.prj
#  -- cb_2013_us_county_20m.shp
#  -- cb_2013_us_county_20m.shp.iso.xml
#  -- cb_2013_us_county_20m.shp.xml
#  -- cb_2013_us_county_20m.shx
#  -- county_20m.ea.iso.xml

counties <- read_sf(here("data/shapefile_unzipped"))

sf

The payoff is that you can then plot the data with geom_sf() in ggplot2.

ggplot(counties) + 
  geom_sf()

fixing a problem

# reproject to EPSG:5070 (NAD83 / Conus Albers), an equal-area projection for the contiguous US
counties <- st_transform(counties, crs = "EPSG:5070")

ggplot(counties) + 
  geom_sf()
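To check what st_transform() changes, a small sketch with one made-up point can help. The point (roughly Boston) and the object names are assumptions for illustration and are not part of the county data.

```r
library(sf)

# A single hypothetical point in longitude/latitude (WGS 84)
boston <- st_sf(
  name = "Boston",
  geometry = st_sfc(st_point(c(-71.06, 42.36)), crs = "EPSG:4326")
)

st_is_longlat(boston)         # TRUE: coordinates are in degrees
st_coordinates(boston)

# Reproject to the same CRS used for the counties above
boston_albers <- st_transform(boston, crs = "EPSG:5070")

st_is_longlat(boston_albers)  # FALSE: coordinates are now projected
st_coordinates(boston_albers)
```

Calling st_crs() on any sf object before plotting is a quick way to find out whether it is still in longitude/latitude.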

or even better

library(tigris)

# by default, shift_geometry() moves and rescales Alaska, Hawaii, and Puerto Rico
# so that they sit next to the contiguous US
counties <- tigris::shift_geometry(counties) 

ggplot(counties) + 
  geom_sf()

# let's get a more interesting dataset
# (get_acs() needs a Census API key; see tidycensus::census_api_key())
library(tidycensus)

popsize_by_counties <- tidycensus::get_acs(
  year = 2020,
  geography = 'county',
  variables = "B01001_001", # total population size
  geometry = TRUE
)

popsize_by_counties <- tigris::shift_geometry(popsize_by_counties)
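# simplify the boundaries to speed up plotting; dTolerance is in CRS units
# (metres after shift_geometry())
popsize_by_counties <- st_simplify(popsize_by_counties, dTolerance = 500)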

ggplot(popsize_by_counties, aes(fill = estimate)) + 
  geom_sf()
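The effect of st_simplify() is easier to see on a small dataset. The sketch below uses the North Carolina counties that ship with sf (not the ACS data above) and a helper, `count_vertices()`, that is defined here purely for illustration.

```r
library(sf)

# North Carolina counties, bundled with the sf package
nc <- read_sf(system.file("shape/nc.shp", package = "sf"))

# Project to a CRS measured in metres so that dTolerance is in metres too
nc_proj <- st_transform(nc, crs = "EPSG:32119")  # NAD83 / North Carolina

# Hypothetical helper: total number of boundary vertices in an sf object
count_vertices <- function(x) nrow(st_coordinates(x))

# Drop vertices that deviate less than about 5 km from the simplified line
nc_simple <- st_simplify(nc_proj, dTolerance = 5000)

count_vertices(nc_proj)
count_vertices(nc_simple)
```

Fewer vertices means faster drawing, at the cost of coarser borders. Because each polygon is simplified on its own, large tolerances can leave small gaps or overlaps between neighbouring counties.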

keep improving

ggplot(popsize_by_counties, aes(fill = estimate)) + 
  geom_sf() + 
  scale_fill_continuous(trans = scales::log10_trans(),
                        labels = scales::comma_format()) + 
  ggtitle("Population Size by County")

keep improving

ggplot(popsize_by_counties, aes(fill = estimate)) + 
  geom_sf(color = 'white', linewidth = 0.01) + 
  scale_fill_distiller(trans = scales::log10_trans(),
                       labels = scales::comma_format(),
                       direction = 1) + 
  ggtitle("Population Size by County")
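The same recipe (a log-scaled fill, thin white borders, and a ColorBrewer palette) can be tried without a Census API key. This sketch swaps in the North Carolina counties bundled with sf and fills by the `BIR74` column (births, 1974-78); the dataset and title are stand-ins chosen for the example.

```r
library(sf)
library(ggplot2)

# North Carolina counties, bundled with the sf package
nc <- read_sf(system.file("shape/nc.shp", package = "sf"))

p <- ggplot(nc, aes(fill = BIR74)) +
  # thin white borders keep small counties from being swallowed by outlines
  geom_sf(color = "white", linewidth = 0.1) +
  # log scale spreads out counts that span several orders of magnitude
  scale_fill_distiller(trans = "log10",
                       labels = scales::comma_format(),
                       direction = 1) +
  ggtitle("Births by County, North Carolina, 1974-78")

p
```

On a linear scale a few very large counties would take up most of the colour range. The log scale makes differences among the smaller counties visible.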

mapview

library(mapview)
mapview(popsize_by_counties, zcol = 'estimate')

terra

With the terra package, you can work with raster data.

library(terra)
# install.packages('spDataLarge', repos='https://nowosad.github.io/drat/', type='source')
library(spDataLarge)

topography_raster <- rast(system.file("raster/srtm.tif", package = "spDataLarge"))

plot(topography_raster)
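For readers without spDataLarge installed, here is a minimal sketch of a few basic raster operations on a made-up grid. The 10 x 10 raster and its values are assumptions that stand in for a real elevation file.

```r
library(terra)

# Hypothetical 10 x 10 grid standing in for a real elevation raster
r <- rast(nrows = 10, ncols = 10, xmin = 0, xmax = 10, ymin = 0, ymax = 10)
values(r) <- 1:100

dim(r)             # rows, columns, layers
res(r)             # cell size in x and y
global(r, "mean")  # summary statistic over all cells

# aggregate() merges blocks of cells (here 2 x 2) into coarser cells
r_coarse <- aggregate(r, fact = 2, fun = "mean")
dim(r_coarse)

plot(r_coarse)
```

The same functions work on a raster loaded from a file with rast(), such as the topography raster above.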

takeaways

  • working with spatial data is something you can do in R
  • more complicated geospatial operations are possible, but they are more advanced and outside the scope of this course
  • creating maps with mapview or geom_sf is relatively straightforward

references