Data Viz in R: Week 4
To the Slack!
ggplot expects tidy data: data structured such that each variable is a column, each observation is a row, and each value is a cell.
pivot_longer: Convert wide data to long, or move variable values out of the column names and into the cells.
pivot_longer(df, cols = -country, names_to = "year", values_to = "cases")
pivot_wider: Convert long data to wide, or move variable names out of the cells and into the column names.
pivot_wider(df, id_cols = country, names_from = type, values_from = count)
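To see the two functions undo each other, here is a minimal sketch of a round trip. The `wide` data frame and its values are made up for illustration; only `pivot_longer()` and `pivot_wider()` come from tidyr.

```r
library(tidyr)

# Hypothetical wide table: one row per country, one column per year
wide <- data.frame(
  country = c("A", "B"),
  `1999` = c(10, 20),
  `2000` = c(15, 25),
  check.names = FALSE  # keep the numeric column names as-is
)

# Wide to long: the year column names become values in a "year" column
long <- pivot_longer(wide, cols = -country, names_to = "year", values_to = "cases")

# Long to wide: the values in "year" become column names again
wide_again <- pivot_wider(long, id_cols = country, names_from = year, values_from = cases)
```

The long version is the one ggplot wants, because `year` and `cases` can each be mapped to an aesthetic.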
Projection: “The challenge in map-making is that we need to take the spherical surface of the earth and flatten it out so we can display it on a map.”
EPSG (European Petroleum Survey Group) and ESRI (Environmental Systems Research Institute) maintain registries of projections. Several sites provide convenient access to registered projections, including http://spatialreference.org/ and https://epsg.io/.
Coordinates can only be placed on the Earth’s surface when their coordinate reference system (CRS) is known. The CRS includes both the projection and extent of the map. ggplot will use the coordinate system of the data by default.
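As a small illustration of why the CRS has to be known, the sketch below builds a point with no CRS and then declares one. The coordinates are arbitrary, and EPSG:4326 (WGS 84 longitude/latitude) is an assumption about what the numbers mean.

```r
library(sf)

# A bare coordinate pair: sf stores it, but has no idea where it is on Earth
pt <- st_sfc(st_point(c(-122.33, 47.61)))
had_crs <- !is.na(st_crs(pt))  # FALSE: no CRS attached yet

# Declare (not transform) the CRS, assuming these are WGS 84 lon/lat values
st_crs(pt) <- 4326
st_crs(pt)$epsg
```

Note that assigning with `st_crs<-` only labels the coordinates; it does not move them.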
In R, sf
coord_sf to change the CRS when plotting
st_transform to convert data frames to a different CRS
sf represents simple features as records in a data.frame or tibble with a geometry list-column. To learn more: r-spatial.github.io/sf
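The two approaches can be sketched side by side. This uses the North Carolina counties shapefile that ships with sf; the target CRS (EPSG:32119, NAD83 / North Carolina) is just one reasonable choice for this example, not a requirement.

```r
library(sf)
library(ggplot2)

# Example data shipped with the sf package: North Carolina counties
nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

# Option 1: convert the data itself to another CRS
nc_proj <- st_transform(nc, crs = 32119)

# Option 2: leave the data alone and reproject only for this plot
p <- ggplot(nc) +
  geom_sf() +
  coord_sf(crs = st_crs(32119))
```

`st_transform()` is the better fit when later steps (distances, areas, joins) need the new CRS; `coord_sf()` is enough when only the picture should change.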
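The "simple features" part refers to the way geometry is represented. The geometry types include POINT, LINESTRING, POLYGON, MULTIPOINT, MULTILINESTRING, MULTIPOLYGON, and GEOMETRYCOLLECTION.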
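A rough sketch of how a few geometry types look in practice, built by hand with made-up coordinates. The `name` column is hypothetical; the geometry list-column is what makes the data frame an sf object.

```r
library(sf)

# Three hand-made geometries (coordinates are arbitrary)
pt   <- st_point(c(0, 0))
line <- st_linestring(rbind(c(0, 0), c(1, 1)))
poly <- st_polygon(list(rbind(c(0, 0), c(1, 0), c(1, 1), c(0, 0))))  # ring must close

# Bundle them into a geometry list-column and attach it to a data frame
geoms  <- st_sfc(pt, line, poly)
shapes <- st_sf(name = c("a point", "a line", "a polygon"), geometry = geoms)

# One geometry type per row
types <- as.character(st_geometry_type(shapes))
```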
Some notes on installation
There’s a good chance we won’t get to this, but Leaflet is one of the most popular open-source JavaScript libraries for interactive maps. RStudio has created a package to integrate and generate Leaflet maps in R.
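In case it is useful to see before class, a minimal sketch of a Leaflet map from R might look like this. The marker location and popup text are placeholders, and the default OpenStreetMap tiles need an internet connection to draw.

```r
library(leaflet)

# Build the map widget: base tiles plus a single marker
m <- leaflet() %>%
  addTiles() %>%  # default OpenStreetMap tiles
  addMarkers(lng = -122.33, lat = 47.61, popup = "A hypothetical point of interest")

# Print `m` at the console or in an R Markdown chunk to display the map
```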
Today’s script has some examples meant to walk you through a few key bits.
The final projects should be created in R markdown files so that the knitted version can be posted on the class website.
This is a chance to show off some new tools and capacities to make truthful, functional, beautiful, and insightful narrative visualizations (à la Cairo) that enact sound data visualization principles (à la Schwabish).
The goal, then, is to tell a story with the data you’re using, so the R markdown (and knitted html) file should introduce, frame, and describe your visual story. You can use as many figures as you want, but there must be at least three different figures (of three different types – not just three scatterplots or three maps). The narrative should provide some background outlining your questions and explaining the data you are using (including providing a citation and source for the data), then provide an explanation and interpretation of each figure (the code should also be available, but hidden with code folding).
Things we should be looking for (including in providing feedback to one another next week): do we understand the visuals/story, is it inadvertently misleading, is it aesthetically pleasing, are there simple things you could do to make it nicer (flip it, alter text size, choose better colors, add non-default labels, annotate, map another feature, pair figures for easier comparison, etc.), does it teach us anything, and (ultimately, but perhaps not by next week) do you provide enough narrative and explanation that someone unfamiliar with your topic could read the file and follow, possibly even learn something?
What you’ll turn in for the final project:
R Markdown file options
---
title: "An incredible project"
subtitle: "Of great significance"
author: "Don't you know who I am?"
date: "2022-09-14"
output:
  html_document:
    toc: true
    toc_depth: 4
    toc_float: true
    code_folding: hide
---
A look at R Markdown themes
R code chunk options (ctrl-alt-i)
name (to identify code chunk)
echo = TRUE (show code in document)
message = FALSE, warning = FALSE (prevent messages, warnings from appearing)
fig.height = X, fig.width = X, fig.cap = "caption below the figure" (control size of generated figure, presence of figure caption)
```{r codechunkname, warning = FALSE, fig.height = 10}
# add code here
```
knitr::opts_chunk$set() (set global options, usually in first code chunk)
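For example, a setup chunk might set the options listed above once for the whole document. This is only a sketch: the figure dimensions are arbitrary values chosen for illustration, and any individual chunk can still override them in its own header.

```r
# Typically placed in the first chunk of the .Rmd file
knitr::opts_chunk$set(
  echo = TRUE,      # show code (code_folding: hide can still collapse it)
  message = FALSE,  # suppress package start-up and other messages
  warning = FALSE,  # suppress warnings in the knitted output
  fig.height = 5,   # default figure height, in inches
  fig.width = 7     # default figure width, in inches
)
```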