Good Viz/Bad Viz

To the Slack!

Data Wrangling

ggplot expects tidy data, data that is structured such that

  • Each variable has its own column
  • Each observation has its own row
  • Each value has its own cell
Wickham and Grolemund Ch 12

Pivot

pivot_longer: Convert wide data to long, or move variable values out of the column names and into the cells.

pivot_longer(df, cols = -country, names_to = "year", values_to = "cases")
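As a minimal sketch of the call above, assume a small wide table with one column per year. The object names `wide_cases` and `long_cases` and the values are made up for illustration.

```r
library(tidyr)
library(tibble)

# Hypothetical wide table: the year values live in the column names
wide_cases <- tibble(
  country = c("Afghanistan", "Brazil"),
  `1999`  = c(745, 37737),
  `2000`  = c(2666, 80488)
)

# Move the years out of the column names and into a "year" column;
# the cell values go into a "cases" column
long_cases <- pivot_longer(
  wide_cases,
  cols      = -country,
  names_to  = "year",
  values_to = "cases"
)

long_cases
```

Each country/year pair now has its own row. Note that `year` comes out as a character column because it was built from column names.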

pivot_wider: Convert long data to wide, or move variable names out of the cells and into the column names.

pivot_wider(df, id_cols = country, names_from = type, values_from = count)
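A matching sketch for the call above, assuming a long table in which a `type` column holds variable names and a `count` column holds their values. The object names `long_counts` and `wide_counts` and the values are illustrative only.

```r
library(tidyr)
library(tibble)

# Hypothetical long table: "type" holds variable names, "count" holds values
long_counts <- tibble(
  country = c("Afghanistan", "Afghanistan", "Brazil", "Brazil"),
  type    = c("cases", "population", "cases", "population"),
  count   = c(745, 19987071, 37737, 172006362)
)

# Move the variable names out of the cells and into the column names
wide_counts <- pivot_wider(
  long_counts,
  id_cols     = country,
  names_from  = type,
  values_from = count
)

wide_counts
```

The result has one row per country, with `cases` and `population` as separate columns.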

Maps

Projections and CRS

Projection: “The challenge in map-making is that we need to take the spherical surface of the earth and flatten it out so we can display it on a map.”

  • EPSG (European Petroleum Survey Group) and ESRI (Environmental Systems Research Institute) maintain registries of projections. Several sites provide convenient access to registered projections, including http://spatialreference.org/ and https://epsg.io/.

  • Coordinates can only be placed on the Earth’s surface when their coordinate reference system (CRS) is known. The CRS includes both the projection and extent of the map. By default, ggplot will use the coordinate system of the data.

In R, sf

  • We can use coord_sf to change the CRS when plotting
  • Or use st_transform to convert data frames to a different CRS
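The two options above can be sketched with the North Carolina shapefile that ships with the sf package. The target CRS (EPSG:32119, a North Carolina state plane system) is an assumption chosen only for illustration; any registered CRS that suits your data would work the same way.

```r
library(sf)
library(ggplot2)

# Example data bundled with sf; its coordinates are longitude/latitude
nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

# Option 1: leave the data alone and reproject only when plotting
p <- ggplot(nc) +
  geom_sf() +
  coord_sf(crs = st_crs(32119))

# Option 2: convert the data itself to the new CRS
nc_planar <- st_transform(nc, crs = 32119)

st_crs(nc_planar)$epsg
```

With option 1 the stored data stays in its original CRS. With option 2, later calculations such as areas or distances are done in the projected coordinates.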

Simple Features

SF represents simple features as records in a data.frame or tibble with a geometry list-column. To learn more: r-spatial.github.io/sf
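To see the geometry list-column in practice, here is a minimal sketch that builds an sf object from an ordinary data frame. The site names and coordinates are hypothetical, and EPSG:4326 (WGS 84 longitude/latitude) is assumed as the CRS.

```r
library(sf)

# Hypothetical points with approximate longitude/latitude values
sites <- data.frame(
  name = c("Site A", "Site B"),
  lon  = c(-78.5, -77.4),
  lat  = c(38.0, 37.5)
)

# Convert to sf: the lon/lat columns are replaced by a geometry list-column
sites_sf <- st_as_sf(sites, coords = c("lon", "lat"), crs = 4326)

class(sites_sf)
st_geometry_type(sites_sf)
```

The result is still a data frame, so the usual data-wrangling tools apply, but each row now carries its own geometry.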

“Simple features” refers to the way geometry is represented. The geometry types include:

  • POINT
  • LINESTRING
  • POLYGON
  • MULTIPOINT
  • MULTILINESTRING
  • MULTIPOLYGON
  • GEOMETRYCOLLECTION

Some notes on installation

Bonus Material: Leaflet

There’s a good chance we won’t get to this, but Leaflet is one of the most popular open-source JavaScript libraries for interactive maps. RStudio has created a package to integrate and generate Leaflet maps in R.

Today’s script has some examples meant to walk you through a few key bits.
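As a rough idea of what a Leaflet map looks like in R, here is a minimal sketch using the leaflet package. It is not taken from today’s script, and the marker location and popup text are hypothetical.

```r
library(leaflet)

# Start a map widget, add the default OpenStreetMap tiles, then one marker
m <- leaflet() %>%
  addTiles() %>%
  addMarkers(lng = -78.5, lat = 38.0, popup = "A hypothetical point of interest")

m  # printing the widget displays the interactive map
```

In a knitted html document the map stays interactive, so readers can pan, zoom, and click the marker.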

Projects!

The final projects should be created in R markdown files so that the knitted version can be posted on the class website.

Details

This is a chance to show off some new tools and capacities to make truthful, functional, beautiful, and insightful narrative visualizations (a la Cairo) that enact sound data visualization principles (a la Schwabish).

The goal, then, is to tell a story with the data you’re using, so the R markdown (and knitted html) file should introduce, frame, and describe your visual story. You can use as many figures as you want, but there must be at least three different figures (of three different types – not just three scatterplots or three maps). The narrative should provide some background outlining your questions and explaining the data you are using (including providing a citation and source for the data), then provide an explanation and interpretation of each figure (the code should also be available, but hidden with code folding).

Things we should be looking for (including in providing feedback to one another next week): do we understand the visuals/story, is it inadvertently misleading, is it aesthetically pleasing, are there simple things you could do to make it nicer (flip it, alter text size, choose better colors, add non-default labels, annotate, map another feature, pair figures for easier comparison, etc.), does it teach us anything, and (ultimately, but perhaps not by next week) do you provide enough narrative and explanation that someone unfamiliar with your topic could read the file and follow, possibly even learn something?

What you’ll turn in for the final project:

  • A compressed/zipped folder that contains
    • the data as you originally acquired it
    • any R script you wrote to prepare the data for analysis, and the resulting cleaned data (if appropriate)
    • the R markdown file that constitutes the primary deliverable
    • a knitted (.html) version of the R markdown file

Check out last year’s projects!

More R Markdown

R Markdown file options

  • A YAML header (yet-another-markup-language), offset by ---
    • Title/subtitle, author, date
    • (Floating) Table of contents and code folding
---
title: "An incredible project"
subtitle: "Of great significance"
author: "Don't you know who I am?"
date: "2022-09-14"
output:
  html_document:
    toc: true
    toc_depth: 4
    toc_float: true
    code_folding: hide
---
  • A look at R Markdown themes

  • R code chunk options (insert a new chunk with ctrl-alt-i)

    • name (to identify code chunk)
    • echo = TRUE (show code in document)
    • message = FALSE, warning = FALSE (prevent messages, warnings from appearing)
    • fig.height = X, fig.width = X, fig.cap = "caption below the figure" (control size of generated figure, presence of figure caption)
    • for example
```{r codechunkname, warning = FALSE, fig.height = 10}
# add code here
```
  • knitr::opts_chunk$set() (set global options, usually in first code chunk)
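A minimal sketch of what the body of that first (setup) chunk might contain. The particular option values are assumptions for illustration, not course requirements.

```r
library(knitr)

# Global defaults applied to every later chunk unless that chunk overrides them
opts_chunk$set(
  echo       = TRUE,   # show code (the YAML code_folding option can still hide it)
  message    = FALSE,  # suppress package start-up and other messages
  warning    = FALSE,  # suppress warnings in the knitted output
  fig.width  = 8,      # default figure width in inches
  fig.height = 5       # default figure height in inches
)
```

An individual chunk can still override these defaults, for example by setting `fig.height = 10` in its own header.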

XKCD Inspiration

XKCD, Randall Munroe, https://xkcd.com/1138/