Next Steps
CS&SS 508 • Lecture 10
Jess Kunke (slides adapted from Victoria Sass)
You’ve already learned SO MUCH in this class:
But if grad school teaches us anything, it’s that the more we learn, the more we realize how much more there is to learn. 🥴😑🫠
This can be freeing! And even fun?! Let your curiosity run wild!
Today, we’ll look at some of the ways you can extend your learning beyond the scope of this introductory course.
tidymodels
tidymodels packages
tidymodels approach
tidymodels framework
tidymodels resources
tidymodels has extensive documentation on all core tidymodels packages as well as related ones.
ggmap
There are numerous ways to work with spatial data in R, but you can use ggmap to visualize spatial data in a tidyverse framework.
library(tidyverse)  # filter(), between(), mutate(), and the forcats helpers
# This R script simply saves my API key and registers it in the current session,
# so the key is not included in my code that's accessible on GitHub.
source("stadia_api_key.R")
library(ggmap)
# Helper operator: the negation of %in%
`%notin%` <- function(lhs, rhs) !(lhs %in% rhs)
# crime is a built-in dataset in the ggmap package
violent_crimes <- crime |>
  filter(offense %notin% c("auto theft", "theft", "burglary"),
         between(lon, -95.39681, -95.34188),
         between(lat, 29.73631, 29.78400)) |>
  mutate(offense = fct_drop(offense),
         offense = fct_relevel(offense,
                               c("robbery", "aggravated assault", "rape", "murder")))
Once we have data we want to visualize, we can call ggmap to draw the spatial area and layer on any geoms/stats as you would with ggplot2.
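# Bounding box around the filtered crime locations
bbox <- make_bbox(lon, lat, data = violent_crimes)
# Download the basemap tiles for that bounding box
map <- get_stadiamap(bbox = bbox,
                     maptype = "stamen_toner_lite",
                     zoom = 14)
# Draw the basemap and add the crime locations as a point layer
ggmap(map) +
  geom_point(data = violent_crimes,
             color = "red")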
One thing to note about ggmap is that (1) you need to specify the data argument in each layer and (2) the spatial aesthetics x and y are set to lon and lat, respectively. (If they're named something different in your dataset, just put mapping = aes(x = longitude, y = latitude) in the layer, for example.)
With ggmap you're working with ggplot2, so you can add in other kinds of layers, use patchwork, etc. All the ggplot2 geoms are available.
library(patchwork)
library(ggdensity)
library(geomtextpath)
robberies <- violent_crimes |> filter(offense == "robbery")
points_map <- ggmap(map) + geom_point(data = robberies, color = "red")
hdr_map <- ggmap(map) +
  geom_hdr(aes(lon, lat, fill = after_stat(probs)),
           data = robberies,
           alpha = .5) +
  geom_labeldensity2d(aes(lon, lat, level = after_stat(probs)),
                      data = robberies,
                      stat = "hdr_lines",
                      size = 3, boxcolour = NA) +
  scale_fill_brewer(palette = "YlOrRd") +
  theme(legend.position = "none")
(points_map + hdr_map) &
  theme(axis.title = element_blank(),
        axis.text = element_blank(),
        axis.ticks = element_blank())
library(tidyverse)  # ggplot2 for plotting, dplyr for filter()
library(ggExtra)
data(mpg, package = "ggplot2")
mpg_select <- mpg |>
  filter(hwy >= 35 & cty > 27)
g <- ggplot(mpg, aes(cty, hwy)) +
  geom_count() +
  geom_smooth(method = "lm", se = FALSE) +
  theme_bw()
# Add marginal histograms to the scatterplot
ggMarginal(g, type = "histogram", fill = "transparent")
# Same plot g, with marginal boxplots
ggMarginal(g, type = "boxplot", fill = "transparent")
# Same plot g, with marginal density curves
ggMarginal(g, type = "density", fill = "transparent")
library(ggplot2)
library(gapminder)
library(gganimate)
library(gifski)  # renders the gganimate output
ggplot(gapminder, aes(gdpPercap, lifeExp, size = pop, colour = country)) +
  geom_point(alpha = 0.7, show.legend = FALSE) +
  scale_colour_manual(values = country_colors) +
  scale_size(range = c(2, 12)) +
  scale_x_log10() +
  facet_wrap(~continent) +
  # gganimate-specific code starts here
  labs(title = 'Year: {frame_time}', x = 'GDP per capita', y = 'life expectancy') +
  transition_time(year) +
  ease_aes('linear')
ggplot2: Elegant Graphics for Data Analysis (Second Edition) is available through the UW Library, and the forthcoming Third Edition is being written as we speak and will be available online soon!
... R and gain more understanding about best practices for conveying your data and findings effectively.
The sf package also works within the tidyverse framework and pairs very nicely with census data, which can be easily accessed using tidycensus.
mapview and tmap packages (among others) for interactive and beautiful maps!
cowplot for side-by-side maps with a shared legend, latex2exp for plot label formatting, egg for ggarrange
Shiny is an open source R package that provides an elegant and powerful web framework for building web applications using R. Shiny helps you turn your analyses into interactive web applications without requiring HTML, CSS, or JavaScript knowledge.
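To make the Shiny description concrete, here is a minimal sketch of an app. It is not from the original slides: the built-in faithful dataset, the input name bins, and the output name hist are all chosen just for illustration.

```r
library(shiny)

# UI: a slider for the number of histogram bins and a placeholder for the plot
ui <- fluidPage(
  titlePanel("Old Faithful eruptions"),
  sliderInput("bins", "Number of bins:", min = 5, max = 50, value = 20),
  plotOutput("hist")
)

# Server: redraws the histogram whenever the slider value changes
server <- function(input, output) {
  output$hist <- renderPlot({
    x <- faithful$eruptions  # built-in dataset, used here only as an example
    hist(x,
         breaks = seq(min(x), max(x), length.out = input$bins + 1),
         main = NULL, xlab = "Eruption duration (minutes)")
  })
}

# Bundle the UI and server into an app object
app <- shinyApp(ui = ui, server = server)

# In an interactive session, launch it with:
# runApp(app)
```

The whole app is R code: the page layout comes from functions like fluidPage(), and the server function reruns renderPlot() whenever input$bins changes.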
Version control allows you to work individually and/or collaboratively in a highly structured, documented way.
It’s basically like a robust save program for your project. You track and log changes you make over time and the version control system allows you to review or even restore earlier versions of your project.
Originally meant for software developers, git has been adopted by computational social scientists not only to manage source code but also to keep track of the whole collection of files that make up a research project.
The beepr package works on Mac, Windows, and Linux:
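A minimal sketch of how beepr might be used, assuming you want an audible notification when a slow computation finishes. The computation itself is a made-up stand-in.

```r
library(beepr)

# Stand-in for a long-running computation (hypothetical example)
result <- sum(sqrt(seq_len(1e7)))

# Play the default notification sound once the line above has finished
beep()

# beep() also accepts a sound by number or by name, e.g. "fanfare"
beep("fanfare")
```

Putting beep() at the end of a long script lets you step away from the screen and still know when it is done.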
Here are a few other things I’ve found handy in my work:
igraph (also available for Python and other languages) and statnet for working with networks
tictoc, rbenchmark, microbenchmark, profvis, and more for timing your code and identifying bottlenecks
Matrix for efficient sparse matrix operations
CSSCR (The Center for Social Science Computation and Research) is a resource center for the social science departments at the University of Washington.
As you continue to learn R, feel free to drop by with any/all of your R coding questions.
Thanks for spending so much time this quarter learning with me 😎
Don’t forget to fill out the course evaluation that you received via email!