= n_obs_by_year_cont %>%
obs_by_year_cont_wider pivot_wider(
# get the names for the new columns from the continent column
names_from = continent,
# get the values for the new columns from the n.obs column
values_from = n.obs
)
Homework 5
Instructions
Homework 5 has been posted on Canvas, and just as with Homework 4, you can download a Quarto document from the Homework 5 Canvas assignment that has the homework instructions/problems and spaces for you to add/modify text and code. You’ll upload that qmd file and its rendered html file as your homework submission. I’ve also posted the instructions here, and if you prefer, you can create a new Quarto document from scratch and answer these problems in it instead of the one provided on the Homework 5 Canvas assignment. In that case, you will submit that qmd file and the html file you get when you render it.
- Modify this document to complete your homework for Homework 5.
- Post questions on Ed Discussion, discuss with each other, and get as far as you can on these problems.
- If by the time the homework is due you have worked on a problem but did’t fully complete it, feel free to leave your work so far and write about where you’re stuck or what is going wrong or what your question is, and your peer evaluator might have some input.
- Feel free to change the format of this document in ways that you think make the output look nicer or easier to read, but keep the general structure (sections, questions, etc) so that your peer reviewer knows which part is which.
Section A: Pivoting with the gapminder
dataset
- We saw in lecture how to pivot
n_obs_by_year_cont
wider:
Write code to pivot obs_by_year_cont_wider
longer and call the resulting table obs_by_year_cont_longer
. You should find that obs_by_year_cont_longer
is the same as n_obs_by_year_cont
. Use kable()
to display the two tables neatly in your rendered html file.
# your code here
- Following the example from class using
%in%
, create a subset of thegapminder
dataset that has just the data for Argentina, El Salvador, and Uruguay before 1990, and give this dataset an appropriate name. Then reformat the dataset so that each row is a country, each column is a year, and each cell of the table is the per-capita GDP. (If you would like a little extra challenge, make each cell of the table the total GDP by multiplying population and per-capita GDP.) Display the table in the rendered html file usingkable()
.
# your code here
- For extra practice, work your way through some of the examples here for both pivoting longer and pivoting wider. These examples are designed to help you gradually learn about additional options you can set during pivoting and how to handle problems of varying complexity. I recommend trying the first couple examples in the pivoting longer section and the
tidycensus
example in the pivoting wider section. For each example you check out, I recommend first reading the section about it and the code that goes with it, then try it out yourself. For many of them you don’t have to install any new packages, but some liketidycensus
will require that you install a package (such astidycensus
) first in order to access the dataset they use. Then explain (here, in text, below this problem and before Section B) one or two things you learned.
Section B: Plots
In this section, the code chunks are not created for you. Create a new code chunk below each problem and before the next to put your code for that problem. If the problem asks you questions that you should answer in words instead of code, write your answers in text below the problem rather than as comments in the code.
Plot per-capita GDP versus life expectancy for Colombia as a line, points, or both. Give the plot appropriate axis labels and a title. Try out a few themes and use a theme you like. Set the color and transparency of the line/points.
Plot per-capita GDP over time for Argentina, El Salvador, and Uruguay before 1990. Make the plot three different ways: first plot all three countries on the same plot, then use
facet_wrap()
to plot each country in its own subplot, then usefacet_grid()
. Comment on which plot settings you need for each plot, if there are any that differ between the two plots. Also comment on what happens if you omit thefacet_wrap
argumentscales = "free"
.Combine pivoting and
facet_grid()
to make a plot with 9 subplots in a 3x3 grid, where the columns are country (Japan, China, or South Korea), the rows are variables (life expectancy, population, or per-capita GDP), and the x-axis is time. Usescales = "free_y"
as an argument tofacet_grid()
. Your plot should look something like this:
Take your time with this. Break the problem down into steps. Identify at least one step you’ll have to do, and then work backwards or forwards from there: What do I need to do in order to be able to do that step? What will I do once I have that step?
If you have time, here are further extensions you can try with this plot:
- Improve the axis labels, theme, and/or other features.
- Scale the variables so that the y-axis scales are nicer or more easily readable.
- What could you do to get the variables to be labeld in the plot as “Population”, “LifeExpectancy”, and “GDPperCapita”? (If you want, you can go further and try to make them “Population”, “Life Expectancy”, and “GDP per Capita” and even add units.)
- Try using
scales = "free"
infacet_grid
. Does it do what you expect? Try replacingfacet_grid(..., scales = "free")
withggh4x::facet_grid2(..., scales = "free_y", independent = "y")
, i.e. use thefacet_grid2()
function from theggh4x
package, and see how you like it.