Day 3: Data visualization with ggplot2 package

Qingyin Cai

Department of Applied Economics
University of Minnesota

Learning Objectives

Learn the basic operations of the ggplot2 package to create figures.
You will be able to create:
- scatter plot
- line plot
- bar plot
- histogram
- box plot
- density plot
- facet plot

Reference

Taste of ggplot2 package

By the end of the lecture, you will be able to create figures like the following examples using the ggplot2 package.

Example 1
Example 2
Example 3
Example 4
Example 5

There are many functions in the ggplot2 package to create figures, and today’s lecture is not a comprehensive guide to all of them.
We will focus on the basic functions to create the most common types of figures.

Before Starting

Install the ggplot2 and gapminder packages locally if you haven’t already done so.

install.packages('ggplot2')
install.packages('gapminder')

Once you have the packages installed, let’s load them.
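
A minimal sketch of the loading step, assuming both packages were installed with the install.packages() calls above:

```r
# Load the packages (needed once per R session)
library(ggplot2)
library(gapminder)
```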

Note

There is a package called tidyverse, which is a collection of R packages designed for data science.
When you load the tidyverse package, the ggplot2 package is automatically loaded.

Introduction to `ggplot2`

What is it?
How does it work?

As you know, there are already base (built-in) R functions to create figures (e.g., plot() and hist()).
- pros: They are fast (especially for plotting a large dataset).
- cons: The plots are difficult to customize.

The ggplot2 package provides more flexibility and customization options for creating figures with consistent syntax.
- Check this out to see what kind of figures ggplot2 can make.
A variety of extension packages built on top of ggplot2 (e.g., ggthemes, ggpubr, ggrepel, gganimate, etc.) allow you to create more complex figures.
- See this for examples.

ggplot2 views a figure as the collection of multiple independent layers.
- layers for geometric objects (e.g., points, lines, bars), layers for aesthetic attributes of the geometric objects (color, shape, size), layers of annotations and statistical summaries, … etc.
Then, it combines these layers to create a single figure as a final output.

Anatomy of ggplot2

The code below builds a plot one step at a time. Each block adds one new line to the previous block. Watch how the plot changes each time a new line of code is added.

# Create a canvas for the plot
ggplot(data = airquality)

# Create a canvas for the plot
ggplot(data = airquality) + 
  # Add x-axis
  aes(x = Wind)

# Create a canvas for the plot
ggplot(data = airquality) + 
  # Add x-axis
  aes(x = Wind) +
  # Add y-axis 
  aes(y = Ozone)

# Create a canvas for the plot
ggplot(data = airquality) + 
  # Add x-axis
  aes(x = Wind) + 
  # Add y-axis
  aes(y = Ozone) +
  # Add a scatter plot
  geom_point()

# Create a canvas for the plot
ggplot(data = airquality) + 
  # Add x-axis
  aes(x = Wind) +
  # Add y-axis
  aes(y = Ozone) + 
  # Add a scatter plot
  geom_point() +
  # Add a regression line
  geom_smooth(method = "lm")

# Create a canvas for the plot
ggplot(data = airquality) + 
  # Add x-axis
  aes(x = Wind) +
  # Add y-axis
  aes(y = Ozone) + 
  # Add a scatter plot
  geom_point() +
  # Add a regression line
  geom_smooth(method = "lm") +
  # Change x-axis label
  labs(x = "Wind Speed (mph)")

# Create a canvas for the plot
ggplot(data = airquality) + 
  # Add x-axis
  aes(x = Wind) +
  # Add y-axis
  aes(y = Ozone) + 
  # Add a scatter plot
  geom_point() +
  # Add a regression line
  geom_smooth(method = "lm") +
  # Change x-axis label
  labs(x = "Wind Speed (mph)") +
  # Change y-axis label
  labs(y = "Ozone (ppb)")

# Create a canvas for the plot
ggplot(data = airquality) + 
  # Add x-axis
  aes(x = Wind) +
  # Add y-axis
  aes(y = Ozone) + 
  # Add a scatter plot
  geom_point() +
  # Add a regression line
  geom_smooth(method = "lm") +
  # Change x-axis label
  labs(x = "Wind Speed (mph)") +
  # Change y-axis label
  labs(y = "Ozone (ppb)") +
  # Add title and subtitle
  labs(
    title = "Relationship between ozone and wind speed in New York",
    subtitle = "May to September 1973"
  )

# Create a canvas for the plot
ggplot(data = airquality) + 
  # Add x-axis
  aes(x = Wind) +
  # Add y-axis
  aes(y = Ozone) + 
  # Add a scatter plot
  geom_point() +
  # Add a regression line
  geom_smooth(method = "lm") +
  # Change x-axis label
  labs(x = "Wind Speed (mph)") +
  # Change y-axis label
  labs(y = "Ozone (ppb)") +
  # Add title and subtitle
  labs(
    title = "Relationship between ozone and wind speed in New York",
    subtitle = "May to September 1973"
  ) +
  # Add caption
  labs(caption = "Data source:")

# Create a canvas for the plot
ggplot(data = airquality) + 
  # Add x-axis
  aes(x = Wind) +
  # Add y-axis
  aes(y = Ozone) + 
  # Add a scatter plot
  geom_point() +
  # Add a regression line
  geom_smooth(method = "lm") +
  # Change x-axis label
  labs(x = "Wind Speed (mph)") +
  # Change y-axis label
  labs(y = "Ozone (ppb)") +
  # Add title and subtitle
  labs(
    title = "Relationship between ozone and wind speed in New York",
    subtitle = "May to September 1973"
  ) +
  # Add caption
  labs(caption = "Data source:") +
  # Set the theme
  theme_bw()

# Create a canvas for the plot
ggplot(data = airquality) + 
  # Add x-axis
  aes(x = Wind) +
  # Add y-axis
  aes(y = Ozone) + 
  # Add a scatter plot
  geom_point() +
  # Add a regression line
  geom_smooth(method = "lm") +
  # Change x-axis label
  labs(x = "Wind Speed (mph)") +
  # Change y-axis label
  labs(y = "Ozone (ppb)") +
  # Add title and subtitle
  labs(
    title = "Relationship between ozone and wind speed in New York",
    subtitle = "May to September 1973"
  ) +
  # Add caption
  labs(caption = "Data source:") +
  # Set the theme
  theme_bw() +
  # Center the title and subtitle position
  theme(
    plot.title = element_text(hjust = 0.5),
    plot.subtitle = element_text(hjust = 0.5)
  )

Note: This code is for demonstration purposes. Don’t imitate this code!

Anatomy of ggplot2 (continued)

Every ggplot2 plot has three key components:
- Data
- A set of aesthetic mappings between variables in the data and visual properties.
- At least one layer which describes how to render each observation. Layers are usually created with a geom function.

The very general syntax for creating a plot with ggplot2 is as follows:

ggplot(data = ...) +
  geom_*(aes( ... ))

aes stands for aesthetic mappings. It tells ggplot2 how to map variables in the data to visual properties of the plot (e.g., x-axis, y-axis, color, shape, size, etc.)
The + operator tells R that you’re adding another layer (e.g., line plot) to the current “canvas”.
Depending on the type of figure you want to plot, use different geom_*() functions.
- e.g., geom_point() for scatter plot, geom_line() for line plot, etc.

Example

Data
Let’s Create a Scatter Plot
Step 1
Step 2

Let’s use the airquality data for this example.

airquality data is a built-in dataset in R. So, you don’t need to load it.
Type airquality in the console to see the data. (Type ?airquality in the console for more information.)

We will create a scatter plot of Ozone (ozone level in the air) and Temp (Maximum daily temperature in degrees \(F\)) from the airquality data.

The final plot should look like the following:

Step 1: Start with ggplot()

ggplot(data = dataset) initializes a ggplot object. In other words, it prepares a “canvas” for the plot.
Here, let R know the dataset you are trying to visualize.

Try it!
Why?

Run the following code. Can you see any output?
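
Assuming the code in question is the bare ggplot() call from Step 1, a minimal sketch looks like this (the object name p_canvas is only for illustration):

```r
library(ggplot2)

# Step 1 only: declare the dataset, but add no geom_*() layer yet
p_canvas <- ggplot(data = airquality)

# Printing the object shows only a blank canvas: no points, lines, or axes
p_canvas
```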

This code draws only an empty panel because we haven’t told R what to plot with the data yet.
ggplot() just prepares a blank “canvas” for you!

Step 2: Draw figures with geom_*() functions, and add them to the current canvas using the + operator

For example, we use geom_point() to create a scatter plot.
- Use aes() to specify which variables you want to use for the x- and y-axes.
aes() is used to tell R to look for the variables inside the dataset you specified in ggplot(), and use the information as specified.
e.g., aes(x = Temp, y = Ozone) tells R to look for Temp and Ozone in the data, and to map the data to x-axis and y-axis, respectively.
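
Putting Step 1 and Step 2 together, the scatter plot of Temp and Ozone can be sketched as follows (the object name p_scatter is only for illustration):

```r
library(ggplot2)

# Step 1: prepare the canvas with the airquality data
# Step 2: add a scatter-plot layer, mapping Temp to x and Ozone to y
p_scatter <- ggplot(data = airquality) +
  geom_point(aes(x = Temp, y = Ozone))

p_scatter
```

Because Ozone contains missing values, ggplot2 drops those rows and prints a warning when the plot is drawn.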

Summary

These are basic steps to create a figure with ggplot2 package.

Step1: Start with ggplot()
- This function prepares a “canvas” for the figure.
Step2: Draw a figure with geom_*() function, and add to the current canvas with + operator.
Step3: Repeat Step2 to add whatever layers you want to add.
Step4 (optional): Add labels, titles, and other annotations to the plot with labs(), theme(), etc.

Don’t forget to specify x and y variables in the aes() function.
- Also, some geom_*() functions only require x variable (e.g., geom_histogram()).
In Step3, layers can be added in any order, but the order of the layers affects the final appearance of the plot: later layers are drawn on top of earlier ones.
When you want to make a simple x and y plot, the base R functions are sufficient (e.g., with(data, plot(column_x, column_y))).
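
The steps above can be sketched in one short example. The variables and label text are illustrative choices, not part of the original summary:

```r
library(ggplot2)

p_steps <- ggplot(data = airquality) +                    # Step1: canvas
  geom_point(aes(x = Temp, y = Ozone)) +                  # Step2: first layer
  geom_smooth(aes(x = Temp, y = Ozone), method = "lm") +  # Step3: another layer
  labs(                                                   # Step4: labels
    x = "Maximum daily temperature (degrees F)",
    y = "Ozone (ppb)"
  )

p_steps

# Base R version of the plain scatter plot, for comparison
with(airquality, plot(Temp, Ozone))
```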

In-class exercise

Questions
Answers

Create a scatter plot of Temp and Wind from the airquality data.
In the plot you just created, let’s change the x-axis label to “Maximum temperature (degrees F)” and the y-axis label to “Wind Speed (mph)”. For this, use labs() function.

Hint:

labs(x = new_x_label, y = new_y_label)
use + to add this layer to the plot.


Different Types of Plot

You can create various plots with the ggplot2 package by choosing the appropriate geom_*() function for the desired plot type.
Here are some of the most commonly used geom_*() functions.
- geom_point(): scatter plot
- geom_line(): line plot
- geom_bar(): bar plot
- geom_boxplot(): box plot
- geom_histogram(): histogram
- geom_density(): density plot
  - This computes and draws kernel density estimates, and is a smoothed version of the histogram.
- geom_smooth(): draws a smoothed trend line; with method = "lm" it draws an OLS-estimated regression line (other regression methods are available)
See this for the full list of geom_*() functions.
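
As a rough sketch, here is one call per plot type. The choice of datasets and variables (airquality, and the economics data that ships with ggplot2) is an assumption made for illustration:

```r
library(ggplot2)

# Histogram: only x is required (bins = 20 is an arbitrary choice)
p_hist <- ggplot(data = airquality) +
  geom_histogram(aes(x = Temp), bins = 20)

# Density plot: a smoothed version of the histogram
p_dens <- ggplot(data = airquality) +
  geom_density(aes(x = Temp))

# Box plot: a categorical x and a numeric y
p_box <- ggplot(data = airquality) +
  geom_boxplot(aes(x = factor(Month), y = Temp))

# Bar plot: by default geom_bar() counts the rows in each category
p_bar <- ggplot(data = airquality) +
  geom_bar(aes(x = factor(Month)))

# Line plot: number of unemployed over time
p_line <- ggplot(data = economics) +
  geom_line(aes(x = date, y = unemploy))
```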

Modify Aesthetic Attributes

Basics
Examples

We can modify how plots look by specifying color, shape, and size.

Here is a list of options to control the aesthetics of figures. You use these options inside the geom_*().

size: controls the size of points and text
- e.g., geom_point(size = 3)
color: controls the color of points and lines
- e.g., geom_point(color = "blue")
fill: controls the color of the inside areas of figures like bars and boxes
- e.g., geom_density(fill = "blue") fills the area under the density curve with blue color
alpha: controls the transparency of the fill color
- e.g., alpha = 1 is opaque and alpha = 0 is completely transparent; values lie between 0 and 1
shape: controls the symbols of points; it takes integer values between 0 and 25
- e.g., geom_point(shape = 1) for circle, geom_point(shape = 2) for triangle

For point shapes available in R, see this.
For further information about the options for aesthetics, see this.

Scatter Plot
Histogram
Line Plot

size = 3: makes the points larger.
color = "red": changes the color of the points to red.
shape = 1: changes the shape of the points to circle.

fill = "blue": fills the bars with blue color.
alpha = 0.5: makes the fill color semi-transparent.
- Try changing the value of alpha to see how the transparency changes.

linewidth = 1.5: makes the line thicker.
color = "purple": changes the color of the line to purple.
linetype = "dotted": changes the line type to dotted.
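
The three sets of options above can be sketched as follows. The option values come from the lists above; the datasets, variables, and bins = 20 are assumptions chosen for illustration:

```r
library(ggplot2)

# Scatter plot: larger, red, open-circle points
p_point <- ggplot(data = airquality) +
  geom_point(aes(x = Temp, y = Ozone), size = 3, color = "red", shape = 1)

# Histogram: blue, semi-transparent bars
p_hist <- ggplot(data = airquality) +
  geom_histogram(aes(x = Temp), bins = 20, fill = "blue", alpha = 0.5)

# Line plot: thick, purple, dotted line (linewidth needs ggplot2 >= 3.4.0)
p_line <- ggplot(data = economics) +
  geom_line(
    aes(x = date, y = unemploy),
    linewidth = 1.5, color = "purple", linetype = "dotted"
  )
```

Note that all of these options sit outside aes(), so they apply to every point, bar, or line in the layer.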

In-class Exercise

Exercise 1
Exercise 2

Questions
Answers

Create a density plot of Ozone from the airquality data. Fill the area under the density curve with blue and make it semi-transparent (use alpha = 0.5).

The figure should look like the following:

Questions
Answers

Create box plots of monthly Temp from the airquality data. Fill the boxes with green and make them semi-transparent (use alpha = 0.5).

The figure should look like the following:

Hint

This is a bit of a tricky problem, but very useful!
We want to use Month as a categorical variable for the x-axis, but Month is a numeric variable in the data. How can we tell R to use it as a categorical (factor) variable?
- Apply the factor() function to Month inside aes() in the geom_*() function to convert it to a factor variable.
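
To illustrate the factor() trick without giving away the exercise, here is a sketch that uses Ozone instead of Temp and leaves out the fill options:

```r
library(ggplot2)

# Month is stored as a number, so convert it to a factor inside aes()
# to get one box per month
p_box_month <- ggplot(data = airquality) +
  geom_boxplot(aes(x = factor(Month), y = Ozone))

p_box_month
```

Without factor(), ggplot2 treats Month as a continuous variable and does not draw one box per month.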

Group Aesthetic

Basics
In-class Exercise

So far, we specified aesthetic attributes outside of the aes() function. Consequently, all the geometric objects in the plot have the same color, shape, and size, etc.

e.g., geom_point(aes(x = var_x, y = var_y), color = "red").

If you use those options inside the aes() function like aes(color = var_z), R will display different colors by group based on the value of var_z. Usually var_z is a categorical variable.

e.g., geom_point(aes(x = var_x, y = var_y, color = var_z)) displays a scatter plot where the points are colored differently based on the value of var_z.

Example

Let’s create density plots of Temp for each month, and use different colors (fill in this case) for different Month.
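
A possible version of this example is sketched below. The alpha = 0.5 setting is an added assumption so that overlapping densities stay visible:

```r
library(ggplot2)

# fill is inside aes(), so each month gets its own fill color and a legend;
# alpha is outside aes(), so it applies equally to every density
p_dens_month <- ggplot(data = airquality) +
  geom_density(aes(x = Temp, fill = factor(Month)), alpha = 0.5)

p_dens_month
```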

Questions
Answers

Create a scatter plot of Ozone and Temp in the airquality data. Let’s use different colors for different Month.
In addition to the previous plot, let’s use different shapes for different Month.

NOTE: Remember that we need to tell R to use Month as a categorical variable.


Collective geoms

Basics
Example

So far, we used only one geom_*() function in a plot. But you can use multiple geom_*() functions in a single plot.

This just overlays multiple layers of different geometric objects on the same “canvas”.
Use + operator to add multiple geom_*() functions to the plot.

Example Syntax

ggplot(data = dataset) +
  geom_*(aes(x = column_x, y = column_y, fill = column_z)) +
  geom_*(aes(x = column_x, y = column_y)) +
  geom_*(aes(x = column_x, y = column_y)) +
  ...

If an additional layer has the same aes() mapping, you can specify it only once in the ggplot().

# The above code is equivalent to the following code
ggplot(data = dataset, aes(x = column_x, y = column_y)) +
  geom_*(aes(fill = column_z)) +
  geom_*() +
  geom_*() +
  ...

Note

Recall that ggplot() prepares a plot object.
If you tell ggplot() to use aes() mapping from the beginning, you don’t need to specify it again in the geom_*() functions.

Let’s create a scatter plot of Ozone and Temp from the airquality data.
In addition to the scatter plot, let’s add a simple regression line to the plot using geom_smooth() function.
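
One way to write this example is sketched below, with the shared aes() mapping specified once in ggplot() as described in the note above:

```r
library(ggplot2)

# Both layers inherit x = Temp and y = Ozone from ggplot()
p_fit <- ggplot(data = airquality, aes(x = Temp, y = Ozone)) +
  geom_point() +
  # method = "lm" requests a straight regression line
  geom_smooth(method = "lm")

p_fit
```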

Modify Axis, Legend, and Plot Labels

Basics
Examples

By default, x-axis, y-axis, and legend labels are the column names of the data, which are not always informative. Also, you might want to add a title and subtitle to the plot.
You can modify the labels, titles, and other annotations of the plot using labs() function.

Example Syntax

ggplot(data = dataset) +
  geom_*(aes(x = column_x, y = column_y)) +
  labs(
    x = "X-axis label",
    y = "Y-axis label",
    title = "Title of the plot",
    subtitle = "Subtitle of the plot",
    caption = "Data source"
  )

Example 1
Example 2

Note
- If you use color (fill) for group aesthetic, you need to use color (fill) in the labs() function to change the legend title.
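
As a sketch of this note, the example below changes a fill legend title together with the other labels. The label text is made up for illustration:

```r
library(ggplot2)

p_labs <- ggplot(data = airquality) +
  geom_density(aes(x = Temp, fill = factor(Month)), alpha = 0.5) +
  labs(
    x = "Maximum daily temperature (degrees F)",
    y = "Density",
    # fill is the grouping aesthetic here, so fill = sets the legend title
    fill = "Month",
    title = "Distribution of daily temperature by month"
  )

p_labs
```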

Summary

Let’s summarize what we have learned so far.

the basic syntax of the ggplot2 package.
how to create popular types of plots (scatter plot, line plot, bar plot, histogram, box plot, density plot).
how to modify aesthetic attributes of the plot (color, shape, size, etc.)
how to use group aesthetic to group the data by a variable
how to use multiple geom_*() functions in a single plot.
how to modify axis, legend, and plot labels with labs() function.

Exercise Problems

Exercise Problems 1
Exercise Problems 2

Data
Instructions
Solutions

Let’s use the economics data, which is a dataset built into the ggplot2 package. It was produced from US economic time series data available from Federal Reserve Economic Data. This contains the following variables:

date: date in year-month format
pce: personal consumption expenditures, in billions of dollars
pop: total population in thousands
psavert: personal savings rate
uempmed: median duration of unemployment in weeks
unemploy: number of unemployed in thousands

1. Create a scatter plot of unemploy (x-axis) and psavert (y-axis). Add a simple regression line to the plot. Change the x-axis and y-axis labels to something more informative.

2. Create a bar plot of psavert by date. Use pop for fill color. Change the x-axis, y-axis, and fill legend labels to something more informative.

Hint: use stat = 'identity' in the geom_bar() function to plot the actual values of psavert.

3. (Challenging) Create a multiple line plot taking date as x-axis and psavert and uempmed as y-axis, respectively. The output should look like the following.

Hint: I think there are multiple ways to do this.

Data
Instructions
Solutions

For this exercise problem, we will use the medical cost personal dataset described in the book “Machine Learning with R” by Brett Lantz. The dataset provides \(1,338\) records of medical information and costs billed by health insurance companies in 2013, compiled by the United States Census Bureau.

The dataset contains the following variables:

age: age of primary beneficiary
sex: insurance contractor gender, female, male
bmi: body mass index, which indicates whether body weight is relatively high or low relative to height
children: number of children covered by health insurance
smoker: smoking
region: the beneficiary’s residential area in the US; northeast, southeast, southwest, northwest.
charges: individual medical costs billed by health insurance

Download the data

Create a histogram of charges by sex in the same plot. Fill the bars with different colors for each sex.
Create a scatter plot of bmi (x-axis) and charges (y-axis).
Now, create a scatter plot of bmi (x-axis) and charges (y-axis), and add regression lines by smoker (so there are two regression lines: one for smokers and the other for non-smokers).
Create the following plot.

Section 2: Advanced Topics

Before We Start

For this section, we will continue to use the economics and insurance data we used in the previous exercise problems.

Facet Plot

Intro
facet_wrap()
facet_grid()

You can partition a plot into a matrix of panels and display a different subset of the data in each panel. This is useful when you want to compare patterns in the data by group.

Example 1
Example 2

Without faceting

Because the scales of the y-axis are different by variable, it is hard to compare the trends across variables in the same plot.

With faceting

Here, I am showing the distribution of charges by sex and region in the same plot.

facet_wrap() makes a long ribbon of panels (generated by any number of variables) and wraps it into 2d.

Syntax:

facet_wrap(vars(var_x, var_y), scales = "fixed", nrow = 2, ncol = 2)

Inside vars(), specify variables used for faceting groups.
ncol and nrow control the number of columns and rows (you only need to set one).
scales controls the scales of the axes in the panels (either "fixed" (the default), "free_x", "free_y", or "free").

Try it!

Play around with the facet_wrap() function in the code below. See how the choice of faceting groups, number of rows and columns and the scales of the axes affect the appearance of the plot.
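
A starting point to experiment with is sketched below. It uses the built-in airquality data as a stand-in, which is an assumption; the same pattern applies to the economics and insurance data:

```r
library(ggplot2)

# One panel per month, arranged in 2 rows, with shared axis scales
p_wrap <- ggplot(data = airquality) +
  geom_point(aes(x = Temp, y = Ozone)) +
  facet_wrap(vars(Month), nrow = 2, scales = "fixed")

p_wrap
```

Try replacing nrow = 2 with ncol = 2, or "fixed" with "free_y", and compare the panels.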

facet_grid() produces a 2d grid of panels defined by variables which form the rows and columns.

Syntax:

facet_grid(rows = vars(var_x), cols = vars(var_y), scales = "fixed")

The graph is partitioned by the levels of the groups var_x and var_y in the rows and columns, respectively.
Unlike facet_wrap(), facet_grid() has no nrow or ncol arguments; the number of rows and columns is determined by the number of levels of var_x and var_y.
scales controls the scales of the axes in the panels (either "fixed" (the default), "free_x", "free_y", or "free").

Try it!
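
A small facet_grid() example to play with is sketched below. It uses the mpg data that ships with ggplot2 as a stand-in, which is an assumption; with the insurance data, variables such as sex and region would take the place of drv and year:

```r
library(ggplot2)

# Rows are defined by drive type (drv), columns by model year (year)
p_grid <- ggplot(data = mpg) +
  geom_point(aes(x = displ, y = hwy)) +
  facet_grid(rows = vars(drv), cols = vars(year), scales = "fixed")

p_grid
```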

Multiple Datasets in One Figure

Basics
Example
How can we make the legend?

So far, we have been using the same dataset for each layer of the plot. But you can use multiple datasets in a single plot.

Note

If you specify data in ggplot() at the beginning (e.g., ggplot(data = dataset)), the data applies to ALL the subsequent geom_*()s unless overwritten locally inside individual geom_*()s.
To use multiple datasets in a single plot, you just need to specify what dataset to use locally inside individual geom_*()s.

insurance_southwest is a subset of the insurance data where region is southwest.
insurance_northeast is a subset of the insurance data where region is northeast.

You can do something like this:
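
The sketch below shows the pattern with two subsets of the built-in airquality data as stand-ins (an assumption, so that the code runs without downloading anything); insurance_southwest and insurance_northeast would be used in the same way. Mapping color to a constant label inside aes() and then calling scale_color_manual() is one way to get a legend:

```r
library(ggplot2)

# Two separate datasets (stand-ins for the two insurance subsets)
air_may <- subset(airquality, Month == 5)
air_aug <- subset(airquality, Month == 8)

# No data in ggplot(): each geom_*() names its own dataset
p_two <- ggplot() +
  geom_point(data = air_may, aes(x = Temp, y = Ozone, color = "May")) +
  geom_point(data = air_aug, aes(x = Temp, y = Ozone, color = "August")) +
  # Assign a color to each label and give the legend a title
  scale_color_manual(
    name = "Month",
    values = c("May" = "blue", "August" = "red")
  )

p_two
```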

ggplot2 Themes (Optional)

You can change the theme of the plot.

ggplot2 ships several pre-made themes that you can apply to your plots (e.g., theme_minimal(), theme_bw() (I use this often), theme_classic()). See this.
The ggthemes package provides additional ggplot themes. See this for the full list of available themes.

Try it!
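
A simple way to compare themes is to store one plot and add a different theme to it each time. The plot itself is an arbitrary choice for illustration:

```r
library(ggplot2)

# Base plot with the default theme
p_base <- ggplot(data = airquality) +
  geom_point(aes(x = Temp, y = Ozone))

# The same plot with three of the built-in themes
p_bw <- p_base + theme_bw()
p_minimal <- p_base + theme_minimal()
p_classic <- p_base + theme_classic()

p_bw
```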

theme() Function (Optional)

The theme() function lets you tweak the details of all non-data related components of a plot (e.g., font type in the plot, position of the legend and title, etc.). There are many components you can modify with the theme() function. See this for the full list of options.

For more information, see:

Chapter 17 Themes, ggplot2: Elegant Graphics for Data Analysis (3e)

Try it!

For example, you can change the position of the title and legend with the following theme() options.
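
A sketch of such a theme() call is shown below. The plot and the title text are illustrative assumptions:

```r
library(ggplot2)

p_theme <- ggplot(data = airquality) +
  geom_point(aes(x = Temp, y = Ozone, color = factor(Month))) +
  labs(title = "Ozone and temperature by month", color = "Month") +
  theme(
    # hjust = 0.5 centers the title horizontally
    plot.title = element_text(hjust = 0.5),
    # Move the legend from the right side to below the plot
    legend.position = "bottom"
  )

p_theme
```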

Save the Plot

Basics

Two options:
- Use the ggsave() function from the ggplot2 package.
- Use the “Export” button in the RStudio plot viewer.

Syntax:

ggsave(filename, plot = plot_object)

filename: the name of the file (including path) to save the plot to. (e.g., filename = "Data/plot.png")
plot: the plot object to save.

Example

Run the following code in RStudio. Make sure you have the R project open.

library(ggplot2)
library(rio)

insurance_url <- "https://raw.githubusercontent.com/stedy/Machine-Learning-with-R-datasets/master/insurance.csv"
insurance <- import(insurance_url)

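# Assign the plot to an object so it can be passed to ggsave() by name
plot_insurance <- ggplot(data = insurance) +
  geom_boxplot(aes(x = sex, y = charges, fill = region)) +
  labs(
    x = "Sex",
    y = "Medical costs",
    title = "Distribution of individual medical costs by sex and region"
  )

# Display the plot
plot_insurance

# --- Save plot --- #
# Without the plot argument, ggsave() saves the last plot displayed
ggsave(filename = "Data/insurance.png")
ggsave(filename = "Data/insurance.pdf")
ggsave(filename = "Data/insurance2.png", plot = plot_insurance)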


Summary 2

For this second section, you learned a few advanced topics in ggplot2.

Now you know:

how to create facet plots with facet_wrap() and facet_grid().
when to use facet_wrap() and facet_grid().
how to visualize multiple datasets in a single plot.
how to save the plot.

That’s it!

Exercise Problems

Exercise Problem 1 (basic)
Exercise Problem 2 (basic)
Exercise Problem 3 (challenging)

Data
Instructions
Solutions

For this exercise problem, you will use the gapminder data from the gapminder package.

Find the number of unique countries in the data.
Calculate the mean life expectancy for the entire dataset.
Create a dataset by subsetting the data for the year 2007. Create a scatter plot of GDP per capita vs. life expectancy for the year 2007, color-coded by continent.
Create a bar plot showing the total population for each continent in 2007. Fill the bars with blue and set the transparency to 0.5.
Subset the data for the United States, China, India, and the United Kingdom. Create a line plot showing the change in life expectancy over time for these countries.
Create a scatter plot of GDP per capita vs. life expectancy for the entire gapminder dataset. Use facet_wrap to create separate plots for each continent.
Group the data by continent and calculate the mean GDP per capita for each continent for each year. Create a line plot showing the trend of mean GDP per capita for each continent over time.

Data
Instruction
Solutions

For this exercise problem, we will use the economics dataset from the ggplot2 package. You need to apply data manipulation and visualization techniques with the data.table and ggplot2 packages.

As you already know by now, the economics dataset contains various economic indicators for the United States. We want to create a line plot showing the trends of all economic indicators over time. Each economic indicator is stored in a separate column in the data, and you can visualize each indicator by creating a single line plot, separately. But, there is a better way to do this. It should look like the following plot.

Instruction
Solutions

For this exercise problem, you will use “corn_yield_dt.rds” in the “Data” folder. I obtained this from USDA-NASS Quick Stats database. The data contains the county-level corn yield data (in BU / ACRE) for each major corn production state in the US Midwest from 2000 to 2022.

Load the data and take a look at it.
Convert the data to a data.table object. The Value column contains the corn yield data. Rename the column to yield.
Let’s derive the state-level annual average corn yield data by calculating the mean of corn yield by state and year. Create a line plot of the annual trend of corn yield in Minnesota by taking year for the x-axis and the derived mean yield for the y-axis.
Create line plots showing the trend of annual corn yield for each state in the same plot.
Create a facet plot showing each state’s annual corn yield trend. To compare the trends across states, use scales = "fixed".

Hint: state_alpha is the two-letter state abbreviation for each state.

Create a new dataset that contains the overall average corn yield across states by taking the mean of the yield by year. Add a line plot of this dataset to the plot you created in the previous step. Use red dashed line to represent this line.

If you could add a legend to the plot to indicate what the red dashed line means, that would be great! To do this, you need to use scale_color_manual() function.
