Data Visualization with R and ggplot2

Create a Scatter Plot

Learn how to create a scatter plot in R with ggplot2.

Overview

In this tutorial, we’ll learn how to create a scatter plot in R using the ggplot2 package. We’ll preview the data, look at the finished plot, and then build it step by step.

Getting started

We’ll use the candy dataset throughout this tutorial. Here’s a preview of the data:

name            sales  price  rating  year  category
Jelly Beans       300    2.5     4.5  2019  Chewy
Gummy Bears       150    1.5     3.8  2020  Chewy
Lollipop          200      1       4  2021  Hard
Cotton Candy      100      2     4.2  2022  Soft
Jolly Ranchers    250    1.8     4.7  2023  Hard
Marshmallow       180    1.2     3.5  2024  Soft

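The candy data frame can be rebuilt directly from the preview. Here’s a minimal sketch using base R’s data.frame(); the column types are our assumption, since the preview shows only printed values, and the full dataset may contain more than these six rows.

```r
# Recreate the candy data from the preview table.
# Assumption: column types are inferred from the printed values only.
candy <- data.frame(
  name     = c("Jelly Beans", "Gummy Bears", "Lollipop",
               "Cotton Candy", "Jolly Ranchers", "Marshmallow"),
  sales    = c(300, 150, 200, 100, 250, 180),
  price    = c(2.5, 1.5, 1, 2, 1.8, 1.2),
  rating   = c(4.5, 3.8, 4, 4.2, 4.7, 3.5),
  year     = 2019:2024,
  category = c("Chewy", "Chewy", "Hard", "Soft", "Hard", "Soft")
)

# Inspect the column names and types
str(candy)
```

Running str(candy) is a quick way to check that each column has the type you expect before plotting.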

What we’ll create

We’ll create a scatter plot that shows the relationship between candy price and sales, with points colored by category.

[Figure: Scatter plot of candy price vs sales]

Steps to create a scatter plot

Let’s go through the process of creating this scatter plot step by step.

Step 1: Start a ggplot and specify the data

     library(ggplot2)  # load the package that provides ggplot()
     ggplot(data = candy)
[Figure: Base layer]

Step 2: Add aesthetics

     ggplot(data = candy) +
      aes(x = price, y = sales) 
[Figure: Aesthetics added]

Step 3: Add geometric objects

     ggplot(data = candy) +
      aes(x = price, y = sales) +
      geom_point() 
[Figure: Scatter plot]

Exercise 3.1

Try running the code below to see a scatter plot of candy price vs sales:
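The code for this exercise is the Step 3 plot. Here’s a self-contained sketch of it, assuming ggplot2 is installed and that candy matches the values in the preview table:

```r
library(ggplot2)

# Candy values copied from the preview table (assumed to be the full dataset)
candy <- data.frame(
  name     = c("Jelly Beans", "Gummy Bears", "Lollipop",
               "Cotton Candy", "Jolly Ranchers", "Marshmallow"),
  sales    = c(300, 150, 200, 100, 250, 180),
  price    = c(2.5, 1.5, 1, 2, 1.8, 1.2),
  rating   = c(4.5, 3.8, 4, 4.2, 4.7, 3.5),
  category = c("Chewy", "Chewy", "Hard", "Soft", "Hard", "Soft")
)

# Step 3: data first, then aesthetics, then a point layer
p <- ggplot(data = candy) +
  aes(x = price, y = sales) +
  geom_point()

# Printing the plot object draws it on the active graphics device
print(p)
```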

Exercise 3.2

Change the x-axis to rating and the y-axis to price.

Now, let’s improve our scatter plot by adding more elements and customizing its appearance.

Step 4: Format axes

     ggplot(candy) +
      aes(x = price, y = sales) + 
      geom_point() +
      scale_x_continuous(limits = c(0, 3)) + 
      scale_y_continuous(limits = c(0, 350)) 
[Figure: Scatter plot with formatted axes]
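Setting limits on a continuous scale does more than zoom. By default, values outside the limits are turned into missing values and left out of the plot. The limits used in this step (0 to 3 and 0 to 350) cover every value in the preview, so nothing is lost here. The sketch below uses a deliberately narrow limit, chosen only for illustration, and contrasts it with coord_cartesian(), which zooms without discarding data.

```r
library(ggplot2)

# Candy values copied from the preview table (only the columns used here)
candy <- data.frame(
  price = c(2.5, 1.5, 1, 2, 1.8, 1.2),
  sales = c(300, 150, 200, 100, 250, 180)
)

p_base <- ggplot(candy) +
  aes(x = price, y = sales) +
  geom_point()

# Illustrative limit only (not from the tutorial): the 2.5 price lies outside 0 to 2
clipped <- p_base + scale_x_continuous(limits = c(0, 2))

# coord_cartesian() changes the visible range and keeps every observation
zoomed <- p_base + coord_cartesian(xlim = c(0, 2))

# Count the x values that are still usable in each built layer
kept_clipped <- sum(!is.na(layer_data(clipped)$x))
kept_zoomed  <- sum(!is.na(layer_data(zoomed)$x))
cat("scale limits keep", kept_clipped, "points; coord_cartesian keeps", kept_zoomed, "\n")
```

The difference matters most when a layer computes a summary such as a trend line, because data dropped by scale limits is also excluded from that calculation.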

Step 5: Add labels and titles

     ggplot(candy) +
      aes(x = price, y = sales) + 
      geom_point() +
      scale_x_continuous(limits = c(0, 3)) + 
      scale_y_continuous(limits = c(0, 350)) +
      labs(alt = "Scatter plot of candy price vs sales",
           title = "Relationship between candy price and sales",
           subtitle = "Price ($) vs Sales (units)",
           caption = "Source: The School of Data", 
           x = "Price ($)", y = "Sales (units)") +
      theme(plot.title.position = "plot") 
[Figure: Scatter plot with labels and titles]
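The alt argument in labs() is stored with the plot rather than drawn on it, so it won’t appear in the figure. Here’s a small sketch of reading it back with get_alt_text(), assuming your ggplot2 version is recent enough to support the alt label:

```r
library(ggplot2)

# Candy values copied from the preview table (only the columns used here)
candy <- data.frame(
  price = c(2.5, 1.5, 1, 2, 1.8, 1.2),
  sales = c(300, 150, 200, 100, 250, 180)
)

p <- ggplot(candy) +
  aes(x = price, y = sales) +
  geom_point() +
  labs(alt = "Scatter plot of candy price vs sales",
       title = "Relationship between candy price and sales")

# get_alt_text() returns the alt text attached to the plot object
alt_text <- get_alt_text(p)
print(alt_text)
```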

Step 6: Format text

     ggplot(candy) +
      aes(x = price, y = sales) + 
      geom_point() +
      scale_x_continuous(limits = c(0, 3)) + 
      scale_y_continuous(limits = c(0, 350)) +
      labs(alt = "Scatter plot of candy price vs sales",
           title = "Relationship between candy price and sales",
           subtitle = "Price ($) vs Sales (units)",
           caption = "Source: The School of Data", 
           x = "Price ($)", y = "Sales (units)") +
      theme_minimal() +
      # "PT Sans" must be installed; if it is not, the graphics device may warn and substitute another font
      theme(text = element_text(family = "PT Sans"),
            plot.title.position = "plot",
            plot.title = element_text(face = "bold", size = 16),
            plot.subtitle = element_text(face = "italic", size = 12),
            axis.text = element_text(size = 12))
[Figure: Scatter plot with formatted text]

Step 7: Customize points

     ggplot(candy) +
      aes(x = price, y = sales, color = category) + 
      geom_point(size = 4) +
      scale_x_continuous(limits = c(0, 3)) + 
      scale_y_continuous(limits = c(0, 350)) +
      labs(alt = "Scatter plot of candy price vs sales, colored by category",
           title = "Relationship between candy price and sales",
           subtitle = "Price ($) vs Sales (units), by candy category",
           caption = "Source: The School of Data", 
           x = "Price ($)", y = "Sales (units)",
           color = "Category") +
      theme_minimal() +
      theme(text = element_text(family = "PT Sans"),
            plot.title.position = "plot",
            plot.caption.position = "plot",
            plot.title = element_text(face = "bold", size = 16),
            plot.subtitle = element_text(face = "italic", size = 12),
            axis.text = element_text(size = 12),
            legend.position = "top")
[Figure: Final scatter plot with customized points]

Exercise 3.3

Create a scatter plot showing the relationship between rating and sales. Color the points by category and adjust the point size to 3.
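If you get stuck, here’s one possible solution. It’s a sketch that assumes candy matches the values in the preview table, and it leaves out the axis, label, and theme formatting from Steps 4 to 6.

```r
library(ggplot2)

# Candy values copied from the preview table (only the columns used here)
candy <- data.frame(
  sales    = c(300, 150, 200, 100, 250, 180),
  rating   = c(4.5, 3.8, 4, 4.2, 4.7, 3.5),
  category = c("Chewy", "Chewy", "Hard", "Soft", "Hard", "Soft")
)

# Map rating and sales to the axes, color by category, and fix the point size at 3
p <- ggplot(candy) +
  aes(x = rating, y = sales, color = category) +
  geom_point(size = 3)

print(p)
```

Because size is set inside geom_point() rather than inside aes(), it applies the same fixed value to every point instead of mapping a variable.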

Review

We’ve covered the steps to create a scatter plot in R using ggplot2. Here’s a summary of the key points:

Step 1: Start with the ggplot function and specify the data frame.

Step 2: Add aesthetics using the aes function to map variables to the x and y axes.

Step 3: Add geometric objects with geom_point() to create the scatter plot.

Step 4: Format the axes using scale_x_continuous() and scale_y_continuous().

Step 5: Add labels and titles with the labs function.

Step 6: Format text and customize the appearance of the plot using the theme function.

Step 7: Customize points by adding color based on a categorical variable and adjusting point size.

Nice! In the next section, we’ll learn how to create a pie chart.