Overview
In this tutorial, we’ll learn how to create a scatter plot in R using the ggplot2 package. Here’s a brief overview of what we’ll cover:

Start with the ggplot function and specify the data frame.
Add aesthetics using the aes function to map the variables (x and y axes).
Add geometric objects with geom_point() to create the scatter plot.




Getting started
We’ll use the candy dataset throughout this tutorial. Here’s a preview of the data:
name | sales | price | rating | year | category |
---|---|---|---|---|---|
Jelly Beans | 300 | 2.5 | 4.5 | 2019 | Chewy |
Gummy Bears | 150 | 1.5 | 3.8 | 2020 | Chewy |
Lollipop | 200 | 1 | 4 | 2021 | Hard |
Cotton Candy | 100 | 2 | 4.2 | 2022 | Soft |
Jolly Ranchers | 250 | 1.8 | 4.7 | 2023 | Hard |
Marshmallow | 180 | 1.2 | 3.5 | 2024 | Soft |
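A minimal sketch of how the candy dataset could be built from the preview table, assuming a plain base R data frame (the original construction code isn’t shown here, so the column types are an assumption):

```r
# Rebuild the candy dataset from the preview table.
# Assumption: a base R data.frame; the original may use a tibble instead.
candy <- data.frame(
  name     = c("Jelly Beans", "Gummy Bears", "Lollipop",
               "Cotton Candy", "Jolly Ranchers", "Marshmallow"),
  sales    = c(300, 150, 200, 100, 250, 180),
  price    = c(2.5, 1.5, 1.0, 2.0, 1.8, 1.2),
  rating   = c(4.5, 3.8, 4.0, 4.2, 4.7, 3.5),
  year     = 2019:2024,
  category = c("Chewy", "Chewy", "Hard", "Soft", "Hard", "Soft")
)

# Inspect the structure to confirm the column types.
str(candy)
```

The plotting steps also need ggplot2 loaded, for example with library(ggplot2).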
What we’ll create
We’ll create a scatter plot that shows the relationship between candy price and sales, with points colored by category.

Steps to create a scatter plot
Let’s go through the process of creating this scatter plot step by step.
Step 1: Start a ggplot and specify the data
ggplot(data = candy)

Step 2: Add aesthetics
ggplot(data = candy) +
  aes(x = price, y = sales)

Step 3: Add geometric objects
ggplot(data = candy) +
  aes(x = price, y = sales) +
  geom_point()

Try running the Step 3 code to see a scatter plot of candy price vs sales. Then change the x-axis to rating and the y-axis to price.
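One possible answer to this exercise, as a sketch: only the aes() mapping changes, while the data and the geom stay the same. The data frame below copies just the two columns needed from the preview table so the snippet runs on its own.

```r
library(ggplot2)

# Only the columns needed here, copied from the preview table.
candy <- data.frame(
  price  = c(2.5, 1.5, 1.0, 2.0, 1.8, 1.2),
  rating = c(4.5, 3.8, 4.0, 4.2, 4.7, 3.5)
)

# Same three layers as Step 3, with the aesthetics remapped.
p <- ggplot(data = candy) +
  aes(x = rating, y = price) +
  geom_point()

p  # printing the object draws the plot
```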
Now, let’s improve our scatter plot by adding more elements and customizing its appearance.
Step 4: Format axes
ggplot(candy) +
  aes(x = price, y = sales) +
  geom_point() +
  scale_x_continuous(limits = c(0, 3)) +
  scale_y_continuous(limits = c(0, 350))
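A side note on limits, as a hedged sketch: limits set through scale_x_continuous() do more than zoom. By default, values outside the range are treated as missing, so the affected points are dropped and ggplot2 usually warns when the plot is drawn. coord_cartesian() zooms without discarding data. The limits in Step 4 contain every candy, so nothing is lost there; the narrower limit c(0, 2) below is a hypothetical value chosen only to show the difference.

```r
library(ggplot2)

# Only the columns needed here, copied from the preview table.
candy <- data.frame(
  sales = c(300, 150, 200, 100, 250, 180),
  price = c(2.5, 1.5, 1.0, 2.0, 1.8, 1.2)
)

base <- ggplot(candy) +
  aes(x = price, y = sales) +
  geom_point()

# Scale limits: the candy priced at 2.5 falls outside c(0, 2)
# and is treated as missing.
p_scale <- base + scale_x_continuous(limits = c(0, 2))

# Coordinate limits: the view is cropped to c(0, 2),
# but the underlying data is left untouched.
p_zoom <- base + coord_cartesian(xlim = c(0, 2))

# Compare the x values each plot would draw.
layer_data(p_scale)$x
layer_data(p_zoom)$x
```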

Step 5: Add labels and titles
ggplot(candy) +
  aes(x = price, y = sales) +
  geom_point() +
  scale_x_continuous(limits = c(0, 3)) +
  scale_y_continuous(limits = c(0, 350)) +
  labs(alt = "Scatter plot of candy price vs sales",
       title = "Relationship between candy price and sales",
       subtitle = "Price ($) vs Sales (units)",
       caption = "Source: The School of Data",
       x = "Price ($)", y = "Sales (units)") +
  theme(plot.title.position = "plot")

Step 6: Format text
ggplot(candy) +
  aes(x = price, y = sales) +
  geom_point() +
  scale_x_continuous(limits = c(0, 3)) +
  scale_y_continuous(limits = c(0, 350)) +
  labs(alt = "Scatter plot of candy price vs sales",
       title = "Relationship between candy price and sales",
       subtitle = "Price ($) vs Sales (units)",
       caption = "Source: The School of Data",
       x = "Price ($)", y = "Sales (units)") +
  theme_minimal() +
  theme(text = element_text(family = "PT Sans"),
        plot.title.position = "plot",
        plot.title = element_text(face = "bold", size = 16),
        plot.subtitle = element_text(face = "italic", size = 12),
        axis.text = element_text(size = 12))

Step 7: Customize points
ggplot(candy) +
  aes(x = price, y = sales, color = category) +
  geom_point(size = 4) +
  scale_x_continuous(limits = c(0, 3)) +
  scale_y_continuous(limits = c(0, 350)) +
  labs(alt = "Scatter plot of candy price vs sales, colored by category",
       title = "Relationship between candy price and sales",
       subtitle = "Price ($) vs Sales (units), by candy category",
       caption = "Source: The School of Data",
       x = "Price ($)", y = "Sales (units)",
       color = "Category") +
  theme_minimal() +
  theme(text = element_text(family = "PT Sans"),
        plot.title.position = "plot",
        plot.caption.position = "plot",
        plot.title = element_text(face = "bold", size = 16),
        plot.subtitle = element_text(face = "italic", size = 12),
        axis.text = element_text(size = 12),
        legend.position = "top")

Create a scatter plot showing the relationship between rating and sales. Color the points by category and adjust the point size to 3.
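One way to approach this exercise, as a sketch rather than the only valid answer: map rating and sales in aes(), add color = category to the same mapping, and pass size = 3 to geom_point(). The axis and legend labels are illustrative choices, and the data frame copies only the columns needed from the preview table.

```r
library(ggplot2)

# Only the columns needed here, copied from the preview table.
candy <- data.frame(
  sales    = c(300, 150, 200, 100, 250, 180),
  rating   = c(4.5, 3.8, 4.0, 4.2, 4.7, 3.5),
  category = c("Chewy", "Chewy", "Hard", "Soft", "Hard", "Soft")
)

p <- ggplot(candy) +
  aes(x = rating, y = sales, color = category) +  # color varies by category
  geom_point(size = 3) +                          # fixed size for every point
  labs(x = "Rating", y = "Sales (units)", color = "Category")

p  # printing the object draws the plot
```

Because size is set inside geom_point() rather than aes(), it applies to every point and does not create a legend entry.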
Review
We’ve covered the steps to create a scatter plot in R using ggplot2. Here’s a summary of the key points:
Step 1: Start with the ggplot function and specify the data frame.
Step 2: Add aesthetics using the aes function to map the variables (x and y axes).
Step 3: Add geometric objects with geom_point() to create the scatter plot.
Step 4: Format the axes using scale_x_continuous() and scale_y_continuous().
Step 5: Add labels and titles with the labs function.
Step 6: Format text and customize the appearance of the plot using the theme function.
Step 7: Customize points by adding color based on a categorical variable and adjusting point size.
Nice! In the next section, we’ll learn how to create a pie chart.