Data visualization is a graphical approach for quickly communicating key insights and sharing information. The ggplot2 package in the tidyverse offers a way to create beautiful and customizable charts. Creating charts with code might initially seem slow and challenging. However, learning to create charts with code simplifies and automates the data analysis process, ultimately saving time and effort.
What we’ll learn in this section
In this section, we'll learn how to:
create a simple bar chart
create a simple scatter plot
customize the appearance of the charts
add titles and labels to the charts
change the color of the bars and points
By the end of this section, we’ll be able to create the following two charts:
Bar Chart & Scatter Plot
What is ggplot2?
ggplot2 is a data visualization package that is part of the tidyverse. It is based on a concept called the grammar of graphics, which is a structured and consistent way of building a graphic from layers.
If you don’t have tidyverse or ggplot2 installed, you can install using:
And then, you can load the library using:
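Both steps might look like the following sketch. It assumes the package comes from CRAN and skips installation when ggplot2 is already present:

```r
# Install ggplot2 from CRAN only if it is not already available.
# Running install.packages("tidyverse") instead would install ggplot2
# together with the other tidyverse packages.
if (!requireNamespace("ggplot2", quietly = TRUE)) {
  install.packages("ggplot2")
}

# Load the package so that ggplot(), aes(), and the geom functions are available.
library(ggplot2)
```

Installation is needed once per machine, while library() has to be called in every new R session.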
Making a Bar Chart
To create a ggplot, we pass the dataset to the ggplot() function and then map the x- and y-axis columns inside the aes() function.
We can add a new layer using + and specify the type of chart we want to create. In this case, we are creating a bar chart using the geom_bar function.
Inside the geom_bar() function we specify the type of statistic we want to use.
In this case, we’ll use stat = "identity" which uses the values we have in a column.
Example
Let’s create a simple bar chart using the ggplot function with the flowers dataset.
| name | height | season | sunlight | growth |
|---|---|---|---|---|
| Poppy | 75 | Spring | 8.3 | fast |
| Rose | 150 | Summer | 6.4 | slow |
| Zinnia | 60 | Summer | 8.7 | fast |
| Peony | 90 | Spring | 7.2 | slow |
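The steps described above can be sketched as follows. The data frame is rebuilt from the table with base R's data.frame(). Mapping name to the x-axis is an assumption made for illustration, and the object names flowers and bar_chart are just labels chosen here:

```r
library(ggplot2)

# Recreate the flowers dataset from the table.
flowers <- data.frame(
  name     = c("Poppy", "Rose", "Zinnia", "Peony"),
  height   = c(75, 150, 60, 90),
  season   = c("Spring", "Summer", "Summer", "Spring"),
  sunlight = c(8.3, 6.4, 8.7, 7.2),
  growth   = c("fast", "slow", "fast", "slow")
)

# Map name to the x-axis and height to the y-axis, then add a bar layer.
# stat = "identity" draws each bar at the value stored in the height column
# instead of counting rows.
bar_chart <- ggplot(flowers, aes(x = name, y = height)) +
  geom_bar(stat = "identity")

# Printing the object draws the chart.
bar_chart
```

geom_col() is a shorthand for geom_bar(stat = "identity") and would produce the same chart.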
Exercise 3.1
Run this code to create a bar chart using the flowers dataset.
Exercise 3.2
Replace y = height with y = sunlight and see what happens. (Double click on the _____ and edit the code)
Adding Text
A chart is more informative when we add text to it. We can add text to the chart using the labs() function.
The alt argument is used to add alternative text to the chart. This is essential for screen readers to read aloud the text for those who may experience vision challenges. The title and subtitle give context to the chart. The x and y arguments are used to label the x and y axes. The caption argument is a good place to add the source of the data.
| argument | description |
|---|---|
| alt | Alternative text for the chart |
| title | Title of the chart |
| subtitle | Subtitle of the chart |
| x | Label for the x-axis |
| y | Label for the y-axis |
| caption | Caption for the chart |
Example
If we add the following arguments to the labs() function:
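For instance, the call might look like this sketch. All text strings are illustrative assumptions rather than the original values, and the x = name mapping is assumed as well:

```r
library(ggplot2)

# The flowers dataset from the table earlier in this section.
flowers <- data.frame(
  name     = c("Poppy", "Rose", "Zinnia", "Peony"),
  height   = c(75, 150, 60, 90),
  season   = c("Spring", "Summer", "Summer", "Spring"),
  sunlight = c(8.3, 6.4, 8.7, 7.2),
  growth   = c("fast", "slow", "fast", "slow")
)

labelled_chart <- ggplot(flowers, aes(x = name, y = height)) +
  geom_bar(stat = "identity") +
  labs(
    # alt is stored with the plot for screen readers and is not drawn.
    alt      = "Bar chart of the heights of four flowers. Rose is the tallest at 150 and Zinnia is the shortest at 60.",
    title    = "Flower heights",
    subtitle = "Rose is the tallest of the four flowers",
    x        = "Flower",
    y        = "Height",
    caption  = "Source: flowers dataset"
  )

labelled_chart
```

The stored alt text can be read back with get_alt_text(labelled_chart), which is a quick way to confirm it was attached.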
We’ll get this chart with the text added.
The alt-text will not be visible in the chart but is important for screen readers.
Exercise 3.3
Run the code below to add labels to the plot.
Formatting Theme Elements
We can change the appearance of the chart by adding themes. We will use the built-in theme_minimal() function, which drops the grey panel background and leaves light grid lines on a plain background.
And we’ll add modifications to the theme by removing the x-axis grid lines and moving the title to the top of the plot area.
Example
If we add the following arguments to the theme() function:
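A possible version is sketched below. The two theme() settings are assumptions chosen to match the description: panel.grid.major.x = element_blank() removes the grid lines attached to the x-axis, and plot.title.position = "plot" aligns the title with the whole plot rather than the panel:

```r
library(ggplot2)

# The flowers dataset from the table earlier in this section.
flowers <- data.frame(
  name     = c("Poppy", "Rose", "Zinnia", "Peony"),
  height   = c(75, 150, 60, 90),
  season   = c("Spring", "Summer", "Summer", "Spring"),
  sunlight = c(8.3, 6.4, 8.7, 7.2),
  growth   = c("fast", "slow", "fast", "slow")
)

themed_chart <- ggplot(flowers, aes(x = name, y = height)) +
  geom_bar(stat = "identity") +
  labs(title = "Flower heights") +  # illustrative title
  # Apply the complete theme first ...
  theme_minimal() +
  # ... then adjust individual elements, so the tweaks are not overwritten.
  theme(
    panel.grid.major.x  = element_blank(),
    plot.title.position = "plot"
  )

themed_chart
```

The order matters: adding theme_minimal() after theme() would reset the individual adjustments, because a complete theme replaces every element.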
We’ll get this chart with the theme modifications added.
Exercise 3.4
Run the code below to add a theme to the plot.
Making a Scatter Plot
A scatter plot shows the relationship between two numerical variables.
In this section, we’ll make this plot:
We can create a scatter plot using the geom_point() function.
The geom_point() function needs two aesthetics: x and y. We supply them through the mapping argument by placing our x and y variables inside aes().
Example
Let’s create a simple scatter plot using the ggplot function with the flowers dataset.
| name | height | season | sunlight | growth |
|---|---|---|---|---|
| Poppy | 75 | Spring | 8.3 | fast |
| Rose | 150 | Summer | 6.4 | slow |
| Zinnia | 60 | Summer | 8.7 | fast |
| Peony | 90 | Spring | 7.2 | slow |
Syntax to create sample data:
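One way to build this data and draw a basic scatter plot is sketched below. It uses base R's data.frame(), and the choice of sunlight for the x-axis and height for the y-axis is an assumption made for illustration:

```r
library(ggplot2)

# Build the flowers dataset column by column.
flowers <- data.frame(
  name     = c("Poppy", "Rose", "Zinnia", "Peony"),
  height   = c(75, 150, 60, 90),
  season   = c("Spring", "Summer", "Summer", "Spring"),
  sunlight = c(8.3, 6.4, 8.7, 7.2),
  growth   = c("fast", "slow", "fast", "slow")
)

# Pass the x and y aesthetics to geom_point() through mapping = aes(...).
scatter_plot <- ggplot(data = flowers) +
  geom_point(mapping = aes(x = sunlight, y = height))

scatter_plot
```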
By default, ggplot will zoom in on the dots by adjusting the x and y axis limits.
We can change the limits by adding the xlim() and ylim() functions. The first value before the comma is the lower limit and the second value is the upper limit.
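As a sketch, the limits could be widened so that both axes start at zero. The specific limit values and the sunlight and height mapping are assumptions made for illustration:

```r
library(ggplot2)

# The flowers dataset from the table above.
flowers <- data.frame(
  name     = c("Poppy", "Rose", "Zinnia", "Peony"),
  height   = c(75, 150, 60, 90),
  season   = c("Spring", "Summer", "Summer", "Spring"),
  sunlight = c(8.3, 6.4, 8.7, 7.2),
  growth   = c("fast", "slow", "fast", "slow")
)

limited_plot <- ggplot(data = flowers) +
  geom_point(mapping = aes(x = sunlight, y = height)) +
  xlim(0, 10) +   # lower limit 0, upper limit 10 on the x-axis
  ylim(0, 200)    # lower limit 0, upper limit 200 on the y-axis

limited_plot
```

Points that fall outside the limits are dropped with a warning, so the limits should cover the full range of the data.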
Exercise 3.5
Run this code to create a scatter plot using the flowers dataset.
Adding Titles and Labels
Example
We’ll add alt text, a title, and a subtitle to our scatter plot using the labs() function.
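A minimal sketch of that step follows. The wording of the alt text, title, and subtitle is an illustrative assumption, as is the sunlight and height mapping:

```r
library(ggplot2)

# The flowers dataset from the table above.
flowers <- data.frame(
  name     = c("Poppy", "Rose", "Zinnia", "Peony"),
  height   = c(75, 150, 60, 90),
  season   = c("Spring", "Summer", "Summer", "Spring"),
  sunlight = c(8.3, 6.4, 8.7, 7.2),
  growth   = c("fast", "slow", "fast", "slow")
)

labelled_scatter <- ggplot(data = flowers) +
  geom_point(mapping = aes(x = sunlight, y = height)) +
  labs(
    # Alt text describes the chart for screen readers and is not drawn.
    alt      = "Scatter plot of flower height against sunlight for four flowers.",
    title    = "Flower height and sunlight",
    subtitle = "Each point is one flower"
  )

labelled_scatter
```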
Modifying Theme Elements
We can change the appearance of the chart by adding themes. We will use the built-in theme_minimal() function, which drops the grey panel background and leaves light grid lines on a plain background.
And we’ll add modifications to the theme by removing the x-axis grid lines and moving the title to the top of the plot area.
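Applied to the scatter plot, this could look like the sketch below. The theme() settings are assumptions that match the description, and the title and axis mapping are illustrative:

```r
library(ggplot2)

# The flowers dataset from the table above.
flowers <- data.frame(
  name     = c("Poppy", "Rose", "Zinnia", "Peony"),
  height   = c(75, 150, 60, 90),
  season   = c("Spring", "Summer", "Summer", "Spring"),
  sunlight = c(8.3, 6.4, 8.7, 7.2),
  growth   = c("fast", "slow", "fast", "slow")
)

themed_scatter <- ggplot(data = flowers) +
  geom_point(mapping = aes(x = sunlight, y = height)) +
  labs(title = "Flower height and sunlight") +
  theme_minimal() +
  theme(
    panel.grid.major.x  = element_blank(),  # hide major grid lines tied to the x-axis
    panel.grid.minor.x  = element_blank(),  # hide the minor ones as well
    plot.title.position = "plot"            # align the title with the whole plot
  )

themed_scatter
```

A continuous x-axis also draws minor grid lines, which is why this sketch blanks both the major and minor elements.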
Exercise 3.6
Run this code to add labels to the scatter plot.
Review
Nice work! In this section, we learned how to create simple charts using ggplot2. We learned how to create a bar chart and a scatter plot. We also learned how to customize the appearance of the charts, add titles and labels, and change the color of the bars and points.
We've learned how to:
Create a bar chart using the geom_bar() function with stat = "identity"
Create a scatter plot using the geom_point() function
Add alt text, titles, and labels using the labs() function
Change the appearance of a chart using theme_minimal() and theme()
Set the axis limits using the xlim() and ylim() functions