What is reshaping data?
Reshaping data is the process of changing the layout of your data, without changing the values it contains, to make it easier to analyze. The same data can be stored in wide or long format, and we can convert it from one format to the other.
What we’ll learn in this section
- understand the concepts of wide and long data formats
- use pivot_longer() to reshape data from wide to long format
- use pivot_wider() to reshape data from long to wide format
Wide Format
In wide format, each row represents one subject (for example, one student), and each measured variable has its own column. All of a subject's values therefore sit side by side in a single row.
Consider the data frame students in wide format, in which each student’s study and play hours are stored in separate columns.
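As a concrete reference, here is a minimal sketch of how students could be built by hand. The values are taken from the students table shown later in this section; building it with tibble() is an assumption, since the original construction code is not shown.

```r
library(tibble)

# Wide format: one row per student, one column per activity.
# Values copied from the students table used later in this section.
students <- tibble(
  id      = 1:4,
  name    = c("Alia", "Bala", "Cara", "Dana"),
  section = c("A", "B", "A", "B"),
  study   = c(2, 8, NA, 4),   # Cara's study hours are missing
  play    = c(5, 5, 10, 10)
)

students
```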
Long Format
In long format, each row represents a single measurement: one column holds the name of the variable that was measured, and another column holds its value. A subject with several measurements therefore spans several rows.
Consider the data frame activity_hours in long format, in which each student’s study and play hours are stored in separate rows.
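A minimal sketch of what activity_hours could look like, assuming it holds the same values as the long-format table shown later in this section (columns id, name, section, activity, and hours). Building it by hand with tibble() is an assumption made for illustration.

```r
library(tibble)

# Long format: one row per student-activity pair.
# `activity` names what was measured, `hours` holds the measured value.
activity_hours <- tibble(
  id       = rep(1:4, each = 2),
  name     = rep(c("Alia", "Bala", "Cara", "Dana"), each = 2),
  section  = rep(c("A", "B", "A", "B"), each = 2),
  activity = rep(c("study", "play"), times = 4),
  hours    = c(2, 5, 8, 5, NA, 10, 4, 10)  # Cara's study hours are missing
)

activity_hours
```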
Reshaping Data with tidyr
The tidyr package in R provides functions to reshape data between wide and long formats. Two key functions for reshaping data are pivot_longer() and pivot_wider().
Pivot Longer
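The pivot_longer() function reshapes data from wide to long format: it gathers several columns into a pair of columns, one holding the former column names and one holding their values.
Let’s use the pivot_longer() function to reshape the students data frame from wide to long format, gathering the study and play columns into activity and hours.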
id | name | section | study | play |
---|---|---|---|---|
1 | Alia | A | 2 | 5 |
2 | Bala | B | 8 | 5 |
3 | Cara | A | NA | 10 |
4 | Dana | B | 4 | 10 |
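One way to write this call is sketched here. The result name students_long is hypothetical, and the data frame is rebuilt by hand so that the block runs on its own.

```r
library(tibble)
library(tidyr)

# The wide students data frame from the table above.
students <- tibble(
  id      = 1:4,
  name    = c("Alia", "Bala", "Cara", "Dana"),
  section = c("A", "B", "A", "B"),
  study   = c(2, 8, NA, 4),
  play    = c(5, 5, 10, 10)
)

# Gather the study and play columns:
# their names go into `activity`, their values into `hours`.
students_long <- pivot_longer(
  students,
  cols      = c(study, play),
  names_to  = "activity",
  values_to = "hours"
)

students_long
```

By default the missing study value for Cara is kept as a row with NA. pivot_longer() also accepts values_drop_na = TRUE if such rows should be dropped instead.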
Output:
Before (Wide Format)
id | name | section | study | play |
---|---|---|---|---|
1 | Alia | A | 2 | 5 |
2 | Bala | B | 8 | 5 |
3 | Cara | A | NA | 10 |
4 | Dana | B | 4 | 10 |
After (Long Format)
id | name | section | activity | hours |
---|---|---|---|---|
1 | Alia | A | study | 2 |
1 | Alia | A | play | 5 |
2 | Bala | B | study | 8 |
2 | Bala | B | play | 5 |
3 | Cara | A | study | NA |
3 | Cara | A | play | 10 |
4 | Dana | B | study | 4 |
4 | Dana | B | play | 10 |
Use pivot_longer() to reshape the following grades data frame from wide to long format. The new columns should be named “subject” and “score”.
student | math | science | history |
---|---|---|---|
Alice | 85 | 92 | 78 |
Bob | 91 | 85 | 89 |
Charlie | 76 | 88 | 95 |
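To check your answer, here is one possible solution, given as a sketch rather than the only correct approach. It assumes grades is built by hand from the table above, and it selects the columns to gather with cols = -student (every column except student) instead of listing them one by one.

```r
library(tibble)
library(tidyr)

# The wide grades data frame from the table above.
grades <- tibble(
  student = c("Alice", "Bob", "Charlie"),
  math    = c(85, 91, 76),
  science = c(92, 85, 88),
  history = c(78, 89, 95)
)

# Gather every column except `student` into subject/score pairs.
grades_long <- pivot_longer(
  grades,
  cols      = -student,
  names_to  = "subject",
  values_to = "score"
)

grades_long
```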
Pivot Wider
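The pivot_wider() function reshapes data from long to wide format: it spreads the values of one column across several new columns, which are named after the entries of another column.
Let’s use the pivot_wider() function to reshape a long format data frame into wide format, turning each activity into its own column of hours.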
id | name | section | activity | hours |
---|---|---|---|---|
1 | Alia | A | study | 2 |
1 | Alia | A | play | 5 |
2 | Bala | B | study | 8 |
2 | Bala | B | play | 5 |
3 | Cara | A | study | NA |
3 | Cara | A | play | 10 |
4 | Dana | B | study | 4 |
4 | Dana | B | play | 10 |
Output:
Before (Long Format)
id | name | section | activity | hours |
---|---|---|---|---|
1 | Alia | A | study | 2 |
1 | Alia | A | play | 5 |
2 | Bala | B | study | 8 |
2 | Bala | B | play | 5 |
3 | Cara | A | study | NA |
3 | Cara | A | play | 10 |
4 | Dana | B | study | 4 |
4 | Dana | B | play | 10 |
After (Wide Format)
id | name | section | study | play |
---|---|---|---|---|
1 | Alia | A | 2 | 5 |
2 | Bala | B | 8 | 5 |
3 | Cara | A | NA | 10 |
4 | Dana | B | 4 | 10 |
Use pivot_wider() to reshape the following grades_long data frame from long to wide format. The new columns should be named “math”, “science”, and “history”.
student | subject | score |
---|---|---|
Alice | math | 85 |
Alice | science | 92 |
Alice | history | 78 |
Bob | math | 91 |
Bob | science | 85 |
Bob | history | 89 |
Charlie | math | 76 |
Charlie | science | 88 |
Charlie | history | 95 |
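To check your answer, here is one possible solution, given as a sketch rather than the only correct approach. It assumes grades_long is built by hand from the table above; the result name grades_wide is hypothetical.

```r
library(tibble)
library(tidyr)

# The long grades_long data frame from the table above.
grades_long <- tibble(
  student = rep(c("Alice", "Bob", "Charlie"), each = 3),
  subject = rep(c("math", "science", "history"), times = 3),
  score   = c(85, 92, 78, 91, 85, 89, 76, 88, 95)
)

# Each distinct subject becomes a column; its cells are filled from `score`.
grades_wide <- pivot_wider(
  grades_long,
  names_from  = subject,
  values_from = score
)

grades_wide
```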
Review
In this section, we’ve learned about reshaping data between wide and long formats using the tidyr package in R.
- Reshape data from wide to long format using pivot_longer()
- Reshape data from long to wide format using pivot_wider()
We are almost done with the course! In the next section, we will conclude our learning journey with a summary of what we’ve covered and some self-assessment tasks. 🎉