- Select columns using pandas
- Select columns based on their positions
- Select a range of columns
- Drop columns
- Select columns based on their data type
A step by step approach using method chaining
In Python with pandas, we can use method chaining to perform operations step by step. This is similar to the pipe operator in R, but we use the dot (.) to chain methods in Python.
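As a rough sketch of what chaining looks like, the snippet below applies two steps in sequence with the dot operator. The DataFrame `df` and its column names are hypothetical placeholders invented for this illustration; they are not part of the tutorial's dataset.

```python
import pandas as pd

# Hypothetical placeholder data, used only to illustrate chaining
df = pd.DataFrame({
    "id": ["a", "b"],
    "x": [1, 2],
    "y": [0.5, 1.5],
    "label": ["u", "v"],
})

# Read as: take df, and then drop "id", and then keep the numeric columns
result = (
    df
    .drop(columns=["id"])              # step 1: remove one column
    .select_dtypes(include="number")   # step 2: keep numeric columns only
)

print(result)  # contains only the x and y columns
```

Wrapping the chain in parentheses lets each step sit on its own line, which keeps longer chains readable.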
Selecting columns with column names
In pandas, we can select columns from a DataFrame using square brackets [] or the loc accessor.
Syntax
You can read the following syntax as “take the dataset, and then select column1 and column2”.
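The original syntax block is not present here, so the following is a minimal sketch of the general pattern, assuming a placeholder DataFrame named `df` with columns `column1`, `column2`, and `column3` (all illustrative names).

```python
import pandas as pd

# Placeholder data; the names df and column1..column3 are illustrative only
df = pd.DataFrame({
    "column1": [1, 2],
    "column2": [3, 4],
    "column3": [5, 6],
})

# Square brackets with a list of column names
subset_brackets = df[["column1", "column2"]]

# Same selection with loc: all rows (:), then the listed columns
subset_loc = df.loc[:, ["column1", "column2"]]

print(subset_brackets)
```

Note the double brackets in the first form: the outer pair indexes the DataFrame and the inner pair is a Python list of column names.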
In our flowers dataset, we have these columns: name, height, season, sunlight, and growth.
name | height | season | sunlight | growth |
---|---|---|---|---|
Poppy | 75 | Spring | 8.3 | fast |
Rose | 150 | Summer | 6.4 | slow |
Zinnia | 60 | Summer | 8.7 | fast |
Peony | 90 | Spring | 7.2 | slow |
If we want to select the name and height columns, we can do this:
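A minimal sketch of that selection, assuming the flowers table above is built by hand as a DataFrame named `flowers` (the variable names are assumptions, since the original code cell is not available):

```python
import pandas as pd

# Rebuild the flowers table shown above
flowers = pd.DataFrame({
    "name": ["Poppy", "Rose", "Zinnia", "Peony"],
    "height": [75, 150, 60, 90],
    "season": ["Spring", "Summer", "Summer", "Spring"],
    "sunlight": [8.3, 6.4, 8.7, 7.2],
    "growth": ["fast", "slow", "fast", "slow"],
})

# Keep only the name and height columns
selected = flowers[["name", "height"]]

print(selected)
```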
This code will select the name and height columns from the flowers dataset and print the resulting DataFrame.
Run the code to see the output of selecting the name and height columns from the flowers dataset.
Try selecting the season and sunlight columns from the flowers dataset.
Selecting columns using column positions
We can also select columns using their positions with the iloc accessor. Positions start at 0, so the first column is position 0.
In the flowers dataset, we can select the first and third columns using their positions.
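One way this could look, assuming the same hand-built `flowers` DataFrame (a sketch, not necessarily the original cell):

```python
import pandas as pd

# Rebuild the flowers table
flowers = pd.DataFrame({
    "name": ["Poppy", "Rose", "Zinnia", "Peony"],
    "height": [75, 150, 60, 90],
    "season": ["Spring", "Summer", "Summer", "Spring"],
    "sunlight": [8.3, 6.4, 8.7, 7.2],
    "growth": ["fast", "slow", "fast", "slow"],
})

# iloc takes [rows, columns]; ":" keeps every row,
# and [0, 2] picks the first and third columns by position
first_and_third = flowers.iloc[:, [0, 2]]

print(first_and_third)  # the name and season columns
```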
This code will select the first and third columns from the flowers dataset and print the resulting DataFrame.
Run the code to see the output of selecting the first and third columns from the flowers dataset.
Try selecting the second and fourth columns from the flowers dataset.
Selecting a range of columns
We can select a range of columns using slice notation.
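A short sketch of a positional slice, assuming the slice is applied through iloc (the hint later in this section suggests positional slicing) and the same hand-built `flowers` DataFrame:

```python
import pandas as pd

# Rebuild the flowers table
flowers = pd.DataFrame({
    "name": ["Poppy", "Rose", "Zinnia", "Peony"],
    "height": [75, 150, 60, 90],
    "season": ["Spring", "Summer", "Summer", "Spring"],
    "sunlight": [8.3, 6.4, 8.7, 7.2],
    "growth": ["fast", "slow", "fast", "slow"],
})

# 0:3 covers positions 0, 1, and 2; the end position (3) is excluded
first_three = flowers.iloc[:, 0:3]

print(first_three)  # the name, height, and season columns
```

As with ordinary Python slices, the stop value is not included when slicing with iloc.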
This code will select the first three columns from the flowers dataset and print the resulting DataFrame.
Run the code to see the output of selecting the first three columns from the flowers dataset.
Try selecting the second through fourth columns.
Hint: Use 1:4 in the slice notation.
Dropping columns
If you want to drop columns, you can use the drop method.
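A minimal sketch of dropping a column, again assuming a hand-built `flowers` DataFrame and using the `columns` keyword of `drop`:

```python
import pandas as pd

# Rebuild the flowers table
flowers = pd.DataFrame({
    "name": ["Poppy", "Rose", "Zinnia", "Peony"],
    "height": [75, 150, 60, 90],
    "season": ["Spring", "Summer", "Summer", "Spring"],
    "sunlight": [8.3, 6.4, 8.7, 7.2],
    "growth": ["fast", "slow", "fast", "slow"],
})

# drop returns a new DataFrame; flowers itself is left unchanged
without_name = flowers.drop(columns=["name"])

print(without_name)
```

To drop several columns at once, pass more names in the list, for example `columns=["name", "growth"]`.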
This code will drop the name column from the flowers dataset and print the resulting DataFrame.
Selecting columns based on data type
You can select columns based on their data type using the select_dtypes method, for example to keep only the numeric columns or only the object (string) columns.
In the flowers dataset, we can select only the numeric columns:
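A sketch of the numeric case, assuming the hand-built `flowers` DataFrame and the `include="number"` selector, which matches integer and float columns:

```python
import pandas as pd

# Rebuild the flowers table
flowers = pd.DataFrame({
    "name": ["Poppy", "Rose", "Zinnia", "Peony"],
    "height": [75, 150, 60, 90],
    "season": ["Spring", "Summer", "Summer", "Spring"],
    "sunlight": [8.3, 6.4, 8.7, 7.2],
    "growth": ["fast", "slow", "fast", "slow"],
})

# Keep only the columns whose dtype is numeric (here: height and sunlight)
numeric_columns = flowers.select_dtypes(include="number")

print(numeric_columns)
```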
This code will select only the height and sunlight columns from the flowers dataset and print the resulting DataFrame.
Run the code to see the output of selecting only the numeric columns from the flowers dataset.
Try selecting only the object (string) columns from the flowers dataset.