Goals
In this section, you will learn how to:
- Calculate basic statistics of a DataFrame
- Group data and perform aggregate functions
- Use different methods to summarize data
Sample DataFrame
Let’s start with a sample DataFrame that we’ll use throughout this section:
id | name | age | city | salary |
---|---|---|---|---|
1 | Alice | 25 | New York | 50000 |
2 | Bob | 30 | San Francisco | 75000 |
3 | Carol | 35 | Chicago | 60000 |
4 | David | 40 | New York | 80000 |
5 | Emily | 28 | Chicago | 55000 |
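The table above can be built in code as follows. This is a minimal sketch: the variable name `df` is an assumption, and each later example rebuilds the same data so that it can run on its own.

```python
import pandas as pd

# Sample data copied from the table above; `df` is an assumed variable name.
df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "name": ["Alice", "Bob", "Carol", "David", "Emily"],
    "age": [25, 30, 35, 40, 28],
    "city": ["New York", "San Francisco", "Chicago", "New York", "Chicago"],
    "salary": [50000, 75000, 60000, 80000, 55000],
})

print(df)
```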
Basic Statistics
Pandas provides several methods for calculating basic statistics of a DataFrame, including describe(), mean(), and median().
Syntax
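The general pattern is to select a column and call a statistics method on it, or to call the method on the whole DataFrame. The sketch below shows a few such calls on the sample data; the variable names are illustrative assumptions.

```python
import pandas as pd

df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "name": ["Alice", "Bob", "Carol", "David", "Emily"],
    "age": [25, 30, 35, 40, 28],
    "city": ["New York", "San Francisco", "Chicago", "New York", "Chicago"],
    "salary": [50000, 75000, 60000, 80000, 55000],
})

# Pattern 1: df[column].method() returns a single value for one column.
youngest = df["age"].min()
total_payroll = df["salary"].sum()

# Pattern 2: df.method() returns one value per column.
# numeric_only=True skips the text columns (name, city).
column_means = df.mean(numeric_only=True)

print(youngest, total_payroll)
print(column_means)
```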
Exercise 3.1: Basic Statistics
Run the code below to get basic statistics of the numerical columns:
Basic Statistics
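One way to run this exercise is with `describe()`, which by default summarizes only the numerical columns (here id, age, and salary). The variable name `stats` is an assumption.

```python
import pandas as pd

df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "name": ["Alice", "Bob", "Carol", "David", "Emily"],
    "age": [25, 30, 35, 40, 28],
    "city": ["New York", "San Francisco", "Chicago", "New York", "Chicago"],
    "salary": [50000, 75000, 60000, 80000, 55000],
})

# describe() reports count, mean, std, min, quartiles, and max
# for each numerical column.
stats = df.describe()
print(stats)
```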
Exercise 3.2: Specific Statistics
Modify the code below to calculate the mean age and median salary:
Calculate mean age and median salary
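A possible solution, sketched under the assumption that the data lives in a DataFrame named `df`, selects each column and calls `mean()` or `median()` on it:

```python
import pandas as pd

df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "name": ["Alice", "Bob", "Carol", "David", "Emily"],
    "age": [25, 30, 35, 40, 28],
    "city": ["New York", "San Francisco", "Chicago", "New York", "Chicago"],
    "salary": [50000, 75000, 60000, 80000, 55000],
})

# Mean of the age column and median of the salary column.
mean_age = df["age"].mean()
median_salary = df["salary"].median()

print(f"Mean age: {mean_age}")
print(f"Median salary: {median_salary}")
```

With the sample data, the mean age is 158 / 5 = 31.6 and the median salary is the middle value of the five sorted salaries, 60000.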
Grouping and Aggregation
Pandas allows you to group data by one or more columns and perform aggregate functions on other columns.
Syntax
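The usual pattern is `df.groupby(key)[column].aggregate_function()`. The sketch below applies it to the sample data; the result names are assumptions made for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "name": ["Alice", "Bob", "Carol", "David", "Emily"],
    "age": [25, 30, 35, 40, 28],
    "city": ["New York", "San Francisco", "Chicago", "New York", "Chicago"],
    "salary": [50000, 75000, 60000, 80000, 55000],
})

# Group rows by city, pick the salary column, then aggregate each group.
# To group by several columns, pass a list of column names to groupby().
max_salary_by_city = df.groupby("city")["salary"].max()

# size() counts the rows in each group.
headcount_by_city = df.groupby("city").size()

print(max_salary_by_city)
print(headcount_by_city)
```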
Exercise 3.3: Grouping by One Column
Modify the code below to calculate the average salary for each city:
Average salary by city
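One possible solution groups by the city column and takes the mean of salary. The name `avg_salary_by_city` is an assumption.

```python
import pandas as pd

df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "name": ["Alice", "Bob", "Carol", "David", "Emily"],
    "age": [25, 30, 35, 40, 28],
    "city": ["New York", "San Francisco", "Chicago", "New York", "Chicago"],
    "salary": [50000, 75000, 60000, 80000, 55000],
})

# Mean salary within each city; the result is a Series indexed by city.
avg_salary_by_city = df.groupby("city")["salary"].mean()
print(avg_salary_by_city)
```

For the sample data this gives 57500 for Chicago, 65000 for New York, and 75000 for San Francisco.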
Exercise 3.4: Aggregating Multiple Columns
Modify the code below to calculate the average age and total salary for each city:
Average age and total salary by city
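A sketch of one approach uses `agg()` with named aggregations, so that each output column gets its own source column and function. The output column names `avg_age` and `total_salary` are assumptions chosen for readability.

```python
import pandas as pd

df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "name": ["Alice", "Bob", "Carol", "David", "Emily"],
    "age": [25, 30, 35, 40, 28],
    "city": ["New York", "San Francisco", "Chicago", "New York", "Chicago"],
    "salary": [50000, 75000, 60000, 80000, 55000],
})

# Each keyword is an output column: (source column, aggregate function).
city_summary = df.groupby("city").agg(
    avg_age=("age", "mean"),
    total_salary=("salary", "sum"),
)
print(city_summary)
```

A dictionary form, `agg({"age": "mean", "salary": "sum"})`, computes the same values but keeps the original column names.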
Exercise 3.5: Advanced Summarization
Modify the code below to get the following summary:
- Total number of employees
- Average age of employees
- Highest salary
- City with the most employees
Advanced summary
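The four items can be computed as sketched below; all variable names are assumptions. In the sample data, New York and Chicago each have two employees, so this sketch uses `mode()`, which returns every value tied for most frequent, instead of picking a single city.

```python
import pandas as pd

df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "name": ["Alice", "Bob", "Carol", "David", "Emily"],
    "age": [25, 30, 35, 40, 28],
    "city": ["New York", "San Francisco", "Chicago", "New York", "Chicago"],
    "salary": [50000, 75000, 60000, 80000, 55000],
})

total_employees = len(df)              # one row per employee
average_age = df["age"].mean()
highest_salary = df["salary"].max()
# mode() returns all cities tied for the highest count.
top_cities = df["city"].mode().tolist()

print(f"Total employees: {total_employees}")
print(f"Average age: {average_age}")
print(f"Highest salary: {highest_salary}")
print(f"City (or cities) with the most employees: {top_cities}")
```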
Summary
We've learned how to:
- Calculate basic statistics using methods like describe(), mean(), and median()
- Group data using groupby() and perform aggregate functions
- Combine multiple aggregations in a single operation
- Create custom summaries using various pandas functions