DataFrame.groupby() - ( Pandas DataFrame Basics )
Syntax
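DataFrame.groupby(by=None, axis=0, level=None, as_index=True, sort=True, group_keys=True, observed=False, dropna=True)
Note: older pandas releases also accepted a squeeze parameter; it was removed in pandas 2.0. The axis parameter is deprecated in recent releases, so check the documentation for your installed version.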
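To make two of these parameters concrete, here is a minimal sketch of how as_index and sort change the shape and order of an aggregated result. The small DataFrame and the variable names (default, flat, unsorted) are made up for illustration only.

```python
import pandas as pd

# Illustrative data, not taken from the main example
df = pd.DataFrame({
    'Department': ['Sales', 'IT', 'Sales', 'HR'],
    'Salary': [68000, 55000, 72000, 45000],
})

# Defaults: group keys become the index and are sorted
default = df.groupby('Department')['Salary'].sum()

# as_index=False keeps the group key as a regular column
flat = df.groupby('Department', as_index=False)['Salary'].sum()

# sort=False keeps groups in order of first appearance
unsorted = df.groupby('Department', sort=False)['Salary'].sum()

print(default)
print(flat)
print(unsorted)
```

With as_index=False the result is a DataFrame with Department and Salary columns, which is often more convenient for merging or exporting.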
Example
import pandas as pd

# Sample employee data
data = {
    'Name': ['John', 'Alex', 'Mia', 'Ann', 'Pat', 'Jim', 'Kim', 'Ron', 'Dan', 'Rex', 'Joe', 'Don'],
    'Department': ['IT', 'Sales', 'HR', 'IT', 'HR', 'IT', 'Sales', 'HR', 'Sales', 'IT', 'HR', 'Sales'],
    'Salary': [55000, 68000, 45000, 63000, 56000, 59000, 72000, 48000, 69000, 53000, 49000, 71000],
    'Experience': [3, 5, 2, 8, 4, 7, 6, 2, 4, 1, 3, 5]
}
df = pd.DataFrame(data)

# Split the rows into one group per department
dept_groups = df.groupby('Department')

# Each iteration yields the group key and the matching sub-DataFrame
for department, group in dept_groups:
    print('Employees in', department)
    print(group)
    print()
Output
Employees in HR
    Name Department  Salary  Experience
2    Mia         HR   45000           2
4    Pat         HR   56000           4
7    Ron         HR   48000           2
10   Joe         HR   49000           3

Employees in IT
   Name Department  Salary  Experience
0  John         IT   55000           3
3   Ann         IT   63000           8
5   Jim         IT   59000           7
9   Rex         IT   53000           1

Employees in Sales
    Name Department  Salary  Experience
1   Alex      Sales   68000           5
6    Kim      Sales   72000           6
8    Dan      Sales   69000           4
11   Don      Sales   71000           5
Explanation
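The groupby() method in Pandas groups the rows of a DataFrame by the values of one or more columns. In the example above, we group the employees by the Department column. Because sort=True by default, the groups come back in sorted key order (HR, IT, Sales). We then iterate over the groups; each iteration yields the department name and a DataFrame containing that department's rows, with their original index labels preserved.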
Use
The groupby() method is useful when you want to split your data into groups based on some criterion, such as the values in a particular column. Once the data is grouped, each group can be processed independently, for example by computing summary statistics (mean, sum, count) per group or by plotting each group separately.
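As a sketch of the per-group summary statistics mentioned above, the snippet below reuses the Department, Salary and Experience columns from the example. The result names (mean_salary, summary) and the output column names are chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Department': ['IT', 'Sales', 'HR', 'IT', 'HR', 'IT', 'Sales', 'HR', 'Sales', 'IT', 'HR', 'Sales'],
    'Salary': [55000, 68000, 45000, 63000, 56000, 59000, 72000, 48000, 69000, 53000, 49000, 71000],
    'Experience': [3, 5, 2, 8, 4, 7, 6, 2, 4, 1, 3, 5],
})

# One statistic for one column: mean salary per department
mean_salary = df.groupby('Department')['Salary'].mean()
print(mean_salary)

# Several statistics at once, using named aggregation
# (output column name = (input column, aggregation function))
summary = df.groupby('Department').agg(
    mean_salary=('Salary', 'mean'),
    max_salary=('Salary', 'max'),
    mean_experience=('Experience', 'mean'),
)
print(summary)
```

Both results are indexed by department, so a single value can be looked up with, for example, mean_salary['HR'].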
Important Points
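- The groupby() method groups the rows of a DataFrame by the values of one or more columns (or by index levels, a mapping, or a function).
- The groupby() method returns a DataFrameGroupBy object, which can be iterated over to access each group as a (key, DataFrame) pair.
- Grouping by itself computes no results; you get them by iterating over the groups or by applying an aggregation, transformation, or filter to the grouped object.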
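Besides iteration, the grouped object can be inspected directly. The following is a minimal sketch on a shortened version of the example data; the variable names (grouped, it_rows, sizes, group_labels) are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Alex', 'Mia', 'Ann'],
    'Department': ['IT', 'Sales', 'HR', 'IT'],
})

grouped = df.groupby('Department')

# The object returned by groupby() on a DataFrame
print(type(grouped).__name__)

# Number of distinct groups
print(grouped.ngroups)

# Fetch the rows of a single group without looping
it_rows = grouped.get_group('IT')
print(it_rows)

# Number of rows in each group
sizes = grouped.size()
print(sizes)

# Mapping from group key to the index labels of its rows
group_labels = {key: list(labels) for key, labels in grouped.groups.items()}
print(group_labels)
```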
Summary
In conclusion, the groupby() method in Pandas is a powerful tool for splitting the rows of a DataFrame into groups based on some criterion. It returns a DataFrameGroupBy object that can be iterated over to access each group, or combined with aggregation methods to compute per-group results, which makes it a core building block for data analysis and processing tasks.