
DataFrame.groupby() (Pandas DataFrame Basics)


Syntax

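DataFrame.groupby(by=None, axis=0, level=None, as_index=True, sort=True, group_keys=True, squeeze=False, **kwargs)

This is the signature from older pandas releases. The squeeze parameter was removed in pandas 2.0 and axis is deprecated as of pandas 2.1; current releases also accept the observed and dropna parameters.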

Example

import pandas as pd

# Sample employee records: one list per column, twelve rows in total.
data = {
    'Name': ['John', 'Alex', 'Mia', 'Ann', 'Pat', 'Jim', 'Kim', 'Ron', 'Dan', 'Rex', 'Joe', 'Don'],
    'Department': ['IT', 'Sales', 'HR', 'IT', 'HR', 'IT', 'Sales', 'HR', 'Sales', 'IT', 'HR', 'Sales'],
    'Salary': [55000, 68000, 45000, 63000, 56000, 59000, 72000, 48000, 69000, 53000, 49000, 71000],
    'Experience': [3, 5, 2, 8, 4, 7, 6, 2, 4, 1, 3, 5],
}

df = pd.DataFrame(data)

# Split the rows into one group per distinct Department value.
dept_groups = df.groupby('Department')

# Iterating yields (group key, sub-DataFrame) pairs.
for department, group in dept_groups:
    print('Employees in', department)
    print(group)
    print()

Output

Employees in HR
   Name Department  Salary  Experience
2   Mia         HR   45000           2
4   Pat         HR   56000           4
7   Ron         HR   48000           2
10  Joe         HR   49000           3

Employees in IT
    Name Department  Salary  Experience
0   John         IT   55000           3
3    Ann         IT   63000           8
5    Jim         IT   59000           7
9    Rex         IT   53000           1

Employees in Sales
   Name Department  Salary  Experience
1  Alex      Sales   68000           5
6   Kim      Sales   72000           6
8   Dan      Sales   69000           4
11  Don      Sales   71000           5

Explanation

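The groupby() method splits the rows of a DataFrame into groups that share the same value in one or more columns. In the example above, the employees are grouped by Department. Iterating over the grouped object yields one (key, sub-DataFrame) pair per department, and each sub-DataFrame keeps the original row index. The groups appear in sorted key order (HR, IT, Sales) because sort=True is the default.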
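Iteration is not the only way to look inside the groups. The following is a minimal sketch, using a shortened four-row version of the employee table (an assumption made to keep the example small), of how the group keys, the group sizes, and a single group can be inspected:

```python
import pandas as pd

# Shortened illustration data (a subset of the article's table).
df = pd.DataFrame({
    'Name': ['John', 'Alex', 'Mia', 'Ann'],
    'Department': ['IT', 'Sales', 'HR', 'IT'],
    'Salary': [55000, 68000, 45000, 63000],
})

dept_groups = df.groupby('Department')

# Keys come back in sorted order because sort=True is the default.
group_keys = [key for key, _ in dept_groups]
print(group_keys)

# size() counts the rows in each group.
group_sizes = dept_groups.size()
print(group_sizes)

# get_group() returns the sub-DataFrame for a single key.
it_rows = dept_groups.get_group('IT')
print(it_rows)
```

get_group() is convenient when only one group is of interest and a full loop would be unnecessary.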

Use

The groupby() function is useful when you want to split your data into groups based on some criteria, such as the values in a particular column. Once the data is grouped, it can be processed or analyzed in various ways, such as computing summary statistics for each group or plotting the data for each group separately.
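As an illustration of computing summary statistics per group, the sketch below rebuilds the employee table from the example and aggregates it by department. The output column names (mean_salary, max_experience, headcount) are chosen here for illustration and are not part of the original example.

```python
import pandas as pd

data = {
    'Name': ['John', 'Alex', 'Mia', 'Ann', 'Pat', 'Jim',
             'Kim', 'Ron', 'Dan', 'Rex', 'Joe', 'Don'],
    'Department': ['IT', 'Sales', 'HR', 'IT', 'HR', 'IT',
                   'Sales', 'HR', 'Sales', 'IT', 'HR', 'Sales'],
    'Salary': [55000, 68000, 45000, 63000, 56000, 59000,
               72000, 48000, 69000, 53000, 49000, 71000],
    'Experience': [3, 5, 2, 8, 4, 7, 6, 2, 4, 1, 3, 5],
}
df = pd.DataFrame(data)

# Named aggregation: each keyword becomes an output column,
# defined as (source column, aggregation function).
summary = df.groupby('Department').agg(
    mean_salary=('Salary', 'mean'),
    max_experience=('Experience', 'max'),
    headcount=('Name', 'count'),
)
print(summary)
```

The result has one row per department, indexed by the Department values, so individual figures can be looked up with summary.loc['IT', 'mean_salary'].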

Important Points

  • The groupby() method groups the rows of a DataFrame by the values of one or more columns
  • It returns a DataFrameGroupBy object, which can be iterated over as (key, group) pairs to access each group
  • Grouping is usually followed by a per-group operation such as an aggregation, which makes groupby() a core tool for data analysis
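The points above, together with two of the parameters from the Syntax section (as_index and sort), can be sketched as follows. The six-row table is a shortened illustration, not the article's full data.

```python
import pandas as pd

# Shortened illustration data.
df = pd.DataFrame({
    'Department': ['IT', 'Sales', 'HR', 'IT', 'HR', 'Sales'],
    'Salary': [55000, 68000, 45000, 63000, 56000, 72000],
})

# groupby() itself only builds the grouped object.
grouped = df.groupby('Department')
grouped_type = type(grouped).__name__
print(grouped_type)

# as_index=False keeps the group key as a regular column
# instead of moving it into the index.
flat = df.groupby('Department', as_index=False)['Salary'].mean()
print(flat)

# sort=False keeps groups in order of first appearance
# rather than sorting them by key.
unsorted = df.groupby('Department', sort=False)['Salary'].mean()
print(unsorted)
```

as_index=False is handy when the result feeds straight into a merge or an export, where the key is easier to handle as an ordinary column.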

Summary

groupby() splits a DataFrame into groups of rows that share the same key, such as a value in the Department column. It returns a DataFrameGroupBy object that can be iterated over as (key, group) pairs or combined with aggregation methods to compute per-group results.
