
DataFrame.pivot_table() - ( Pandas DataFrame Basics )

Syntax

DataFrame.pivot_table(values=None, index=None, columns=None, 
                      aggfunc='mean', fill_value=None, margins=False, 
                      dropna=True, margins_name='All')

Example

import pandas as pd

df = pd.DataFrame({'A': ['foo', 'foo', 'bar', 'bar', 'foo', 'foo', 'bar', 'bar'],
                   'B': ['one', 'one', 'one', 'two', 'two', 'one', 'one', 'two'],
                   'C': [1, 2, 3, 4, 5, 6, 7, 8],
                   'D': [9, 10, 11, 12, 13, 14, 15, 16]})

table = pd.pivot_table(df, values='D', index=['A', 'B'], columns=['C'], aggfunc='sum')
print(table)

Output

C          1     2     3     4     5     6     7     8
A   B                                                 
bar one  NaN   NaN  11.0   NaN   NaN   NaN  15.0   NaN
    two  NaN   NaN   NaN  12.0   NaN   NaN   NaN  16.0
foo one  9.0  10.0   NaN   NaN   NaN  14.0   NaN   NaN
    two  NaN   NaN   NaN   NaN  13.0   NaN   NaN   NaN
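
The NaN cells in the output mark (A, B, C) combinations that have no rows in df. As a minimal sketch, the fill_value parameter from the syntax above can replace them with a placeholder; the variable name filled and the choice of 0 are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'A': ['foo', 'foo', 'bar', 'bar', 'foo', 'foo', 'bar', 'bar'],
                   'B': ['one', 'one', 'one', 'two', 'two', 'one', 'one', 'two'],
                   'C': [1, 2, 3, 4, 5, 6, 7, 8],
                   'D': [9, 10, 11, 12, 13, 14, 15, 16]})

# Same pivot as the example, but empty cells are filled with 0 instead of NaN.
# 0 is an arbitrary placeholder chosen for this illustration.
filled = df.pivot_table(values='D', index=['A', 'B'], columns='C',
                        aggfunc='sum', fill_value=0)
print(filled)
```

Whether 0 is a sensible fill depends on the data: for sums it reads naturally as "nothing to add", but for means it can be mistaken for a real measurement.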

Explanation

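The DataFrame.pivot_table() method creates a spreadsheet-style pivot table and returns it as a new DataFrame. The example calls the top-level function pd.pivot_table(df, ...), which is equivalent to df.pivot_table(...). Rows that share the same index and columns keys are combined with aggfunc, and key combinations with no rows appear as NaN.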

The values parameter specifies the column or columns to aggregate.

The index parameter specifies which column(s) to use as the row index.

The columns parameter specifies which column(s) to use as the column index.

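The aggfunc parameter specifies how the values in each group are reduced to a single cell. It accepts a function name such as 'sum' or 'count', a callable, or a list of these. The default value is 'mean'.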
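
To illustrate the effect of aggfunc, the sketch below builds the same table twice, once with the default and once with 'sum'. It reuses the columns from the example; the names mean_table and sum_table are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'A': ['foo', 'foo', 'bar', 'bar', 'foo', 'foo', 'bar', 'bar'],
                   'B': ['one', 'one', 'one', 'two', 'two', 'one', 'one', 'two'],
                   'D': [9, 10, 11, 12, 13, 14, 15, 16]})

# No aggfunc given, so each (A, B) cell holds the mean of D for that group.
mean_table = df.pivot_table(values='D', index='A', columns='B')

# Same layout, but each cell holds the sum of D for that group.
sum_table = df.pivot_table(values='D', index='A', columns='B', aggfunc='sum')

print(mean_table)
print(sum_table)
```

For the group A='foo', B='one' the D values are 9, 10 and 14, so the first table shows their mean and the second their total.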

Use

DataFrame.pivot_table() is useful for summarizing long-format data: it groups rows by the index and columns keys and reduces each group to a single value. With a counting aggregation it also produces cross-tabulations and contingency tables.
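
As a rough sketch of the contingency-table use, the code below counts rows per (A, B) pair and adds totals with margins=True. The comparison with pd.crosstab is included only as a cross-check; the names counts and ct are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'A': ['foo', 'foo', 'bar', 'bar', 'foo', 'foo', 'bar', 'bar'],
                   'B': ['one', 'one', 'one', 'two', 'two', 'one', 'one', 'two'],
                   'D': [9, 10, 11, 12, 13, 14, 15, 16]})

# Count how many rows fall into each (A, B) cell; margins=True adds an
# 'All' row and column holding the totals (the label comes from margins_name).
counts = df.pivot_table(values='D', index='A', columns='B',
                        aggfunc='count', margins=True)
print(counts)

# pd.crosstab builds the same kind of frequency table directly from two columns.
ct = pd.crosstab(df['A'], df['B'], margins=True)
print(ct)
```

Counting a column such as D only counts its non-missing entries, so this matches a plain frequency table when D has no missing values, as is the case here.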

Important Points

  • DataFrame.pivot_table() returns a new DataFrame and does not modify the original
  • The values parameter specifies the column or columns to aggregate
  • The index parameter specifies which column(s) to use as the row index
  • The columns parameter specifies which column(s) to use as the column index
  • The aggfunc parameter specifies how to aggregate the data. The default value is 'mean'
  • Key combinations with no matching rows appear as NaN unless fill_value is set

Summary

DataFrame.pivot_table() builds a pivot table as a DataFrame by grouping rows on the index and columns keys and aggregating the values column with aggfunc. The same call covers summaries, cross-tabulations, and contingency tables.
