Multiple Index (Pandas Indexing)

Syntax

df.loc[(level1_key, level2_key), column_name]
df.loc[[(level1_key, level2_key), (level1_key, level2_key)], column_name]

Example

import pandas as pd

df = pd.DataFrame({'Country': ['USA', 'Canada', 'USA', 'Canada'],
                   'Year': [2010, 2010, 2011, 2011],
                   'GDP': [14964.4, 15826.7, 15542.2, 16603.4],
                   'Population': [309, 34, 311, 35]})

# create a multi-index DataFrame
df.set_index(['Country', 'Year'], inplace=True)

# select GDP values for USA in 2010 and Canada in 2011
gdp = df.loc[[('USA', 2010), ('Canada', 2011)], 'GDP']

print(gdp)

Output

Country  Year
USA      2010    14964.4
Canada   2011    16603.4
Name: GDP, dtype: float64

Explanation

In Pandas, a multiple index (MultiIndex) is a hierarchical index with two or more levels for a DataFrame. The df.set_index() method builds one when it is given a list of columns, with each column becoming one level of the index.
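
As a minimal sketch of how the index is built and inspected, the snippet below reuses the sample data from the example. The variable names indexed and restored are only illustrative, and set_index() is called without inplace so that it returns a new DataFrame.

```python
import pandas as pd

df = pd.DataFrame({'Country': ['USA', 'Canada', 'USA', 'Canada'],
                   'Year': [2010, 2010, 2011, 2011],
                   'GDP': [14964.4, 15826.7, 15542.2, 16603.4],
                   'Population': [309, 34, 311, 35]})

# passing a list of columns produces a MultiIndex with one level per column
indexed = df.set_index(['Country', 'Year'])

print(type(indexed.index).__name__)  # MultiIndex
print(indexed.index.nlevels)         # 2
print(list(indexed.index.names))     # ['Country', 'Year']

# reset_index() turns the index levels back into ordinary columns
restored = indexed.reset_index()
print(list(restored.columns))        # ['Country', 'Year', 'GDP', 'Population']
```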

To select specific rows and columns from a multi-index DataFrame, we can use the df.loc[] accessor with one key per index level, written as a tuple such as ('USA', 2010). To select several rows at once, we pass a list of such tuples. The ampersand (&) operator is not part of this key-based lookup; it is only needed when combining boolean masks to filter by conditions.
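
The three common forms of key-based selection can be sketched as follows, again on the sample data from the example. The variable names usa_2010, picked, and canada are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'Country': ['USA', 'Canada', 'USA', 'Canada'],
                   'Year': [2010, 2010, 2011, 2011],
                   'GDP': [14964.4, 15826.7, 15542.2, 16603.4],
                   'Population': [309, 34, 311, 35]}
                  ).set_index(['Country', 'Year'])

# full key tuple plus a column label: returns a single value
usa_2010 = df.loc[('USA', 2010), 'GDP']

# list of key tuples: returns a Series that keeps the MultiIndex
picked = df.loc[[('USA', 2010), ('Canada', 2011)], 'GDP']

# partial key (first level only): returns every year for that country
canada = df.loc['Canada']

print(usa_2010)
print(picked)
print(canada)
```

A partial key drops the matched level from the result, so canada is indexed by Year alone.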

In the above example, we create a multi-index DataFrame with the columns 'Country' and 'Year' as the index. We then select the GDP values for 'USA' in 2010 and 'Canada' in 2011 by addressing each row with a (Country, Year) key tuple in df.loc[] and keeping only the 'GDP' column. The result is a single Series that retains both index levels. No boolean conditions or ampersand (&) operator are involved.

Use

Multiple indexing in Pandas is useful for handling and analyzing data with complex hierarchical structures. It allows us to access and manipulate data at different levels of the index.
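
To illustrate working at different levels of the index, here is a short sketch on the same sample data. It assumes the index is sorted first with sort_index(), and the names year_2011, mean_gdp, and wide are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Country': ['USA', 'Canada', 'USA', 'Canada'],
                   'Year': [2010, 2010, 2011, 2011],
                   'GDP': [14964.4, 15826.7, 15542.2, 16603.4],
                   'Population': [309, 34, 311, 35]}
                  ).set_index(['Country', 'Year']).sort_index()

# cross-section on the inner level: all countries for one year
year_2011 = df.xs(2011, level='Year')

# aggregate over the outer level: mean GDP per country
mean_gdp = df.groupby(level='Country')['GDP'].mean()

# pivot the inner level into columns for a wide view
wide = df['GDP'].unstack('Year')

print(year_2011)
print(mean_gdp)
print(wide)
```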

Important Points

  • A multiple index (MultiIndex) is a hierarchical index with two or more levels for a DataFrame in Pandas
  • The df.set_index() method builds a MultiIndex when it is given a list of columns
  • We can use the df.loc[] accessor with a tuple of index keys, or a list of such tuples, to select specific rows and columns from a multi-index DataFrame
  • The ampersand (&) operator applies to boolean masks, not to key tuples
  • Multi-indexing can be used for analyzing and manipulating data with complex hierarchical structures
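
When a selection is expressed as conditions rather than exact keys, boolean masks built from the index levels can be combined with the ampersand (&) operator. A minimal sketch on the sample data, with illustrative variable names:

```python
import pandas as pd

df = pd.DataFrame({'Country': ['USA', 'Canada', 'USA', 'Canada'],
                   'Year': [2010, 2010, 2011, 2011],
                   'GDP': [14964.4, 15826.7, 15542.2, 16603.4],
                   'Population': [309, 34, 311, 35]}
                  ).set_index(['Country', 'Year'])

# pull each index level out as a flat array of labels
countries = df.index.get_level_values('Country')
years = df.index.get_level_values('Year')

# each comparison needs its own parentheses because & binds tighter than ==
mask = (countries == 'USA') & (years >= 2011)

result = df.loc[mask, 'GDP']
print(result)
```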

Summary

In conclusion, multiple indexing in Pandas is a powerful tool for working with hierarchical data. We can use it to select and manipulate data at different levels of the index. The df.loc[] accessor with a tuple of index keys, or a list of such tuples, can be used to select specific rows and columns from a multi-index DataFrame.
