Multiple Index (Pandas Indexing)
Syntax
df.loc[(level0_label, level1_label), column_name]
df.loc[[(level0_label, level1_label), (level0_label, level1_label)], column_name]
Example
import pandas as pd

df = pd.DataFrame({'Country': ['USA', 'Canada', 'USA', 'Canada'],
                   'Year': [2010, 2010, 2011, 2011],
                   'GDP': [14964.4, 15826.7, 15542.2, 16603.4],
                   'Population': [309, 34, 311, 35]})
# create a multi-index DataFrame
df.set_index(['Country', 'Year'], inplace=True)
# select GDP values for USA in 2010 and Canada in 2011
# (a list of index tuples selects several rows in one call;
# Series.append was removed in pandas 2.0, so it is not used here)
gdp = df.loc[[('USA', 2010), ('Canada', 2011)], 'GDP']
print(gdp)
Output
Country  Year
USA      2010    14964.4
Canada   2011    16603.4
Name: GDP, dtype: float64
Explanation
In Pandas, multiple indexing (a MultiIndex) is a way to create a hierarchical index for a DataFrame. The df.set_index() method can be used to set multiple columns as the index.
To select specific rows and columns from a multi-index DataFrame, we can use the df.loc[] accessor with a tuple that holds one label per index level, followed by the column to return. To select several rows at once, we pass a list of such tuples. The ampersand (&) operator is not used with index tuples; it combines boolean masks, for example masks built from the values of each index level.
In the above example, we create a multi-index DataFrame with the columns 'Country' and 'Year' as the index. We then select the GDP values for 'USA' in 2010 and 'Canada' in 2011 by passing each row's key to the df.loc[] accessor as a (Country, Year) tuple and restricting the selection to the 'GDP' column. The result is a single Series that holds both GDP values.
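As a minimal sketch of the selection forms described above, the following reuses the same sample data. The variable names usa_2010_gdp and two_rows are illustrative and not part of the original example.

```python
import pandas as pd

df = pd.DataFrame({'Country': ['USA', 'Canada', 'USA', 'Canada'],
                   'Year': [2010, 2010, 2011, 2011],
                   'GDP': [14964.4, 15826.7, 15542.2, 16603.4],
                   'Population': [309, 34, 311, 35]})
df = df.set_index(['Country', 'Year'])

# A single (Country, Year) tuple plus one column label returns a scalar
usa_2010_gdp = df.loc[('USA', 2010), 'GDP']

# A list of tuples selects several rows and keeps the MultiIndex;
# a list of column labels returns a DataFrame rather than a Series
two_rows = df.loc[[('USA', 2010), ('Canada', 2011)], ['GDP', 'Population']]

print(usa_2010_gdp)
print(two_rows)
```

Passing a single column label instead of a list in the second call would return a Series.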
Use
Multiple indexing in Pandas is useful for handling and analyzing data with complex hierarchical structures. It allows us to access and manipulate data at different levels of the index.
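Working at different levels of the index might look like the sketch below, which reuses the sample data from the example. Sorting the index first is an assumption made here to keep partial lookups efficient, and the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Country': ['USA', 'Canada', 'USA', 'Canada'],
                   'Year': [2010, 2010, 2011, 2011],
                   'GDP': [14964.4, 15826.7, 15542.2, 16603.4],
                   'Population': [309, 34, 311, 35]})
# Sorting the MultiIndex avoids performance warnings on partial lookups
df = df.set_index(['Country', 'Year']).sort_index()

# Outer level: a single label returns every row for that country,
# indexed by the remaining level (Year)
usa = df.loc['USA']

# Inner level: xs() takes a cross-section on a named level,
# leaving the other level (Country) as the index
year_2010 = df.xs(2010, level='Year')

# Aggregation per level: group rows by the Country level
population_total = df.groupby(level='Country')['Population'].sum()

print(usa)
print(year_2010)
print(population_total)
```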
Important Points
- Multiple indexing is a way to create a hierarchical index for a DataFrame in Pandas
- The df.set_index() method can be used to create a multi-level index from multiple columns
- We can use the df.loc[] accessor with a tuple of labels (one per level), or a list of such tuples, to select specific rows and columns from a multi-index DataFrame; the ampersand (&) operator combines boolean masks and is not used to join index tuples
- Multi-indexing can be used for analyzing and manipulating data with complex hierarchical structures
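One way to use the ampersand (&) operator with a multi-index DataFrame is to build a boolean mask from the values of each index level. The sketch below assumes the same sample data as the example. The condition (USA rows from 2011 onward) is chosen only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Country': ['USA', 'Canada', 'USA', 'Canada'],
                   'Year': [2010, 2010, 2011, 2011],
                   'GDP': [14964.4, 15826.7, 15542.2, 16603.4],
                   'Population': [309, 34, 311, 35]})
df = df.set_index(['Country', 'Year'])

# Pull the values of each index level out as flat arrays
country = df.index.get_level_values('Country')
year = df.index.get_level_values('Year')

# Each comparison yields a boolean array; & combines them element-wise.
# The parentheses are required because & binds tighter than == and >=.
mask = (country == 'USA') & (year >= 2011)

result = df.loc[mask, 'GDP']
print(result)
```

A mask is useful for range or inequality conditions, while tuples of labels are simpler for exact matches.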
Summary
In conclusion, multiple indexing in Pandas is a powerful tool for working with hierarchical data. We can use it to easily select and manipulate data at different levels of the index. The df.loc[] accessor, given tuples of index labels, can be used to select specific rows and columns from a multi-index DataFrame, while the ampersand (&) operator applies when combining boolean masks.