
Pandas: How to Drop Rows in Pandas

Dropping rows from a Pandas DataFrame is a common operation in data cleaning, for example when rows contain missing or irrelevant values. This guide covers the syntax, a worked example with its output, an explanation, use cases, important points, and a summary.

Syntax

import pandas as pd

# Dropping rows based on conditions
df.drop(df[df['column'] > threshold].index, inplace=True)

# Dropping rows by index
df.drop(index=[index1, index2], inplace=True)

  • df: The Pandas DataFrame.
  • column: The column used for condition-based row dropping.
  • threshold: The threshold value for the condition.
  • index1, index2: The index labels of the rows to be dropped.
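
The index-based form can be tried on its own with a small sketch like the one below. The sample data mirrors the example in this guide, and the names by_number and by_name are made up for illustration. It assumes the default integer index in the first call and a string index, created with set_index, in the second.

```python
import pandas as pd

# Sample data (illustrative, same people as in the example in this guide)
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, 30, 35, 40],
})

# Default integer index: drop the rows labelled 1 and 3
by_number = df.drop(index=[1, 3])

# String index: after set_index('Name'), rows are dropped by name
by_name = df.set_index('Name').drop(index=['Bob'])

print(by_number)
print(by_name)
```

Because inplace is left at its default here, df itself still holds all four rows after both calls.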

Example

import pandas as pd

# Creating a DataFrame
data = {'Name': ['Alice', 'Bob', 'Charlie', 'David'],
        'Age': [25, 30, 35, 40],
        'City': ['New York', 'San Francisco', 'Los Angeles', 'Chicago']}

df = pd.DataFrame(data)

# Dropping rows where age is greater than 30
df.drop(df[df['Age'] > 30].index, inplace=True)

# Displaying the modified DataFrame
print(df)

Output

    Name  Age           City
0  Alice   25       New York
1    Bob   30  San Francisco

Explanation

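  • The df.drop() method removes rows by index label. To drop rows by condition, filter the DataFrame with the condition and pass the resulting .index to drop().
  • In the example, df[df['Age'] > 30].index holds the labels of the rows for Charlie and David (ages 35 and 40), so only Alice and Bob remain.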
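
To see what the condition-based call does step by step, the sketch below splits it into a boolean mask, the matching index labels, and the drop itself. The variable names (mask, labels, dropped, filtered) are illustrative. It also shows boolean indexing with the inverted mask, which gives the same result for this data and is a common alternative to drop().

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, 30, 35, 40],
})

mask = df['Age'] > 30        # boolean Series, True for the rows to remove
labels = df[mask].index      # index labels of those rows
dropped = df.drop(labels)    # new DataFrame without them
filtered = df[~mask]         # same rows kept via boolean indexing

print(list(labels))          # [2, 3]
print(dropped.equals(filtered))  # True
```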

Use Cases

  • Dropping rows is useful for data cleaning, filtering, and removing unnecessary information.
  • It helps in handling missing or outlier data points that might affect analysis.
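
As a sketch of the missing-data use case, the snippet below removes rows whose Age is missing. The data is invented for illustration. It uses the same drop-by-index pattern as the rest of this guide and compares the result with dropna(subset=...), the dedicated Pandas method for this task.

```python
import pandas as pd

# Illustrative data with one missing age
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, None, 35],
})

# Index labels of the rows where Age is missing
missing = df[df['Age'].isna()].index
cleaned = df.drop(missing)

# dropna restricted to the Age column removes the same rows
same = cleaned.equals(df.dropna(subset=['Age']))

print(cleaned)
print(same)
```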

Important Points

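  • The inplace=True parameter modifies the DataFrame in place. With the default, inplace=False, the original is left unchanged and a new DataFrame without the dropped rows is returned, so the result must be assigned to a variable.
  • df.drop() matches rows by index label, not by position. When using a condition, check which rows it selects (for example, print df[condition]) before dropping them. If the index contains duplicate labels, every row carrying a listed label is removed.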
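
The points above can be checked with a short sketch. The DataFrame here is hypothetical and deliberately uses a duplicate index label ('a') to show that drop() works on labels, not positions.

```python
import pandas as pd

# Hypothetical data; the label 'a' appears twice on purpose
df = pd.DataFrame({'Age': [25, 30, 35]}, index=['a', 'b', 'a'])

# Default (inplace=False): df is untouched, the result is a new DataFrame
without_b = df.drop(index=['b'])

# inplace=True: the object itself is modified
working = df.copy()
working.drop(index=['b'], inplace=True)

# Duplicate labels: both rows labelled 'a' are removed
without_a = df.drop(index=['a'])

print(len(df), len(without_b), len(working), len(without_a))  # 3 2 2 1
```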

Summary

Dropping rows is a routine step in cleaning and preparing data in Pandas. Whether you need to remove outliers, filter on a condition, or discard rows with missing data, df.drop() handles it by taking the index labels of the rows to remove. Use inplace=True to modify the DataFrame directly, or keep the default to get a new DataFrame and leave the original unchanged.
