DataFrame.dropna() (Pandas Data Operations and Processing)
Syntax
DataFrame.dropna(axis=0, how='any', thresh=None, subset=None, inplace=False)
Example
import pandas as pd
data = {'name': ['John', 'Mary', 'Joe', 'Jane', 'Eva', 'Tom', 'Bob', 'Ann'],
        'age': [20, 30, None, 40, 25, 35, None, None],
        'country': [None, 'USA', 'UK', 'France', None, 'USA', 'Japan', 'Canada']}
df = pd.DataFrame(data)
# drop rows with missing values
df = df.dropna()
print(df)
Output
   name   age country
1  Mary  30.0     USA
3  Jane  40.0  France
5   Tom  35.0     USA
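Before dropping anything, it can help to see how much data is missing and how many rows the call will remove. The following is a minimal sketch on the same data; the variable names missing_per_column, cleaned, and rows_dropped are illustrative and not part of the original example.

```python
import pandas as pd

data = {'name': ['John', 'Mary', 'Joe', 'Jane', 'Eva', 'Tom', 'Bob', 'Ann'],
        'age': [20, 30, None, 40, 25, 35, None, None],
        'country': [None, 'USA', 'UK', 'France', None, 'USA', 'Japan', 'Canada']}
df = pd.DataFrame(data)

# count the missing values in each column
missing_per_column = df.isna().sum()
print(missing_per_column)

# dropna() returns a new DataFrame; df itself is left unchanged
cleaned = df.dropna()
rows_dropped = len(df) - len(cleaned)
print(rows_dropped)  # 5 of the 8 rows contain at least one missing value
```

Assigning the result to a new name, as here, keeps the original rows available for comparison.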
Explanation
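The dropna() method removes rows or columns that contain missing values (such as None or NaN) from a DataFrame. By default, it removes every row that contains at least one missing value.
The axis, how, thresh, and subset parameters control this behavior: axis selects whether rows or columns are dropped, how chooses between dropping on any missing value or only when all values are missing, thresh sets the minimum number of non-missing values needed to keep a row or column, and subset limits the check to the given labels.
In the above example, we create a DataFrame with some missing values and then remove every row that has at least one of them using dropna(). Only rows 1, 3, and 5 have a value in all three columns, so only they remain. The age values print as floats because the column contained NaN before the rows were dropped.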
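As a rough illustration of these parameters, the sketch below reuses the example data and applies subset, how, and axis one at a time. The result names (by_age, not_all_missing, complete_columns) are made up for this sketch.

```python
import pandas as pd

data = {'name': ['John', 'Mary', 'Joe', 'Jane', 'Eva', 'Tom', 'Bob', 'Ann'],
        'age': [20, 30, None, 40, 25, 35, None, None],
        'country': [None, 'USA', 'UK', 'France', None, 'USA', 'Japan', 'Canada']}
df = pd.DataFrame(data)

# subset: only the 'age' column is checked, so rows missing just a country stay
by_age = df.dropna(subset=['age'])
print(by_age)

# how='all': a row is dropped only if every value in it is missing
# (no such row exists here, so all 8 rows are kept)
not_all_missing = df.dropna(how='all')
print(len(not_all_missing))

# axis=1: drop columns, rather than rows, that contain any missing value
complete_columns = df.dropna(axis=1)
print(complete_columns.columns.tolist())  # only 'name' has no missing values
```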
Use
The dropna() method is typically used to clean a DataFrame before analysis or modeling. It can remove rows or columns that contain missing values, or, with thresh, only those rows or columns that have fewer than a required number of non-missing values.
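The thresh behavior is easiest to see on a small table. This sketch uses a made-up three-row DataFrame (not the data from the example above) and assumes a threshold of 2 non-missing values.

```python
import numpy as np
import pandas as pd

# illustrative data: row 1 is missing one value, row 2 is missing all three
df = pd.DataFrame({
    'name': ['Mary', 'Joe', None],
    'age': [30, np.nan, np.nan],
    'country': ['USA', 'UK', None],
})

# keep only rows that have at least 2 non-missing values
rows_kept = df.dropna(thresh=2)
print(rows_kept)

# same rule applied to columns: 'age' has a single non-missing value, so it is dropped
columns_kept = df.dropna(axis=1, thresh=2)
print(columns_kept.columns.tolist())
```

Note that thresh counts the values that are present, not the values that are missing, so a higher thresh is stricter.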
Important Points
- dropna() removes rows or columns that contain missing values from a DataFrame
- It takes several parameters to control its behavior, such as axis, how, thresh, and subset
- By default, it removes any rows that contain any missing values
- It is typically used to clean a DataFrame before performing analysis or modeling
Summary
In conclusion, dropna() is a useful Pandas method for removing rows or columns with missing values from a DataFrame. It is widely used to clean and prepare data before analysis or modeling. By default, it removes any rows that contain any missing values, but its parameters let you customize what gets dropped.