DataFrame.count() - Pandas DataFrame Basics
The count() method is a built-in Pandas method that counts the non-null values in each column (or, optionally, each row) of a DataFrame. It is useful in data preparation and cleaning, where you need to know how many valid values each column holds and, by subtracting from the row count, how many are missing.
Syntax
The basic syntax of the count() method in Pandas is as follows:
DataFrame.count(axis=0, level=None, numeric_only=False)
- axis: the axis to count along. The default, 0 (or 'index'), produces one count per column; 1 (or 'columns') produces one count per row.
- level: if the axis is a MultiIndex, the level by which the counts are grouped. This parameter was deprecated in pandas 1.3 and removed in pandas 2.0, so it is available only in older versions.
- numeric_only: if set to True, only float, int, and boolean columns are counted. The default is False, meaning all columns are counted.
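To make the axis and numeric_only parameters concrete, here is a minimal sketch. The sample data and the variable names (per_column, per_row, numeric) are made up for illustration and are not part of the tutorial's own example.

```python
import pandas as pd

# Hypothetical sample data; None becomes NaN in the numeric columns.
df = pd.DataFrame({
    "name": ["Alice", "Bob", "Charlie"],
    "age": [25, None, 35],
    "weight": [55.5, 70.2, None],
})

# axis=0 (the default): one non-null count per column.
per_column = df.count()

# axis=1: one non-null count per row.
per_row = df.count(axis=1)

# numeric_only=True: the non-numeric "name" column is left out.
numeric = df.count(numeric_only=True)

print(per_column)
print(per_row)
print(numeric)
```

With axis=1 the result is indexed by the row labels rather than the column names, which is handy for spotting rows that are mostly empty.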
Example
Consider the following example, which demonstrates the count() method on a simple DataFrame:
import pandas as pd
data = {'name': ['Alice', 'Bob', 'Charlie', 'David', 'Eva'],
        'age': [25, 30, None, 20, 35],
        'weight': [55.5, 70.2, 80.1, None, 65.7]}
df = pd.DataFrame(data)
print(df.count())
In this example, we create a DataFrame with 3 columns (name, age, and weight) and 5 rows of data. The age and weight columns each contain one null value. The count() method counts the non-null values in each column of the DataFrame. The output of this code is as follows:
name      5
age       4
weight    4
dtype: int64
This means that the name column has 5 non-null values, one for each of the 5 rows, while the age and weight columns have 4 non-null values each because each contains one missing value.
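Because count() reports only the non-null values, the number of missing values has to be derived from it. The sketch below reuses the example data and subtracts the counts from the row total; the variable name missing is chosen for illustration, and the comparison with isna().sum() is included only as a cross-check.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["Alice", "Bob", "Charlie", "David", "Eva"],
    "age": [25, 30, None, 20, 35],
    "weight": [55.5, 70.2, 80.1, None, 65.7],
})

# Missing values per column = total rows minus non-null values.
missing = len(df) - df.count()

# isna().sum() counts the nulls directly and should agree with the subtraction.
assert (missing == df.isna().sum()).all()

print(missing)
```

Here name has no missing values, while age and weight have one each.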
Explanation
The count() method counts the non-null values in each column of the DataFrame. It returns a Series whose index holds the column names of the DataFrame and whose values are the number of non-null values in each column. The values None, NaN, NaT, and pandas.NA are treated as null.
If the numeric_only parameter is set to True, only columns with numeric data are counted. If the level parameter is specified (pandas versions before 2.0 only), the counts are grouped by that level of the MultiIndex, and the result is a DataFrame rather than a Series.
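In pandas 2.0 and later, count() no longer accepts level; grouping by the index level first and then counting gives the same kind of per-level result. The sketch below assumes a hypothetical two-level row index (city, year) and made-up sales figures.

```python
import pandas as pd

# Hypothetical two-level row index: (city, year).
index = pd.MultiIndex.from_tuples(
    [("Paris", 2020), ("Paris", 2021), ("Rome", 2020), ("Rome", 2021)],
    names=["city", "year"],
)
df = pd.DataFrame(
    {"sales": [10.0, None, 7.5, 8.0], "visits": [100, 120, None, 90]},
    index=index,
)

# Replacement for the removed df.count(level="city"):
# non-null counts per column, grouped by the "city" level.
per_city = df.groupby(level="city").count()

print(per_city)
```

The result is a DataFrame with one row per city and one column per original column.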
Use
The count() method is used to count the non-null values in each column of a Pandas DataFrame. Comparing these counts with the total number of rows shows which columns contain missing (null) values, which can then guide data cleaning or data preparation tasks.
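As one possible cleaning step, the counts can be turned into a completeness ratio and used to drop sparse columns. This is only a sketch: the data, the names completeness and kept, and the 50% threshold are arbitrary choices made for illustration.

```python
import pandas as pd

# Hypothetical data with one sparsely populated column.
df = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "score": [0.5, None, None, None],
    "label": ["a", "b", None, "d"],
})

# Fraction of non-null values in each column.
completeness = df.count() / len(df)

# Keep only columns that are at least 50% populated (arbitrary threshold).
kept = df.loc[:, completeness >= 0.5]

print(completeness)
print(list(kept.columns))
```

A suitable threshold depends on the dataset and on how the missing values will be handled afterwards.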
Important Points
- The count() method counts the non-null values in each column of a Pandas DataFrame.
- It returns a Series where the index represents the columns of the DataFrame and the values represent the number of non-null values in each column.
- The numeric_only parameter can be used to count only numeric columns.
- The level parameter, available only in pandas versions before 2.0, groups the counts by the specified level of a MultiIndex.
Summary
In this tutorial, we learned how to use the count() method in Pandas to count the non-null values in each column of a DataFrame. This method is useful for identifying missing values and performing data cleaning tasks.