DataFrame.count() - Pandas DataFrame Basics

The count method of a Pandas DataFrame counts the number of non-null values in each column (or, optionally, in each row). The values None, NaN, NaT, and pandas.NA are treated as null. This method is useful for data preparation and cleaning tasks: comparing the counts with the total number of rows shows how many values are missing in each column.

Syntax

The basic syntax to use the count method in Pandas is as follows:

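DataFrame.count(axis=0, level=None, numeric_only=False)
  • axis: 0 or 'index' (the default) returns one count per column; 1 or 'columns' returns one count per row
  • level: for a DataFrame with a multi-level index (MultiIndex), counts along the given index level and returns a DataFrame instead of a Series. This parameter was deprecated in pandas 1.3 and removed in pandas 2.0, where the signature is DataFrame.count(axis=0, numeric_only=False)
  • numeric_only: if set to True, only float, int, and boolean columns are counted (default is False, meaning all columns are counted)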
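
As a minimal sketch of how the axis and numeric_only parameters change the result, consider the snippet below. The sample data and the variable names (per_column, per_row, numeric_counts) are made up for illustration only.

```python
import pandas as pd

# Hypothetical sample data, chosen only to illustrate the parameters
df = pd.DataFrame({
    'name': ['Alice', 'Bob', 'Charlie'],
    'age': [25, None, 35],
    'weight': [55.5, 70.2, None],
})

# axis=0 (the default): one non-null count per column
per_column = df.count()

# axis=1: one non-null count per row
per_row = df.count(axis=1)

# numeric_only=True: the string column 'name' is left out of the result
numeric_counts = df.count(numeric_only=True)

print(per_column)
print(per_row)
print(numeric_counts)
```

With axis=1 the index of the returned Series is the row index of the DataFrame rather than its column names.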

Example

Consider the following example where the count method is demonstrated on a simple DataFrame:

import pandas as pd

data = {'name': ['Alice', 'Bob', 'Charlie', 'David', 'Eva'], 
        'age': [25, 30, None, 20, 35], 
        'weight': [55.5, 70.2, 80.1, None, 65.7]}

df = pd.DataFrame(data)

print(df.count())

In this example, we create a DataFrame with 3 columns (name, age, and weight) and 5 rows of data. The age and weight columns each contain one null value. The count method is used to count the number of non-null values in each column of the DataFrame. The output of this code is as follows:

name      5
age       4
weight    4
dtype: int64

This means that the name column has no missing values, while the age and weight columns each have 4 non-null values out of 5 rows, that is, one missing value each.

Explanation

The count method counts the number of non-null values in each column of the DataFrame. It returns a Series object, where the index represents the columns of the DataFrame and the values represent the number of non-null values in each column.

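If the numeric_only parameter is set to True, only columns with float, int, or boolean data are counted. If the level parameter is specified (possible only in pandas versions before 2.0), the counts are grouped by the given level of a multi-level index, and the result is a DataFrame rather than a Series. In pandas 2.0 and later, the same result is obtained with groupby(level=...).count().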
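
The following is a small sketch of counting per index level with groupby, which works in current pandas versions where count no longer accepts level. The index names 'team' and 'member' and the data are hypothetical, chosen only for illustration.

```python
import pandas as pd

# Hypothetical two-level index; the names 'team' and 'member' are illustrative
index = pd.MultiIndex.from_tuples(
    [('A', 1), ('A', 2), ('B', 1), ('B', 2)],
    names=['team', 'member'],
)

df = pd.DataFrame(
    {'age': [25, None, 30, 35],
     'weight': [55.5, 70.2, None, None]},
    index=index,
)

# Count non-null values per column within each value of the 'team' level
per_team = df.groupby(level='team').count()

print(per_team)
```

The result is a DataFrame with one row per value of the chosen level and one column per original column.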

Use

The count method is used to count the number of non-null values in each column of a Pandas DataFrame. Comparing these counts with the total number of rows shows which columns contain missing (null) values, so that data cleaning or data preparation tasks can be performed accordingly.
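
One way to turn the counts into a per-column tally of missing values is sketched below, reusing the data from the example above. The variable name missing is an arbitrary choice.

```python
import pandas as pd

data = {'name': ['Alice', 'Bob', 'Charlie', 'David', 'Eva'],
        'age': [25, 30, None, 20, 35],
        'weight': [55.5, 70.2, 80.1, None, 65.7]}

df = pd.DataFrame(data)

# Missing values per column = total rows minus non-null values
missing = len(df) - df.count()

# The same numbers can be obtained directly with isna().sum()
same_as_isna = (missing == df.isna().sum()).all()

print(missing)
print(same_as_isna)
```

Either form works; isna().sum() states the intent more directly, while the subtraction makes the relationship to count explicit.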

Important Points

  • The count method can be used to count the non-null values in each column of a Pandas DataFrame.
  • It returns a Series object where the index represents the columns of the DataFrame and the values represent the number of non-null values in each column.
  • The numeric_only parameter can be used to count only numeric columns.
  • The level parameter (removed in pandas 2.0) groups the counts by the specified level of a multi-level index and returns a DataFrame; in newer versions, use groupby(level=...).count() instead.

Summary

In this tutorial, we learned how to use the count method in Pandas to count the non-null values in each column of a DataFrame. This method is useful for identifying missing values and performing data cleaning tasks.
