DataFrame.astype() - Pandas DataFrame Basics

DataFrame.astype() is a method that casts all or selected columns of a Pandas DataFrame to another data type. It is useful when a column's data type has to change before data analysis, data cleaning, or modeling.

Syntax

The basic syntax to cast the data type of columns using astype() is as follows:

dataframe.astype(dtype, copy=True, errors='raise')
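  • dtype: The data type to cast to. Pass a single type (for example, 'int', 'float', 'bool') to cast every column, or a dictionary mapping column names to types to cast only selected columns.
  • copy: If True (the default), returns a new DataFrame with the cast columns. If False, pandas may skip the copy when the data already has the requested type, so later changes to the result can propagate to the original DataFrame. Recent pandas versions are phasing this parameter out, so it is usually best left at its default.
  • errors: If 'raise' (the default), raises an exception when the data cannot be cast to the specified type. If 'ignore', suppresses the exception and returns the original data unchanged where the cast fails.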
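
The parameters above can be illustrated with a minimal sketch. The DataFrames and variable names here (df, numbers, all_float, only_a, cast_failed) are made up for illustration and are not part of the pandas API; the sketch only relies on the default errors='raise' behavior.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": ["4", "5", "x"]})

# A single dtype is applied to every column of the frame.
numbers = pd.DataFrame({"x": [1, 2], "y": [3, 4]})
all_float = numbers.astype("float64")

# A dictionary casts only the listed columns; "b" is left alone.
only_a = df.astype({"a": "float64"})

# With the default errors='raise', a value that cannot be converted
# ("x" is not an integer) makes the whole call fail.
try:
    df.astype({"b": "int64"})
    cast_failed = False
except (ValueError, TypeError) as exc:
    cast_failed = True
    print(f"cast failed: {exc}")
```

Catching the exception like this is one way to find out which columns need cleaning before they can be cast.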

Example

Consider the following example, where we cast the data types of columns in a DataFrame using astype():

import pandas as pd

data = {'name': ['John', 'Jane', 'Mike', 'Emily'], 
        'age': [25, 19, 37, 42], 
        'salary': [50000, 35000, 75000, 90000],
        'employed': [True, True, False, True]}

df = pd.DataFrame(data)

casted_df = df.astype({'age': 'float', 'salary': 'int64'})

print(casted_df.dtypes)

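In this example, we define a dictionary of data and create a DataFrame from it. We then use the astype() method to cast the 'age' column to float and the 'salary' column to int64. Because 'salary' holds plain Python integers, pandas normally stores it as int64 already, so that cast leaves it unchanged. Finally, we print the data types of the resulting DataFrame.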

Output

name        object
age        float64
salary       int64
employed      bool
dtype: object

The name and employed columns are unchanged because they were not included in the dictionary passed to astype().

Explanation

The astype() method in Pandas allows us to cast the data type of columns in a DataFrame. It takes either a single data type, which is applied to every column, or a dictionary that maps column names to new data types. By default, it returns a new DataFrame with the cast columns and leaves the original unchanged; the copy parameter controls whether a copy is always made.
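
As a small sketch of the "returns a new DataFrame" behavior, the following assumes a toy two-column frame; the names result and before are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({"age": [25, 19], "salary": [50000, 35000]})

# astype() returns a new DataFrame; the original is not modified.
result = df.astype({"age": "float64"})
before = str(df["age"].dtype)
print(before)               # int64
print(result["age"].dtype)  # float64

# To keep the change under the same name, assign the result back.
df = df.astype({"age": "float64"})
```

Forgetting the assignment is a common slip: calling df.astype(...) on its own line has no lasting effect.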

Use

The astype() method is useful for converting the data types of columns to facilitate data analysis, data cleaning, and modeling. For example, it can be used to convert string values to numeric values for mathematical operations, or to convert boolean columns to integer columns for machine learning models.
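
Both uses mentioned above can be sketched as follows. The column names price and in_stock are hypothetical, and the example assumes every string in the column is a well-formed number.

```python
import pandas as pd

df = pd.DataFrame({
    "price": ["10.5", "20", "7.25"],   # numbers stored as strings
    "in_stock": [True, False, True],   # boolean flags
})

# Cast the string column to float and the boolean column to integer.
converted = df.astype({"price": "float64", "in_stock": "int64"})

# Arithmetic now works on the numeric column.
total = converted["price"].sum()
print(total)                           # 37.75

# True/False become 1/0, a form many modeling libraries expect.
print(converted["in_stock"].tolist())  # [1, 0, 1]
```

If some strings might not be valid numbers, astype() raises by default, so the column may need cleaning first.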

Important Points

  • astype() is used to cast the data type of columns in a Pandas DataFrame.
  • The method takes either a single data type for all columns or a dictionary of column names to new data types.
  • By default, the method returns a new DataFrame with the cast columns; the original DataFrame is not modified.

Summary

The astype() method in Pandas is a powerful tool for converting the data types of columns in a DataFrame. It provides a flexible way to cast the data types of selected or all columns of a DataFrame, which is useful for data cleaning, analysis, and modeling tasks.
