DataFrame.apply() - Pandas DataFrame Basics
The DataFrame.apply() method in Pandas applies a function along an axis of a DataFrame, that is, to each column or to each row. The function receives one column or row at a time as a Series, and the result is a Series or a DataFrame depending on what the function returns.
Syntax
The syntax for using DataFrame.apply() is:
DataFrame.apply(func, axis=0, raw=False, result_type=None, args=(), **kwds)
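- func: The function to apply to each column or row.
- axis: The axis along which the function is applied. With axis=0 (or 'index', the default) the function is applied to each column; with axis=1 (or 'columns') it is applied to each row.
- raw: If False (the default), each row or column is passed to the function as a Series. If True, the function receives the underlying NumPy array instead, without the index labels.
- result_type: One of 'expand', 'reduce', 'broadcast', or None. It only takes effect when axis=1 and controls how list-like return values are shaped into the result.
- args: Positional arguments to pass to func in addition to the row or column.
- **kwds: Additional keyword arguments to pass to func.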
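The effect of axis and of forwarding extra arguments is easiest to see on a small numeric table. The following is a minimal sketch; the scores DataFrame and the value_range helper are hypothetical names made up for illustration, not part of the original example.

```python
import pandas as pd

# Hypothetical numeric data, used only for illustration
scores = pd.DataFrame({"math": [90, 70, 80], "physics": [60, 85, 75]})

def value_range(series, scale=1):
    """Return the spread (max - min) of a Series, optionally scaled."""
    return (series.max() - series.min()) * scale

# axis=0 (default): value_range receives each column as a Series
per_column = scores.apply(value_range)

# axis=1: value_range receives each row as a Series
per_row = scores.apply(value_range, axis=1)

# Extra keyword arguments are forwarded to the function
scaled_kw = scores.apply(value_range, scale=10)

# Positional extras go through args
scaled_args = scores.apply(value_range, args=(10,))

print(per_column)   # one value per column
print(per_row)      # one value per row
print(scaled_kw)
```

Here per_column is indexed by column name and per_row by the original row index, which is a quick way to check which axis was used.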
Example
Let's create a sample DataFrame first:
import pandas as pd

data = {'name': ['John', 'Jane', 'Steve', 'Bill'],
        'age': [28, 34, 29, 42],
        'gender': ['Male', 'Female', 'Male', 'Male']}
df = pd.DataFrame(data)
print(df)
This creates a DataFrame with three columns (name, age, and gender) and four rows:
    name  age  gender
0   John   28    Male
1   Jane   34  Female
2  Steve   29    Male
3   Bill   42    Male
Now, let's add 2 to each value in the age column. Selecting a single column gives a Series, so this calls Series.apply(), which passes each element to the function:
def add_two(x):
    return x + 2

df['age'] = df['age'].apply(add_two)
print(df)
The output should be:
    name  age  gender
0   John   30    Male
1   Jane   36  Female
2  Steve   31    Male
3   Bill   44    Male
In this example, we define a function add_two that takes a single argument and returns the argument plus 2. We then use apply() to call this function on every element of the age column and assign the result back to df['age']. As a result, each value in the age column is increased by 2, while the name and gender columns are unchanged.
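For simple arithmetic like this, a plain vectorized expression gives the same values without calling a Python function per element. The sketch below rebuilds a reduced copy of the sample data (name and age only) so it runs on its own, and compares the two approaches.

```python
import pandas as pd

# Reduced copy of the sample data so this snippet is self-contained
df = pd.DataFrame({'name': ['John', 'Jane', 'Steve', 'Bill'],
                   'age': [28, 34, 29, 42]})

def add_two(x):
    return x + 2

# Calls add_two once for every element of the column
applied = df['age'].apply(add_two)

# A single vectorized operation over the whole column
vectorized = df['age'] + 2

# Both approaches produce the same values
print(applied.tolist() == vectorized.tolist())
```

apply() is still the natural choice when the logic cannot be written as a column-wise expression.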
Explanation
DataFrame.apply() passes each column (axis=0) or each row (axis=1) to the given function as a Series, while Series.apply() passes each individual value. In the example, df['age'] is a Series, so apply() calls add_two on each value in the age column and returns a new Series with the results. Assigning that Series back to df['age'] replaces the original column.
Use
The apply() method is commonly used to run custom logic on each column or row of a DataFrame, or on each element of a Series, when no built-in vectorized operation fits. This can be useful for transforming or cleaning data, as well as for creating new columns based on existing data.
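Creating a new column from several existing ones is a typical case for axis=1. The following is a minimal sketch using the same sample data; the describe function and the label column are hypothetical names chosen for illustration.

```python
import pandas as pd

df = pd.DataFrame({'name': ['John', 'Jane', 'Steve', 'Bill'],
                   'age': [28, 34, 29, 42],
                   'gender': ['Male', 'Female', 'Male', 'Male']})

def describe(row):
    # With axis=1, row is a Series whose index holds the column names
    return f"{row['name']} ({row['gender']}, {row['age']})"

# One string per row, stored as a new column
df['label'] = df.apply(describe, axis=1)

print(df['label'].tolist())
```

Because the function is called once per row, this pattern is convenient but can be slow on large DataFrames.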
Important Points
- DataFrame.apply() calls the function once per column or row, passing it a Series. Series.apply() calls the function once per element.
- The axis argument specifies whether the function is applied to each column (axis=0, the default) or to each row (axis=1) of the DataFrame.
- The apply() method does not modify the DataFrame in place. It returns a new Series or DataFrame, depending on what the applied function returns.
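How the return type follows from the applied function can be checked directly. The sketch below uses a small hypothetical DataFrame, nums, and also shows result_type='expand', which turns list-like row results into columns.

```python
import pandas as pd

# Hypothetical numeric data, used only for illustration
nums = pd.DataFrame({'a': [1, 2, 3], 'b': [10, 20, 30]})

# A function that reduces each column to a single value yields a Series
totals = nums.apply(lambda col: col.sum())

# A function that returns a Series of the same length yields a DataFrame
doubled = nums.apply(lambda col: col * 2)

# With axis=1 and result_type='expand', list results become columns
pairs = nums.apply(lambda row: [row['a'], row['b'] - row['a']],
                   axis=1, result_type='expand')

print(type(totals).__name__)
print(type(doubled).__name__)
print(pairs)
```

In all three cases nums itself is left unchanged.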
Summary
In summary, the apply() method in Pandas lets you run a function on each column or row of a DataFrame, or on each element of a Series. It provides a convenient way to transform data or create new columns based on existing data.