DataFrame.merge() (Pandas DataFrame Basics)

Syntax

merged_df = df1.merge(df2, on='common_column')

Example

import pandas as pd

df1 = pd.DataFrame({'key': ['A', 'B', 'C', 'D'],
                    'value': [1, 2, 3, 4]})

df2 = pd.DataFrame({'key': ['C', 'D', 'E', 'F'],
                    'value': [5, 6, 7, 8]})

merged_df = df1.merge(df2, on='key')
print(merged_df)

Output

  key  value_x  value_y
0   C        3        5
1   D        4        6

Explanation

Two DataFrames in pandas can be combined with the merge() method. It joins them on one or more specified columns and, by default, uses SQL-style inner join logic.

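The on parameter specifies the column(s) to join on. If on is omitted, merge() joins on every column name the two DataFrames share. With the default inner join, the result contains only the rows whose key values appear in both DataFrames, which is why only C and D survive in the example. Non-key columns that have the same name in both inputs are disambiguated with the suffixes _x and _y, giving value_x (from df1) and value_y (from df2).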
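The join type can also be chosen explicitly through the how parameter. The following is a minimal sketch that reuses df1 and df2 from the example above to contrast the default inner join with a left join; the variable names inner_df and left_df are only illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({'key': ['A', 'B', 'C', 'D'],
                    'value': [1, 2, 3, 4]})

df2 = pd.DataFrame({'key': ['C', 'D', 'E', 'F'],
                    'value': [5, 6, 7, 8]})

# Default (how='inner'): only keys present in both DataFrames are kept.
inner_df = df1.merge(df2, on='key')

# how='left': every row of df1 is kept; rows with no match in df2
# get NaN in the columns that come from df2.
left_df = df1.merge(df2, on='key', how='left')

print(inner_df)
print(left_df)
```

In the left join, keys A and B have no partner in df2, so their value_y entries are missing values. The how parameter also accepts 'right' and 'outer'.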

Use

merge() combines two DataFrames based on one or more common columns. It is a routine step in data analysis and data cleaning, where data from multiple sources often needs to be brought together in a single DataFrame.

Important Points

  • merge() is used to combine two DataFrames based on one or more common columns
  • By default, merge() performs an inner join, using the SQL join logic
  • The on parameter specifies the column(s) to use for merging
  • With the default inner join, the resulting DataFrame contains only the rows whose key values appear in both DataFrames
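Two of these points are easy to trip over in practice: what happens when on is left out, and where the _x and _y column names come from. The sketch below illustrates both with the same df1 and df2 as in the example; the names implicit_df and labelled_df and the suffix strings '_df1' and '_df2' are chosen here purely for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({'key': ['A', 'B', 'C', 'D'],
                    'value': [1, 2, 3, 4]})

df2 = pd.DataFrame({'key': ['C', 'D', 'E', 'F'],
                    'value': [5, 6, 7, 8]})

# Without `on`, merge() joins on every shared column name,
# here both 'key' and 'value'. No (key, value) pair occurs in
# both DataFrames, so the inner join returns no rows.
implicit_df = df1.merge(df2)

# Naming the key explicitly joins on 'key' only. The overlapping
# 'value' columns are renamed using `suffixes` instead of _x / _y.
labelled_df = df1.merge(df2, on='key', suffixes=('_df1', '_df2'))

print(implicit_df)
print(labelled_df)
```

Passing on explicitly is usually the safer habit when the DataFrames share more column names than the intended key.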

Summary

In conclusion, merge() is a pandas method that combines two DataFrames based on one or more common columns. It is one of the most common operations in data analysis and data cleaning. Knowing how its join type and key columns are chosen is essential when working with large and complex datasets that are spread across several sources.
