
Checking if column headers match in Python

I have two dataframes:

df1:

      ID  Open High Low  
       1  64   66   52   

df2:

      ID Open High  Volume
      1   33   45   30043

I want to write a function that checks whether the column headers of df2 match the column headers of df1.

If they do not, it should give a message telling us which columns are missing.

Example of the message given these dataframes:

  "The column 'Low' is not selected in df2. The column 'Volume' is not selected in df1."

I want generalized code that works for any given pair of dataframes.

Is this possible in Python?


Answer

You can get the column names via .columns and then use set operations to find the columns that appear in only one of the dataframes:

import pandas as pd

df1 = pd.DataFrame(
    {
        "ID": [1],
        "Open": [64],
        "High": [66],
        "Low": [52]
    }
)

df2 = pd.DataFrame(
    {
        "ID": [1],
        "Open": [33],
        "High": [45],
        "Volume": [30043]
    }
)

df1_columns = set(df1.columns)
df2_columns = set(df2.columns)

common_columns = df1_columns & df2_columns

df1_columns_only = df1_columns - common_columns
df2_columns_only = df2_columns - common_columns

print("Columns only available in df1", df1_columns_only)
print("Columns only available in df2", df2_columns_only)

And it gives the expected output:

Columns only available in df1 {'Low'}
Columns only available in df2 {'Volume'}
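
To get the exact message asked for in the question, the same idea can be wrapped in a function. This is a minimal sketch, not a definitive implementation: the function name describe_column_mismatch and the parameters left_name and right_name are hypothetical, chosen only for illustration. It uses list comprehensions instead of sets so that the messages follow the original column order. It compares headers by name only and ignores their order.

```python
import pandas as pd


def describe_column_mismatch(left, right, left_name="df1", right_name="df2"):
    """Return a message naming the columns found in only one dataframe.

    The function name and the *_name parameters are illustrative
    assumptions, not part of pandas.
    """
    # Columns present in `left` but absent from `right`, in original order.
    only_left = [col for col in left.columns if col not in right.columns]
    # Columns present in `right` but absent from `left`, in original order.
    only_right = [col for col in right.columns if col not in left.columns]

    messages = [
        f"The column '{col}' is not selected in {right_name}."
        for col in only_left
    ]
    messages += [
        f"The column '{col}' is not selected in {left_name}."
        for col in only_right
    ]

    # An empty list means both dataframes have the same column names.
    return " ".join(messages) or "The column headers match."


df1 = pd.DataFrame({"ID": [1], "Open": [64], "High": [66], "Low": [52]})
df2 = pd.DataFrame({"ID": [1], "Open": [33], "High": [45], "Volume": [30043]})

message = describe_column_mismatch(df1, df2)
print(message)
# The column 'Low' is not selected in df2. The column 'Volume' is not selected in df1.
```

Because the function only reads .columns, it should work for any two dataframes, whatever their contents.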