Skip to content
Advertisement

Excel like vlookup using python pandas with some conditions

I need to perform vlookup on dataframe using python/pandas like in Excel with some conditions.

Condition:-

  1. I need to create a one new column (DFM) in my 2nd DataFrame using Excel like vlookup.

  2. If DFM value is na then print 100% in 2nd Dataframe’s DFM. Like in below result data.

  3. In result data DFM column I need to apply vlookup for first 10 rows only. Then from 11th row I want to copy first 10 rows data again and again but after dropping first row every time. Please look at the result example below. As we will drop first row on every iteration we will have empty last row so we need to print 100% there as given in condition 2.

1st DataFrame:-

  S.No     Cal     Date             DFM
  1            8       01-03-2013   100.00%
  1            7       01-06-2013   100.00%
  1            6       01-09-2013   100.00%
  1            5       01-12-2013   99.99%
  1            4       01-03-2014   99.97%
  1            3       01-06-2014   99.95%
  1            2       01-09-2014   98.89%
  1            1      01-12-2014   39.36%
  1            0      01-03-2015   0.00%
  1            0      01-06-2015   0.00%
  1            0      01-09-2015   0.00%
  1            0      01-12-2015   0.00%
  1            8      01-03-2013   100.00%
  1            7      01-06-2013   100.00%
  1            6      01-09-2013   100.00%
  1            5      01-12-2013   99.99%
  1            4      01-03-2014   99.97%
  1            3      01-06-2014   99.95%
  1            2      01-09-2014   98.89%
  1            1      01-12-2014   39.36%
  1            0      01-03-2015   0.00%
  1            0      01-06-2015   0.00%
  1            0      01-09-2015   0.00%
  1            0      01-12-2015   0.00%
  1            8      01-03-2013   100.00%
  1            7      01-06-2013   100.00%
  1            6      01-09-2013   100.00%
  1            5      01-12-2013   99.99%
  1            4      01-03-2014   99.97%
  1            3      01-06-2014   99.95%
  1            2      01-09-2014   98.89%
  1            1      01-12-2014   39.36%
  1            0      01-03-2015   0.00%
  1            0      01-06-2015   0.00%
  1            0      01-09-2015   0.00%
  1            0      01-12-2015   0.00%
  

2nd DataFrame:-

      Cal     Group        
        1     period 1    
        2     period 1    
        3     period 1    
        4     period 1   
        5     period 1    
        6     period 1    
        7     period 1    
        8     period 1    
        9     period 1    
       10    period 1         
        1     period 1    
        2     period 1    
        3     period 1    
        4     period 1   
        5     period 1    
        6     period 1    
        7     period 1    
        8     period 1    
        9     period 1    
       10    period 1        
        1     period 1    
        2     period 1    
        3     period 1    
        4     period 1   
        5     period 1    
        6     period 1    
        7     period 1    
        8     period 1    
        9     period 1    
       10    period 1    

Result:-

        Cal     Group       DFM
         1     period 1    39.36%
         2     period 1    98.89
         3     period 1    99.95%
         4     period 1    99.97%
         5     period 1    99.99%
         6     period 1    100.00%
         7     period 1    100.00%
         8     period 1    100.00%
         9     period 1    100.00%
        10    period 1    100.00%
         1      period 1    98.89
         2     period 1    99.95%
         3     period 1    99.97%
         4     period 1    99.99%
         5     period 1    100.00%
         6     period 1    100.00%
         7     period 1    100.00%
         8     period 1    100.00%
         9     period 1    100.00%
        10    period 1    100.00%
         1     period 1    99.95%
         2     period 1    99.97%
         3     period 1    99.99%
         4     period 1    100.00%
         5     period 1    100.00%
         6     period 1    100.00%
         7     period 1    100.00%
         8    period 1    100.00%
         9    period 1    100.00%
        10    period 1    100.00%
     

Advertisement

Answer

In many programming languages the equivalent of vlookups is some implenation of SQL joins. In pandas that’s either df.merge() for columns or df.join() for indexes.

To create the shifting effect, I concatenated a list of each shifted piece.

import pandas as pd

df1 = pd.read_csv("temp/df1.csv").drop_duplicates()
df2 = pd.read_csv("temp/df2.csv").drop_duplicates()

vlookuped = (df2.merge(df1, how="left", on=["Cal"])
             .drop(columns=["S.No", "Date"]))

result = (
    pd.concat([
        vlookuped,
        vlookuped.assign(DFM=lambda x: x["DFM"].shift(-1)),
        vlookuped.assign(DFM=lambda x: x["DFM"].shift(-2))],
        ignore_index=True)
    .fillna("100.00%"))

This code produces the following dataframe:

Cal    Group     DFM
  1 period 1  39.36%
  2 period 1  98.89%
  3 period 1  99.95%
  4 period 1  99.97%
  5 period 1  99.99%
  6 period 1 100.00%
  7 period 1 100.00%
  8 period 1 100.00%
  9 period 1 100.00%
 10 period 1 100.00%
  1 period 1  98.89%
  2 period 1  99.95%
  3 period 1  99.97%
  4 period 1  99.99%
  5 period 1 100.00%
  6 period 1 100.00%
  7 period 1 100.00%
  8 period 1 100.00%
  9 period 1 100.00%
 10 period 1 100.00%
  1 period 1  99.95%
  2 period 1  99.97%
  3 period 1  99.99%
  4 period 1 100.00%
  5 period 1 100.00%
  6 period 1 100.00%
  7 period 1 100.00%
  8 period 1 100.00%
  9 period 1 100.00%
 10 period 1 100.00%
User contributions licensed under: CC BY-SA
10 People found this is helpful
Advertisement