
New column comparing dates in PySpark

I am struggling to create a new column based on a simple condition comparing two dates. I have tried the following:

.withColumn("terms", when(col("start_date") <= col("end_date")), col("sarter_terms")).otherwise(col("exp_terms")) 

Which yields a syntax error.

I have also updated as follows:

.withColumn("terms", when(col("start_date").leq(col("end_date"))), col("sarter_terms")).otherwise(col("exp_terms")) 

But this yields a Python error that the Column is not callable.

How can I create a new column whose value depends on whether the date comparison holds?


Answer

Your statement has a parenthesis mismatch: the `when()` call is closed immediately after the condition, so `col("sarter_terms")` is passed to `withColumn` instead of `when`, which produces the errors you saw:

.withColumn("terms", when(col("start_date") <= col("end_date")), col("sarter_terms")).otherwise(col("exp_terms"))

Change it to

.withColumn(
    "terms",
    when(col("start_date") <= col("end_date"), col("sarter_terms"))
    .otherwise(col("exp_terms"))
)

Always fan out the parentheses like this so you can verify that each closing parenthesis sits in the right place.

User contributions licensed under: CC BY-SA