Driver stacktrace in PySpark

I am trying to do the following steps:

from pyspark.sql.functions import udf
from pyspark.sql.types import DoubleType, IntegerType

df1 = df.na.drop(subset=["Column1", "Column2", "Column3", "Column4", "Column5", "Column6"])
df1 = df1.withColumn('Column6', df1['Column6'].cast(DoubleType()))
udf_dict = udf(lambda x, y: 1 if x >= y else 0, IntegerType())
df1 = df1.withColumn('Flag', udf_dict('Column2', 'Column6'))
filter1 = df1.filter(df1['Flag'] == 1)

It is giving me the following error:

[screenshot of the driver stacktrace]

Please suggest where it is going wrong.

Answer

There are nulls in your dataframe, which causes the UDF to fail: comparing None with >= raises a TypeError in Python 3. Note that the cast to DoubleType turns any value it cannot parse into null, so nulls can reappear in Column6 even after na.drop. However, there is no need to use a UDF here. You can just filter the dataframe by comparing the two columns directly.

from pyspark.sql.types import DoubleType

df1 = df.na.drop(subset=["Column1", "Column2", "Column3", "Column4", "Column5", "Column6"])
df1 = df1.withColumn('Column6', df1['Column6'].cast(DoubleType()))
filter1 = df1.filter('Column2 >= Column6')
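
The same filter can also be written with Column expressions, e.g. df1.filter(df1['Column2'] >= df1['Column6']). If you do need a UDF for more complex logic later, guarding against None avoids this crash. A minimal sketch, assuming the same column names as above (safe_flag is just an illustrative name):

from pyspark.sql.functions import udf
from pyspark.sql.types import IntegerType

# Return None when either input is missing, instead of letting
# a bare x >= y raise a TypeError on None inside the worker.
safe_flag = udf(lambda x, y: None if x is None or y is None else int(x >= y), IntegerType())
df1 = df1.withColumn('Flag', safe_flag('Column2', 'Column6'))

The built-in comparison is still preferable: it stays in the JVM and handles nulls for you, since a null comparison evaluates to null and the row is simply filtered out.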