PySpark: Get substring of column name

I’m new to PySpark and want to change my column names as most of them have an annoying prefix. My column names are like this:

e1013_var1
e1014_var2
e1015_var3
Data_date_stamp

If present, I want to remove the EXXX_ prefix from the column names. How can I do that? Since I also want everything in uppercase, my code so far looks like this:

# Uppercase each column name in turn
for c in df.columns:  # avoid `col` as the loop name; it shadows pyspark.sql.functions.col if imported
    df = df.withColumnRenamed(c, c.upper())

Help is appreciated, thank you!

Answer

One option that avoids an explicit loop is to use toDF, which renames all the columns of a Spark DataFrame in one go:

import re

# Strip a leading e<digits>_ prefix (if present), then uppercase the rest
df_new = df.toDF(*[re.sub(r'^e\d+_', '', c).upper() for c in df.columns])

print(df_new.columns)
# ['VAR1', 'VAR2', 'VAR3', 'DATA_DATE_STAMP']
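
If you'd rather keep your original loop, the same regex works with withColumnRenamed; this is just a sketch of the equivalent approach:

import re

for c in df.columns:
    # remove the prefix first, then uppercase the remainder
    df = df.withColumnRenamed(c, re.sub(r'^e\d+_', '', c).upper())

Note that toDF renames everything in a single pass, while each withColumnRenamed call produces a new DataFrame, so toDF tends to be the cleaner choice for wide DataFrames.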