Get hive and hadoop version from within pyspark session

I’m using PySpark on a Hadoop cluster with Hive. I know it’s possible to get the Spark, Hive, and Hadoop versions from the command line (spark-submit --version, hive --version, hadoop version), but how do I do the same from within a PySpark session?

Getting the Spark version is easy enough:

print("Spark version = {}".format(spark._sc.version))

I can’t figure out how to get the Hive and Hadoop versions, though. Anyone know? TIA

Answer

Getting them from within PySpark:

# spark
print(f"Spark version = {spark.version}")

# hadoop (VersionInfo is reached through the JVM gateway on the SparkContext)
sc = spark.sparkContext
print(f"Hadoop version = {sc._jvm.org.apache.hadoop.util.VersionInfo.getVersion()}")
