How to explode an array column in Spark Java with Dataset


The question is published by the Tutorial Guruji team.

I have a Dataset in Spark Java that currently looks like this:

+--------------+--------------------+
| x            | YS                 |
+--------------+--------------------+
| x1           | [Y1,Y2]            |
| x2           | [Y3]               |
+--------------+--------------------+

I want to explode this Dataset and convert the array into individual entries, as follows:

Desired:

+--------------+--------------------+
| x            | YS                 |
+--------------+--------------------+
| x1           | Y1                 |
| x1           | Y2                 |
| x2           | Y3                 |
+--------------+--------------------+

I read the table from the database and select the two columns, but I am unable to use the explode functionality.

DS = reader.option("table", "dummy").load()
                .select(X,YS).explode(??)

How should I use explode to get the desired Dataset with Java?

Answer

In principle, you need to select a new column (instead of the YS column) whose value is the exploded YS column value.
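To see what "exploded" means here, the following plain-Java sketch mimics the behavior without Spark: each (x, array) row becomes one (x, y) row per array element. The class name ExplodeSketch and the helper explodeRows are hypothetical, chosen only for illustration; Spark's own explode works on columns rather than on Java lists.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class ExplodeSketch {
    // Hypothetical helper: turns each (x, [y1, y2, ...]) pair into
    // the pairs (x, y1), (x, y2), ... An empty list yields no output rows.
    static List<String[]> explodeRows(List<Map.Entry<String, List<String>>> rows) {
        return rows.stream()
                .flatMap(r -> r.getValue().stream()
                        .map(y -> new String[] {r.getKey(), y}))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Sample data taken from the question.
        List<Map.Entry<String, List<String>>> rows = Arrays.asList(
                new SimpleEntry<>("x1", Arrays.asList("Y1", "Y2")),
                new SimpleEntry<>("x2", Arrays.asList("Y3")));

        // Prints "x1 Y1", "x1 Y2" and "x2 Y3", one pair per line.
        for (String[] pair : explodeRows(rows)) {
            System.out.println(pair[0] + " " + pair[1]);
        }
    }
}
```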

Starting from the code from the question, this would be something like:

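// requires: import static org.apache.spark.sql.functions.explode;
Dataset<Row> ds = reader.option("table", "dummy").load();
// keep X and emit one output row per element of the YS array, named Y
ds = ds.select(ds.col("X"), explode(ds.col("YS")).as("Y"));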

Here is the API doc: https://spark.apache.org/docs/2.4.6/api/java/org/apache/spark/sql/functions.html#explode-org.apache.spark.sql.Column-
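For completeness, here is a minimal self-contained sketch that builds the sample data from the question in memory instead of reading it from a database. It assumes Spark SQL 2.x or later is on the classpath and runs in local mode; the class name ExplodeExample and the helpers sampleData and explodeYs are hypothetical names, not part of the original question.

```java
import static org.apache.spark.sql.functions.col;
import static org.apache.spark.sql.functions.explode;

import java.util.Arrays;
import java.util.List;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.RowFactory;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructType;

public class ExplodeExample {
    // Hypothetical helper: the two-row sample from the question, built in memory.
    static Dataset<Row> sampleData(SparkSession spark) {
        StructType schema = new StructType()
                .add("x", DataTypes.StringType)
                .add("YS", DataTypes.createArrayType(DataTypes.StringType));
        List<Row> rows = Arrays.asList(
                RowFactory.create("x1", Arrays.asList("Y1", "Y2")),
                RowFactory.create("x2", Arrays.asList("Y3")));
        return spark.createDataFrame(rows, schema);
    }

    // Keep x and produce one row per element of the YS array, named Y.
    static Dataset<Row> explodeYs(Dataset<Row> ds) {
        return ds.select(col("x"), explode(col("YS")).as("Y"));
    }

    public static void main(String[] args) {
        // Local mode is an assumption made so the sketch runs without a cluster.
        SparkSession spark = SparkSession.builder()
                .appName("explode-example")
                .master("local[*]")
                .getOrCreate();

        explodeYs(sampleData(spark)).show();
        spark.stop();
    }
}
```

Running main should print three rows: (x1, Y1), (x1, Y2) and (x2, Y3). Note that explode drops rows whose array is null or empty; if those rows must be kept, functions.explode_outer (available since Spark 2.2) may be a better fit.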
