Databricks PySpark Explode and Pivot Columns

The explode function in PySpark turns a column of arrays into multiple rows. Each row of the resulting DataFrame holds one element of the original array, and the values of the other selected columns are repeated for every element.

Here is an example of how to use the explode function:

from pyspark.sql.functions import explode

# create a sample DataFrame
# (`spark` is the SparkSession that Databricks notebooks provide by default)
data = [("Alice", [1, 2, 3]), ("Bob", [4, 5]), ("Charlie", [6])]
df = spark.createDataFrame(data, ["name", "numbers"])

# explode the numbers column: one output row per array element
df_exploded = df.select("name", explode("numbers").alias("number"))

# show the result
df_exploded.show()
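
To make the row semantics concrete without a cluster, here is a minimal plain-Python sketch of what explode does to the sample data above. The helper name explode_rows and the extra "Dana" row are hypothetical additions for illustration and are not part of PySpark. The sketch only models the behavior, including the fact that a row whose array is empty or null produces no output rows, whereas PySpark's explode_outer keeps such a row with a null value.

```python
def explode_rows(rows, keep_empty=False):
    """Plain-Python model of explode: one output row per array element.

    Hypothetical helper for illustration only, not a PySpark API.
    keep_empty=True mimics explode_outer, which emits a single row
    with None when the array is empty or None.
    """
    out = []
    for name, numbers in rows:
        if not numbers:
            # explode drops this row; explode_outer keeps it with None
            if keep_empty:
                out.append((name, None))
            continue
        for number in numbers:
            # the non-array column is repeated for every element
            out.append((name, number))
    return out


# Same sample data as above, plus a hypothetical row with an empty array
data = [("Alice", [1, 2, 3]), ("Bob", [4, 5]), ("Charlie", [6]), ("Dana", [])]

exploded = explode_rows(data)
exploded_outer = explode_rows(data, keep_empty=True)

for row in exploded:
    print(row)
```

With the default setting, "Dana" does not appear in the output at all, which is worth checking for when row counts after an explode look lower than expected.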

