Read an XML file from Azure Blob Storage using Spark
Author: Data Engineering Studies
Uploaded: 2024-01-07
Views: 955
Description:
Use case:
Read an XML file from Azure Blob Storage using a Spark pool in Azure Synapse Analytics
Prerequisites:
XML File
Azure Synapse Analytics workspace
Spark pool
Steps:
1) Install the spark-xml package (JAR) on the Apache Spark pool
2) Upload the XML file to a container in Azure Blob Storage
3) Grant the “Storage Blob Data Contributor” role on the container to the user
4) Create a notebook in Synapse Analytics to read the XML file
5) Run the notebook to view the data
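Steps 2 and 4 meet in the path the notebook reads from: a wasbs:// URI that names the container, the storage account, and the blob. A minimal sketch of how such a path is assembled follows. The helper wasbs_path and the container and account names are hypothetical and are not taken from the video.

```python
def wasbs_path(container, account, blob):
    """Build a wasbs:// URI for a blob in an Azure Blob Storage container."""
    # General form: wasbs://<container>@<account>.blob.core.windows.net/<blob>
    return f"wasbs://{container}@{account}.blob.core.windows.net/{blob.lstrip('/')}"


# Hypothetical names for illustration only; substitute your own.
path = wasbs_path("mycontainer", "mystorageaccount", "books.xml")
print(path)
```

Building the path from its parts makes it harder to misplace the @ separator between the container and the account host name.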
Download:
You can download the JAR from the link below:
https://libraries.io/maven/com.databr...
Video on granting the “Storage Blob Data Contributor” role on the container to the user:
• Read a CSV file from Azure blob storage us...
Code:
from pyspark.sql import SparkSession
from pyspark.sql.types import *
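# The container and storage account names were obscured in the source; replace the placeholders with your own.
path = "wasbs://<container>@<storage-account>.blob.core.windows.net/books.xml"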
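# "spark" is the session that the Synapse notebook provides.
# rowTag selects the XML element that becomes one DataFrame row.
df = spark.read \
    .format("com.databricks.spark.xml") \
    .option("rowTag", "book") \
    .load(path)
# display() is a notebook helper that renders the DataFrame as a table.
display(df)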
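To make the effect of the rowTag option concrete, here is a small standard-library sketch that mimics the idea without Spark: every element matching the row tag becomes one record. It is only an illustration, not how spark-xml is implemented. The sample XML and the helper rows_for_tag are hypothetical, since the actual books.xml is not shown in the description.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the real books.xml is not shown in the description.
SAMPLE = """<catalog>
  <book id="bk101"><author>A. Author</author><title>First Title</title></book>
  <book id="bk102"><author>B. Writer</author><title>Second Title</title></book>
</catalog>"""


def rows_for_tag(xml_text, row_tag):
    """Return one dict per <row_tag> element: its attributes plus child-element text."""
    root = ET.fromstring(xml_text)
    rows = []
    for elem in root.iter(row_tag):
        # spark-xml prefixes attribute columns with "_" by default; mirrored here.
        row = {f"_{name}": value for name, value in elem.attrib.items()}
        row.update({child.tag: child.text for child in elem})
        rows.append(row)
    return rows


rows = rows_for_tag(SAMPLE, "book")
for row in rows:
    print(row)
```

With "book" as the row tag, the two book elements yield two records, which corresponds to the two rows the DataFrame would contain for a file shaped like this sample.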