How to set blob storage configuration in a PySpark session

Elixir :

I have a PySpark script in which I initiate a Spark session, but I am not able to read from the blob store using spark.read.format('json').load("my_blob_path"). Below is my session initialisation. Please help me set my blob credentials in the environment.

from pyspark import SparkConf, SparkContext
from pyspark.sql import SparkSession

conf = SparkConf().setAppName("session1")
sc = SparkContext(conf=conf)
spark = SparkSession.builder.appName("session1").getOrCreate()

ashwin agrawal :

You can set the credentials of your Azure Blob storage account with spark.conf.set after you have initialised your Spark session.

Below is the code:

from pyspark import SparkConf, SparkContext
from pyspark.sql import SparkSession

conf = SparkConf().setAppName("session1")
sc = SparkContext(conf=conf)
spark = SparkSession.builder.appName("session1").getOrCreate()

# Replace {blob_account_name} and {blob_account_key} with your storage account's values.
spark.conf.set("fs.azure.account.key.{blob_account_name}.blob.core.windows.net", "{blob_account_key}")

This registers the account key with the session, after which you can read from blob storage using spark.read.format('json').load('wasb://{blob_container}@{blob_account_name}.blob.core.windows.net/{blob_path}')
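The configuration key and the wasb:// URL both follow a fixed pattern, so they can be built from the account and container names instead of being typed by hand. Below is a minimal sketch of that idea. The helper names (account_key_conf_name, wasb_url, read_json_from_blob) are made up for illustration and are not part of PySpark. It also assumes the hadoop-azure connector is already available to your Spark installation and that spark is an existing SparkSession.

```python
def account_key_conf_name(account_name: str) -> str:
    """Return the configuration key under which the storage account key is stored."""
    return f"fs.azure.account.key.{account_name}.blob.core.windows.net"


def wasb_url(container: str, account_name: str, blob_path: str) -> str:
    """Build a wasb:// URL for a blob path inside a container."""
    # Strip a leading slash so the URL does not end up with a double slash.
    path = blob_path.lstrip("/")
    return f"wasb://{container}@{account_name}.blob.core.windows.net/{path}"


def read_json_from_blob(spark, account_name, account_key, container, blob_path):
    """Set the account key on the session, then load JSON from the blob path.

    `spark` is expected to be an already created SparkSession.
    """
    spark.conf.set(account_key_conf_name(account_name), account_key)
    return spark.read.format("json").load(wasb_url(container, account_name, blob_path))
```

With the session from the code above, a call would look like df = read_json_from_blob(spark, "{blob_account_name}", "{blob_account_key}", "{blob_container}", "{blob_path}"). If you would rather keep the key out of the script, you could pass it in from an environment variable of your choosing via os.environ.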
