Elixir :
I have a pyspark script in which I initiate a spark session, but I am not able to read from blob store using spark.read.format('json').load("my_blob_path"). Below is my session initialisation. Please help me set my blob credentials in the environment.
from pyspark import SparkConf, SparkContext
from pyspark.sql import SparkSession

conf = SparkConf().setAppName("session1")
sc = SparkContext(conf=conf)
spark = SparkSession.builder.appName("session1").getOrCreate()
ashwin agrawal :
You can set the credentials of your Azure Blob storage account using spark.conf.set after you have initialised your spark session. Below is the code:
from pyspark import SparkConf, SparkContext
from pyspark.sql import SparkSession

conf = SparkConf().setAppName("session1")
sc = SparkContext(conf=conf)
spark = SparkSession.builder.appName("session1").getOrCreate()

# Replace {blob_account_name} and {blob_account_key} with your storage account's values.
spark.conf.set("fs.azure.account.key.{blob_account_name}.blob.core.windows.net", "{blob_account_key}")
This registers the account key in the session configuration, after which you can read from the blob using spark.read.format('json').load('wasb://{blob_container}@{blob_account_name}.blob.core.windows.net/{blob_path}')
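If it helps, the two strings above can be built from variables instead of being edited by hand. This is only a rough sketch: the helper names (account_key_conf, wasb_url, read_json_from_blob) are made up for illustration and are not part of pyspark. It also assumes that the spark session already exists and that the Azure storage connector for Hadoop is available on the cluster.

```python
def account_key_conf(account_name):
    """Return the configuration key under which the storage account key is set."""
    return f"fs.azure.account.key.{account_name}.blob.core.windows.net"


def wasb_url(container, account_name, path):
    """Build the wasb:// URL for a blob path inside a container."""
    # Drop leading slashes so the result has exactly one "/" after the host.
    return f"wasb://{container}@{account_name}.blob.core.windows.net/{path.lstrip('/')}"


def read_json_from_blob(spark, account_name, account_key, container, path):
    """Set the account key on an existing SparkSession, then load JSON from the blob."""
    spark.conf.set(account_key_conf(account_name), account_key)
    return spark.read.format("json").load(wasb_url(container, account_name, path))
```

You would then call something like df = read_json_from_blob(spark, account_name, account_key, container, path) with your own values. Reading the key from an environment variable rather than hard-coding it in the script is probably the safer choice.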