Testing Spark writes to MinIO

I wanted to test Spark reading from and writing to S3 cloud storage on my local machine.

MinIO is a good choice for this: it is lightweight and compatible with the AWS S3 protocol.

You can run it with Docker.

# Pull the image

docker pull minio/minio

# Start the container
docker run -p 9000:9000 --name minio1 \
  --network test \
  -e "MINIO_ACCESS_KEY=minio" \
  -e "MINIO_SECRET_KEY=minio123" \
  -v /Users/student2020/data/minio/data/:/data \
  minio/minio server /data

Once the container is running, log in through the browser and create a new bucket with the plus sign at the bottom right.
The storage path format is s3a://bucket_name/dir_to_path
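
To make the path format concrete, here is a minimal sketch of a helper that joins a bucket name and a directory into an s3a URI. The function name `s3a_uri` is made up for illustration; it only formats a string and does not talk to MinIO.

```shell
# Hypothetical helper: build an s3a URI from a bucket name and a path.
# A single leading slash on the path is dropped so the result never
# contains "//" after the bucket name.
s3a_uri() {
  bucket="$1"
  path="${2#/}"
  printf 's3a://%s/%s\n' "$bucket" "$path"
}
```

For example, `s3a_uri test df2` produces the same path that is passed to Spark further down.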

Writing from Spark needs the following jar packages:
 aws-java-sdk-1.7.4.jar, hadoop-aws-2.7.3.jar
 Both files are in the directory where the Hadoop installation package was unpacked; running find . -name "*aws*.jar" there will locate them.
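
The find step can be combined with building the comma-separated value that --jars expects. This is only a sketch: the function name `aws_jars_csv` is made up, and it assumes you point it at a directory that holds only the jars you want, since a pattern like "*aws*.jar" may also match any other jar with "aws" in its name elsewhere in the Hadoop tree.

```shell
# Hypothetical helper: list every jar whose name contains "aws" under the
# given directory and join the paths with commas (the --jars format).
aws_jars_csv() {
  jar_dir="$1"
  find "$jar_dir" -name "*aws*.jar" | sort | paste -sd, -
}
```

With the layout used in this post, it could be called as spark-shell --jars "$(aws_jars_csv /Users/student2020/app/hadoop273/share/hadoop/tools/lib)".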
 
spark-shell --executor-memory 2g --driver-memory 2g \
--jars /Users/student2020/app/hadoop273/share/hadoop/tools/lib/aws-java-sdk-1.7.4.jar,/Users/student2020/app/hadoop273/share/hadoop/tools/lib/hadoop-aws-2.7.3.jar

val df = Seq((1, "student1"), (2, "student2"), (3, "student3")).toDF("id", "name")

// Point the s3a connector at the local MinIO server
spark.sparkContext.hadoopConfiguration.set("fs.s3a.access.key", "minio")
spark.sparkContext.hadoopConfiguration.set("fs.s3a.secret.key", "minio123")
spark.sparkContext.hadoopConfiguration.set("fs.s3a.endpoint", "127.0.0.1:9000")

// MinIO is served over plain HTTP here, so SSL is turned off
spark.sparkContext.hadoopConfiguration.set("fs.s3a.connection.ssl.enabled", "false")
spark.sparkContext.hadoopConfiguration.set("fs.s3a.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem")
// Write with the default data source (Parquet unless configured otherwise)
df.write.save("s3a://test/df2")

If you want to proxy MinIO through nginx, add a server block like the following inside http {}.

The nginx configuration for MinIO:
server {
 listen 80; # or 443
 server_name file.example.com; # change to your own domain
 location / {
   proxy_buffering off; #important
   proxy_set_header Host $http_host;
   proxy_pass http://localhost:9000;
 }
}

 


Origin www.cnblogs.com/huaxiaoyao/p/12152284.html