Deployment steps:
- Download the Spark distribution: https://www.apache.org/dyn/closer.lua/spark/spark-2.4.4/spark-2.4.4-bin-hadoop2.7.tgz
- Extract the archive and change into its top-level directory: tar -xzf spark-2.4.4-bin-hadoop2.7.tgz && cd spark-2.4.4-bin-hadoop2.7
- Build the Spark Docker image and push it to your registry: docker build -t xxx/spark:2.4.4 -f kubernetes/dockerfiles/spark/Dockerfile . && docker push xxx/spark:2.4.4
- Create a Kubernetes service account on the AKS cluster: kubectl create serviceaccount spark
- Grant the account edit rights with a ClusterRoleBinding: kubectl create clusterrolebinding spark-role --clusterrole=edit --serviceaccount=default:spark --namespace=default
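The binding above can be checked with `kubectl auth can-i`, which takes the service account's impersonation identity. The helper below is a sketch (the `default` namespace and `spark` account name come from the commands above) that just composes that identity string:

```shell
#!/bin/sh
# Compose the impersonation identity "system:serviceaccount:<ns>:<name>"
# accepted by `kubectl auth can-i ... --as=<identity>` to test RBAC bindings.
sa_identity() {
  ns="$1"   # namespace the service account lives in
  name="$2" # service account name
  printf 'system:serviceaccount:%s:%s\n' "$ns" "$name"
}

# Illustrative check (requires cluster access, so it is left as a comment):
#   kubectl auth can-i create pods --as="$(sa_identity default spark)"
sa_identity default spark
```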
- Submit the application locally:
bin/spark-submit \
--master k8s://172.22.3.107:443 \
--deploy-mode cluster \
--conf spark.kubernetes.namespace=default \
--conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
--name spark-pi \
--class org.apache.spark.examples.SparkPi \
--conf spark.executor.instances=5 \
--conf spark.kubernetes.container.image=xxx/spark:2.4.4 \
local:///opt/spark/examples/jars/spark-examples_2.11-2.4.4.jar
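The long spark-submit invocation above is easy to mistype. A small wrapper can keep the per-cluster values in one place; this is a sketch that reuses the master URL, image name, and jar path from this guide (all overridable via environment variables) and prints the assembled command as a dry run:

```shell
#!/bin/sh
# Hypothetical wrapper around bin/spark-submit for Spark on Kubernetes.
# Defaults are the values used elsewhere in this guide.
K8S_MASTER="${K8S_MASTER:-k8s://172.22.3.107:443}"
IMAGE="${IMAGE:-xxx/spark:2.4.4}"
EXECUTORS="${EXECUTORS:-5}"
APP_JAR="${APP_JAR:-local:///opt/spark/examples/jars/spark-examples_2.11-2.4.4.jar}"

build_submit_args() {
  # Emit one argument per line so callers can inspect the full argument list.
  cat <<EOF
--master ${K8S_MASTER}
--deploy-mode cluster
--conf spark.kubernetes.namespace=default
--conf spark.kubernetes.authenticate.driver.serviceAccountName=spark
--name spark-pi
--class org.apache.spark.examples.SparkPi
--conf spark.executor.instances=${EXECUTORS}
--conf spark.kubernetes.container.image=${IMAGE}
${APP_JAR}
EOF
}

# Dry run: print the command instead of executing it.
echo "bin/spark-submit $(build_submit_args | tr '\n' ' ')"
```

Piping the output through `tr` keeps it on one line; to actually submit, replace the final `echo` with `bin/spark-submit $(build_submit_args | tr '\n' ' ')`.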
- Submit the Spark application from a Docker container:
docker run -it --rm -v $HOME/.kube/config:/root/.kube/config linclaus/spark-submit /opt/spark/bin/spark-submit \
  --master k8s://https://data-extra-data-extraction-5abb9d-e807019a.hcp.chinanorth2.cx.prod.service.azk8s.cn:443 \
  --deploy-mode cluster \
  --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
  --conf spark.kubernetes.namespace=default \
  --name spark-pi \
  --class org.apache.spark.examples.SparkPi \
  --conf spark.executor.instances=1 \
  --conf spark.kubernetes.container.image=linclaus/spark:2.4.4 \
  local:///opt/spark/examples/jars/spark-examples_2.11-2.4.4.jar
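Docker bind mounts require an absolute host path, so passing the kubeconfig to `-v` with a relative path would fail. The helper below is a sketch (the image name follows the command above) that normalizes the path before handing it to `docker run`:

```shell
#!/bin/sh
# Resolve the kubeconfig path to an absolute one before using it with -v;
# docker rejects relative host paths in bind mounts.
resolve_kubeconfig() {
  path="${1:-$HOME/.kube/config}"
  case "$path" in
    /*) ;;                      # already absolute, keep as-is
    *) path="$(pwd)/$path" ;;   # make it absolute relative to the cwd
  esac
  printf '%s\n' "$path"
}

KUBECONFIG_HOST="$(resolve_kubeconfig "$HOME/.kube/config")"
# Illustrative run (requires docker and a reachable cluster):
#   docker run -it --rm -v "${KUBECONFIG_HOST}:/root/.kube/config" \
#     linclaus/spark-submit /opt/spark/bin/spark-submit ...
echo "$KUBECONFIG_HOST"
```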
References:
- Spark mirrors (Tsinghua): https://mirrors.tuna.tsinghua.edu.cn/apache/spark/
- Spark 2.4.4 download: https://www.apache.org/dyn/closer.lua/spark/spark-2.4.4/spark-2.4.4-bin-hadoop2.7.tgz
- Installing kubectl on Linux: https://kubernetes.io/docs/tasks/tools/install-kubectl/#install-kubectl-on-linux
Install kubectl:
curl -LO https://storage.googleapis.com/kubernetes-release/release/v1.16.0/bin/linux/amd64/kubectl
chmod +x ./kubectl
sudo mv ./kubectl /usr/local/bin/kubectl
Install the Azure CLI:
curl -L https://aka.ms/InstallAzureCli | bash