1. Software environment
- CDH 5.16.1
- CentOS 7.6
- Scala 2.11.8
2. Preparation before installation
1. Download the parcel files required by Spark 2.3
http://archive.cloudera.com/spark2/parcels/2.3.0.cloudera4/
Copy the three files (the .parcel file for your OS, its checksum file, and manifest.json) to the /opt/cloudera/parcel-repo directory. If files with the same names already exist there, rename the existing files first.
2. Download the CSD (Custom Service Descriptor) package for Spark 2
http://archive.cloudera.com/spark2/csd/
Copy SPARK2_ON_YARN-2.3.0.cloudera4.jar to the /opt/cloudera/csd directory
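Before copying the parcel into /opt/cloudera/parcel-repo, it can help to confirm that the download is intact. The following is a minimal sketch, not part of the original steps: it assumes the checksum file published next to the parcel contains the parcel's SHA-1 hash as its first token, and the helper names sha1_hex and parcel_matches_checksum are hypothetical.

```python
import hashlib
from pathlib import Path


def sha1_hex(path, chunk_size=1 << 20):
    """Return the SHA-1 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so a large parcel is never loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parcel_matches_checksum(parcel_path, checksum_path):
    """Compare a parcel's SHA-1 with the hash stored in its checksum file.

    Assumption: the first whitespace-separated token of the checksum
    file is the expected hex digest.
    """
    tokens = Path(checksum_path).read_text().split()
    if not tokens:
        return False  # an empty checksum file cannot confirm anything
    return sha1_hex(parcel_path) == tokens[0].lower()
```

Call parcel_matches_checksum with the path of the downloaded .parcel file and the path of its checksum file; a False result suggests the download should be repeated before CM is restarted.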
3. Restart Cloudera Manager (CM) and the cluster
4. Install Spark
In Cloudera Manager, click "Hosts" --> "Parcels" --> "Check for New Parcels" --> "Distribute" --> "Activate"
5. Problem
1. spark2-shell reports an error on startup
Solution: change yarn.scheduler.maximum-allocation-mb and yarn.nodemanager.resource.memory-mb from their default values to 2 GB (2048 MB)
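
The notes do not quote the error message. Assuming it was YARN rejecting the executor's memory request, the arithmetic below sketches why 2 GB is enough. It relies on Spark's documented defaults (spark.executor.memory of 1 GB, and a memory overhead of 10% of executor memory with a 384 MB minimum). The function names are hypothetical, and the sketch ignores YARN rounding requests up to its allocation increment.

```python
def executor_container_mb(executor_memory_mb=1024,
                          overhead_factor=0.10,
                          min_overhead_mb=384):
    """Memory Spark asks YARN for per executor: heap plus overhead."""
    # Assumed Spark defaults: overhead is 10% of the heap, but at least 384 MB.
    overhead_mb = max(min_overhead_mb, int(executor_memory_mb * overhead_factor))
    return executor_memory_mb + overhead_mb


def request_fits(request_mb, max_allocation_mb, nodemanager_memory_mb):
    """True if one container of request_mb can be granted.

    max_allocation_mb     -> yarn.scheduler.maximum-allocation-mb
    nodemanager_memory_mb -> yarn.nodemanager.resource.memory-mb
    """
    return request_mb <= max_allocation_mb and request_mb <= nodemanager_memory_mb


request = executor_container_mb()          # 1024 + 384 = 1408 MB
print(request_fits(request, 1024, 1024))   # False: 1408 MB exceeds a 1 GB limit
print(request_fits(request, 2048, 2048))   # True: 1408 MB fits within 2 GB
```

Under these assumptions a default executor needs 1408 MB, so any limit at or below 1 GB fails while 2048 MB leaves room to spare.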