0x0 Livy installation and operation
Go to the official website: http://livy.incubator.apache.org/
and download the latest version of Livy.
1. Unzip the downloaded archive.
2. Configure: add the following to conf/livy-env.sh:
export SPARK_HOME=path/to/spark
export HADOOP_CONF_DIR=/etc/hadoop/conf
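3. Go to the bin directory and run one of the following:
# Foreground mode: the program's log output is visible in the console
./livy-server
# Background mode
./livy-server start
Either command starts the service.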
Note: If Livy reports that the log file cannot be created, create the directory for that log file and restart Livy.
0x1 Usage
Once the Livy service has started successfully, open a browser and go to:
localhost:8998
This opens the Livy web page.
Livy supports two ways of submitting tasks:
1. Interactive: the statements you would otherwise type into spark-shell are sent to the Livy server in HTTP requests, and Livy starts a spark-shell on the server to execute them.
2. Batch processing: Livy does the spark-submit work for you, and the parameters are likewise sent to the Livy server in an HTTP request.
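To make the batch mode a bit more concrete, here is a minimal Python sketch that builds (but does not send) a request to Livy's /batches endpoint. The jar path, class name, and argument are hypothetical placeholders, and the server address assumes Livy is running locally on its default port.

```python
import json
import urllib.request

LIVY_URL = "http://localhost:8998"  # assumption: Livy on its default port


def build_batch_request(file, class_name=None, args=None, conf=None):
    """Build (without sending) a POST /batches request, the HTTP analogue of spark-submit."""
    payload = {"file": file}
    if class_name is not None:
        payload["className"] = class_name
    if args:
        payload["args"] = list(args)
    if conf:
        payload["conf"] = dict(conf)
    return urllib.request.Request(
        f"{LIVY_URL}/batches",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical jar path and class name, for illustration only.
req = build_batch_request(
    "/path/to/app.jar",
    class_name="com.example.Main",
    args=["10"],
    conf={"spark.executor.memory": "512m"},
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Passing the request to urllib.request.urlopen would actually submit the job. The file path has to be readable by the cluster, so a real deployment would typically point at HDFS or similar storage.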
1.1 Interactive usage
The examples here use Postman. First, create a new session (which in fact starts a Spark application): send a POST request to livy-ip:8998/sessions
The request body looks like this:
{
"kind": "spark",
"conf" : {
"spark.cores.max" : 4,
"spark.executor.memory" : "512m"
}
}
The request body is JSON. In it you can specify the appName, how much memory and how many CPU cores to request, and other settings.
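If you would rather script this than click through Postman, the same request can be sketched in Python with only the standard library. The function names here are my own, and the conf values are written as strings, which is an assumption on my part about what the server handles most reliably.

```python
import json
import urllib.request


def build_create_session_request(livy_url, kind="spark", conf=None):
    """Build the POST /sessions request that asks Livy to start a new session."""
    payload = {"kind": kind, "conf": dict(conf or {})}
    return urllib.request.Request(
        f"{livy_url}/sessions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def create_session(livy_url, kind="spark", conf=None):
    """Send the request and return the session object decoded from the JSON response."""
    req = build_create_session_request(livy_url, kind, conf)
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


# Build the request only; nothing is sent until create_session() is called.
req = build_create_session_request(
    "http://localhost:8998",  # assumption: replace with your livy-ip:8998
    conf={"spark.cores.max": "4", "spark.executor.memory": "512m"},
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Calling create_session("http://localhost:8998", conf={...}) against a running server should return the new session as a dict, including its id.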
You will then receive a response
showing that a session has been opened, here with id=1.
Next, we can send a Scala statement to the session for execution: send a POST request to livy-ip:8998/sessions/{id}/statements, with the statement in the request body.
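As a rough sketch of this step in Python: Livy's REST API takes the statement as a JSON body with a code field. The helper names below are made up for illustration, and the Scala snippet "1 + 1" is just a placeholder statement.

```python
import json
import urllib.request


def build_statement_request(livy_url, session_id, code):
    """Build a POST request that submits one snippet of code to an existing session."""
    return urllib.request.Request(
        f"{livy_url}/sessions/{session_id}/statements",
        data=json.dumps({"code": code}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_statement(livy_url, session_id, code):
    """Send the statement and return the statement id from the JSON response."""
    req = build_statement_request(livy_url, session_id, code)
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["id"]


# Build the request for session 1 without sending it.
req = build_statement_request("http://localhost:8998", 1, "1 + 1")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Note that the session has to finish starting before it will execute statements, so a script would normally wait a moment after creating the session before calling run_statement.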
We get a response
in which id=0. This is the id of the statement.
Then send another request to view the output: a GET request to livy-ip:8998/sessions/1/statements/0
In the response,
the data field inside output holds the actual result, and progress shows how far execution has got. If progress is less than 1, the task has not finished yet.
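Since the result may not be ready on the first GET, a script usually has to poll. Below is a minimal sketch of such a loop. It checks the statement's state field rather than progress, on the assumption that "available", "error", and "cancelled" are the states in which a statement is finished. The function names and the canned responses in the demo are invented for illustration, not real server output.

```python
import json
import time
import urllib.request


def fetch_statement(livy_url, session_id, statement_id):
    """GET one statement from Livy and return it as a dict."""
    url = f"{livy_url}/sessions/{session_id}/statements/{statement_id}"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def wait_for_output(fetch, interval=1.0, max_polls=60):
    """Call fetch() repeatedly until the statement is finished, then return its output."""
    for _ in range(max_polls):
        statement = fetch()
        if statement.get("state") in ("available", "error", "cancelled"):
            return statement.get("output")
        time.sleep(interval)
    raise TimeoutError("statement did not finish in time")


# Demo with canned responses (made up for illustration) instead of a live server.
_fake_responses = iter([
    {"id": 0, "state": "running", "progress": 0.5, "output": None},
    {"id": 0, "state": "available", "progress": 1.0,
     "output": {"status": "ok", "execution_count": 0,
                "data": {"text/plain": "res0: Int = 2"}}},
])
output = wait_for_output(lambda: next(_fake_responses), interval=0)
print(output["data"]["text/plain"])
```

Against a real server you would pass something like lambda: fetch_statement("http://localhost:8998", 1, 0) as the fetch argument.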