Using the REST interface to call Spark - Apache Livy usage notes

0x0 Livy installation and operation

Go to the official website, http://livy.incubator.apache.org/,
and download the latest version of Livy.
1. Unzip the archive.
2. Configuration: add the following to conf/livy-env.sh:

export SPARK_HOME=path/to/spark
export HADOOP_CONF_DIR=/etc/hadoop/conf
3. Go to the bin directory and run one of:
# Foreground mode: the server log is printed to the console
./livy-server
# Background mode
./livy-server start

Either command starts the service.
Note: If Livy reports that it cannot create the log file, create the directory for that log file and restart Livy.
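
A quick way to confirm the server is up is to query its REST interface. The following is a minimal sketch using only the Python standard library; the helper names are made up for illustration, and it assumes Livy is listening on localhost:8998 as in the rest of these notes.

```python
import json
import urllib.request

# Assumption: Livy runs on this machine on port 8998.
LIVY_URL = "http://localhost:8998"


def sessions_url(base_url: str = LIVY_URL) -> str:
    """Build the URL of the endpoint that lists sessions."""
    return base_url.rstrip("/") + "/sessions"


def list_sessions(base_url: str = LIVY_URL, timeout: float = 5.0) -> dict:
    """Send GET /sessions and return the parsed JSON response.

    Raises urllib.error.URLError if the server cannot be reached.
    """
    with urllib.request.urlopen(sessions_url(base_url), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `list_sessions()` right after startup should return a JSON object describing the sessions known to the server; a connection error usually means the server is not running yet.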

0x1 Usage

Once the Livy service has started successfully, open a browser and go to:
localhost:8998
This opens the Livy web page.

Livy supports two ways of submitting tasks:
1. Interactive: the statements you would otherwise type into spark-shell are sent to the Livy server over HTTP, and Livy starts a spark-shell on the server to execute the statements you sent;
2. Batch processing: Livy does the spark-submit work for you, with the parameters also sent to the Livy server over HTTP.
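
Batch mode is not covered further in these notes, so here is a rough sketch of what such a request could look like, based on the `/batches` endpoint of the Livy REST API. The jar path, class name and arguments in the usage example are placeholders, and `build_batch_request` is a hypothetical helper, not part of Livy.

```python
import json
import urllib.request


def build_batch_request(livy_url, file, class_name=None, args=None, conf=None):
    """Build (but do not send) a POST /batches request.

    file       -- path of the application jar or script (placeholder in the example)
    class_name -- main class, the counterpart of spark-submit --class
    args       -- command line arguments for the application
    conf       -- Spark configuration properties
    """
    body = {"file": file}
    if class_name:
        body["className"] = class_name
    if args:
        body["args"] = list(args)
    if conf:
        body["conf"] = dict(conf)
    return urllib.request.Request(
        livy_url.rstrip("/") + "/batches",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send a prepared request and return the parsed JSON response."""
    with urllib.request.urlopen(request) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

For example, `send(build_batch_request("http://livy-ip:8998", "/path/to/app.jar", class_name="com.example.Main", args=["10"]))` would submit the job, assuming the jar is at a location the Livy server can read.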

1.1 Interactive usage

This walkthrough uses Postman. First create a new session (which in fact starts a Spark application): the request method is POST, and the request URL is livy-ip:8998/sessions
The request body looks like this:

{
    "kind": "spark",
    "conf" : {
        "spark.cores.max" : 4,
        "spark.executor.memory" : "512m"
    }
}

The request is sent from Postman as a POST with a JSON body. In the JSON you can specify the appName, how much memory to request, how many CPU cores to request, and other settings.
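
The same request can be sent without Postman. Below is a minimal sketch using the Python standard library; `build_create_session_request` and `send` are hypothetical helper names, and livy-ip stands for the address of your Livy server as elsewhere in these notes.

```python
import json
import urllib.request


def build_create_session_request(livy_url, kind="spark", conf=None):
    """Build (but do not send) a POST /sessions request.

    kind -- session kind, "spark" for a Scala session as in the body above
    conf -- Spark configuration properties to start the application with
    """
    body = {"kind": kind}
    if conf:
        body["conf"] = dict(conf)
    return urllib.request.Request(
        livy_url.rstrip("/") + "/sessions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send a prepared request and return the parsed JSON response."""
    with urllib.request.urlopen(request) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Usage, mirroring the request body shown earlier: `session = send(build_create_session_request("http://livy-ip:8998", conf={"spark.cores.max": 4, "spark.executor.memory": "512m"}))`. The returned JSON should contain the session id that later requests refer to.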
You will then receive a response showing that a session has been opened, with id=1.
Next, send a Scala statement to the session for execution. The request method is POST, and the request URL is livy-ip:8998/sessions/{id}/statements.
The response shows id=0, which is the id of the statement.
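
The request body for this step was only shown in a screenshot. According to the Livy REST API, the statement is passed in a `code` field, so a sketch of the call could look like the following; the helper name and the Scala snippet `1 + 1` are just examples.

```python
import json
import urllib.request


def build_statement_request(livy_url, session_id, code):
    """Build (but do not send) a POST /sessions/{id}/statements request.

    code -- the Scala source to run in the session, as a plain string
    """
    url = f"{livy_url.rstrip('/')}/sessions/{session_id}/statements"
    return urllib.request.Request(
        url,
        data=json.dumps({"code": code}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send a prepared request and return the parsed JSON response."""
    with urllib.request.urlopen(request) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

For the session created above this would be `send(build_statement_request("http://livy-ip:8998", 1, "1 + 1"))`. The session may need a moment to finish starting before it accepts statements.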
Then send another request to view the output. The request method is GET, and the request URL is livy-ip:8998/sessions/1/statements/0.
In the response, the data field inside output holds the actual result, and progress is the execution progress. If progress is less than 1, the task has not finished yet.
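
Because a statement may still be running when you first ask for it, a client usually has to poll. The sketch below repeats the GET until the statement reaches a finished state; it assumes the `state` values listed in the Livy REST API documentation (`available`, `error`, `cancelled` as end states), and the function names are made up for illustration.

```python
import json
import time
import urllib.request

# Assumption: these are the states after which a statement no longer changes.
FINISHED_STATES = {"available", "error", "cancelled"}


def fetch_statement(livy_url, session_id, statement_id):
    """Send GET /sessions/{id}/statements/{id} and return the parsed JSON."""
    url = f"{livy_url.rstrip('/')}/sessions/{session_id}/statements/{statement_id}"
    with urllib.request.urlopen(url) as resp:
        return json.loads(resp.read().decode("utf-8"))


def wait_for_output(fetch, interval=1.0, max_polls=60):
    """Call fetch() until the statement has finished, then return its output.

    fetch is any zero-argument callable returning the statement JSON as a dict,
    which keeps the polling logic independent of the HTTP call.
    """
    for _ in range(max_polls):
        statement = fetch()
        if statement.get("state") in FINISHED_STATES:
            return statement.get("output")
        time.sleep(interval)
    raise TimeoutError("statement did not finish in time")
```

Usage: `output = wait_for_output(lambda: fetch_statement("http://livy-ip:8998", 1, 0))`, after which `output["data"]` holds the result described above.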

To be continued
