OtterTune source code analysis

To make it easier to modify (hack) OtterTune later on, we first need to understand its source code and pipeline structure.

 

OtterTune is divided into two parts:

  Server side: a MySQL database (stores the tuning data used by the ML models), Django (the frontend web UI), and Celery (schedules the ML tasks; see the sketch below);

  Client side: the target DBMS (holds the user's data and serves the workload; multiple DBMSs are supported), the Controller (collects information from and controls the target DBMS), and the Driver (invokes the Controller; its entry point is fabfile.py).
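As a quick illustration of the server-side scheduling piece: a Celery task is just a decorated Python function that a worker runs asynchronously. The sketch below is generic and not taken from the OtterTune source; the task name and body are placeholders:

from celery import shared_task

@shared_task
def run_ml_pipeline(result_id):
    # Placeholder for the kind of work OtterTune's server schedules with Celery:
    # load the uploaded result from the MySQL database, run the ML pipeline,
    # and store the recommended configuration for the client to fetch.
    pass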

 

For the meaning of each operation code (S1-S4, C1-C6), see: https://www.cnblogs.com/pdev/p/10903628.html

 

Client Side

1. Driver

The Driver is the entry point for the user on the client side. The user does not run Controller commands directly; instead, the Controller is driven by the Driver.

The Driver is written with Python's fabric library and comes with many preset commands (such as switching the target DBMS, running oltpbench, and so on); a minimal example of such a task is sketched below.
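For readers who have not used fabric: a task is just a decorated Python function that is invoked from the command line with fab <task_name>. A minimal sketch (illustrative only, not taken from the OtterTune source; fabric 1.x API):

from fabric.api import local, task

@task
def restart_database():
    # Illustrative example: restart a local PostgreSQL instance,
    # the same shell command the Driver's restart step uses below.
    local('sudo service postgresql restart')

Running fab restart_database executes the shell command locally; fab loop and fab run_loops below are invoked the same way.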

fabfile.py

This is the core of the Driver. The last client-side operation (C6) invokes this file.

In operation C6, fab loop and fab run_loops are used to make the client periodically collect knob/metric samples from the target DBMS (in each loop, it collects target DBMS info, uploads it to the server, gets a new recommended configuration, installs the config, and restarts the DBMS; users can keep running loops until they are satisfied with the recommended configuration).

  fab loop               runs a single loop.

@task
def loop():
    # free cache, clean Linux PageCache
    free_cache()

    # remove oltpbench log and controller log
    clean_logs()

    # restart database, shell "sudo service postgresql restart"
    restart_database()

    # check whether there is enough free space on disk
    if check_disk_usage() > MAX_DISK_USAGE:
        LOG.warning('Exceeds max disk usage %s', MAX_DISK_USAGE)

    # run controller as another process. Run the following command line in "../controller" folder:
    # sudo gradle run -PappArgs="-c CONF_controller_config -d output/" --no-daemon > CONF_controller_log
    p = Process(target=run_controller, args=())
    p.start()
    LOG.info('Run the controller')

    # check whether the controller is ready (has created the log files)
    while not _ready_to_start_oltpbench():
        pass
    # run oltpbench as a background job. Run the following command line in CONF_oltpbench_home folder:
    # ./oltpbenchmark -b CONF_oltpbench_workload -c CONF_oltpbench_config --execute=true -s 5 -o outputfile > CONF_oltpbench_log 2>&1 &
    run_oltpbench_bg()
    LOG.info('Run OLTP-Bench')

    # the controller starts the first collection

    # check whether 'Warmup complete, starting measurements' is in CONF_oltpbench_log file
    while not _ready_to_start_controller():
        pass
    # shell 'sudo kill -2 CTL_PID'
    # send SIGINT to the process CTL_PID, where CTL_PID is the content of '../controller/pid.txt'
    signal_controller()
    LOG.info('Start the first collection')

    # stop the experiment

    # check whether 'Output Raw data into file' is in CONF_oltpbench_log file
    while not _ready_to_shut_down_controller():
        pass
    # shell 'sudo kill -2 CTL_PID'
    # send SIGINT to the process CTL_PID, where CTL_PID is the content of '../controller/pid.txt'
    signal_controller()
    LOG.info('Start the second collection, shut down the controller')

    p.join()

    # add user defined target objective
    # add_udf()

    # save result files: 'knobs.json', 'metrics_after.json', 'metrics_before.json', 'summary.json'
    save_dbms_result()

    # upload result to Django web interface
    upload_result()

    # get result
    # shell 'python3 ../../script/query_and_get.py CONF_upload_url CONF_upload_code 5'
    get_result()

    # change target DBMS config
    # shell 'sudo python3 PostgresConf.py next_config CONF_database_conf'
    change_conf()
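The helpers called in loop() are thin wrappers around the shell commands quoted in the comments. Here is a rough sketch of what run_controller() and signal_controller() boil down to (simplified, not the exact OtterTune implementation; CONF is assumed to be a dict holding the CONF_* settings referenced above):

from fabric.api import lcd, local

def run_controller():
    # Launch the controller via gradle inside the controller directory;
    # stdout is redirected to the controller log (see the comment in loop()).
    with lcd('../controller'):
        local('sudo gradle run -PappArgs="-c {} -d output/" --no-daemon > {}'.format(
            CONF['controller_config'], CONF['controller_log']))

def signal_controller():
    # Send SIGINT (kill -2) to the controller; it writes its PID to
    # ../controller/pid.txt when it starts.
    with open('../controller/pid.txt', 'r') as f:
        pid = int(f.read())
    local('sudo kill -2 {}'.format(pid))

In loop(), run_controller() runs in a separate Process so that oltpbench can execute concurrently, and signal_controller() is called twice: once to trigger the first metric collection and once to take the second snapshot and shut the controller down.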

 

 fab run_loops:max_iter=10    runs 10 loops. Set max_iter to change the maximum number of iterations.

# interval (in loops) at which the database is restored
RELOAD_INTERVAL = 10

@task
def run_loops(max_iter=1):
    # dump the database if it has not been dumped before.
    # shell 'PGPASSWORD=CONF_password pg_dump -U CONF_username -F c -d CONF_database_name > CONF_database_save_path/CONF_database_name.dump'
    dump = dump_database()

    for i in range(int(max_iter)):
        # restore the database every RELOAD_INTERVAL loops
        # shell these operations:
        #     PGPASSWORD=CONF_password dropdb -e --if-exists CONF_database_name -U CONF_username
        #     PGPASSWORD=CONF_password createdb -e CONF_database_name -U CONF_username
        #     PGPASSWORD=CONF_password pg_restore -U CONF_username -j 8 -F c -d CONF_database_name CONF_database_save_path/CONF_database_name.dump
        if RELOAD_INTERVAL > 0:
            if i % RELOAD_INTERVAL == 0:
                if i == 0 and dump is False:
                    restore_database()
                elif i > 0:
                    restore_database()

        LOG.info('The %s-th Loop Starts / Total Loops %s', i + 1, max_iter)
        loop()
        LOG.info('The %s-th Loop Ends / Total Loops %s', i + 1, max_iter)
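dump_database() and restore_database() similarly wrap the pg_dump / dropdb / createdb / pg_restore commands shown in the comments. A simplified sketch (again assuming a CONF dict with the CONF_* settings; the real code differs in detail):

import os

from fabric.api import local

def dump_database():
    # Dump the target database once; return False if a dump file already exists.
    dumpfile = os.path.join(CONF['database_save_path'], CONF['database_name'] + '.dump')
    if os.path.exists(dumpfile):
        return False
    local('PGPASSWORD={} pg_dump -U {} -F c -d {} > {}'.format(
        CONF['password'], CONF['username'], CONF['database_name'], dumpfile))
    return True

def restore_database():
    # Drop, recreate, and restore the target database from the dump file.
    dumpfile = os.path.join(CONF['database_save_path'], CONF['database_name'] + '.dump')
    local('PGPASSWORD={} dropdb -e --if-exists {} -U {}'.format(
        CONF['password'], CONF['database_name'], CONF['username']))
    local('PGPASSWORD={} createdb -e {} -U {}'.format(
        CONF['password'], CONF['database_name'], CONF['username']))
    local('PGPASSWORD={} pg_restore -U {} -j 8 -F c -d {} {}'.format(
        CONF['password'], CONF['username'], CONF['database_name'], dumpfile))

This also explains the branching in run_loops(): if the dump file already existed (dump is False), the database is restored before the very first loop as well; otherwise restores only happen at later multiples of RELOAD_INTERVAL.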

 

 

Server Side


 

