Nebula Graph Source Code Interpretation Series|Vol.05 Scheduler and Executor brothers

This article was first published on the Nebula Graph Community public account

In the previous article, we covered the Optimizer of the Query Engine. In this article, we explain the remaining parts of the Query Engine: the Scheduler and the Executor.

Overview

In the execution phase, the execution engine uses the Scheduler to convert the physical execution plan generated by the Planner into a series of Executors and to drive their execution. Each PlanNode in the physical execution plan corresponds to one Executor.

Source code location

The source code of the scheduler is in the src/scheduler directory:

src/scheduler
├── AsyncMsgNotifyBasedScheduler.cpp
├── AsyncMsgNotifyBasedScheduler.h
├── CMakeLists.txt
├── Scheduler.cpp
└── Scheduler.h

The Scheduler abstract class defines the public interface of the scheduler and can be subclassed to implement different schedulers. The only scheduler currently implemented is AsyncMsgNotifyBasedScheduler, which is built on asynchronous message notification and a breadth-first traversal, and in this way avoids stack overflow.

The source code of the executor is in the src/executor directory:

src/executor
├── admin
├── algo
├── CMakeLists.txt
├── ExecutionError.h
├── Executor.cpp
├── Executor.h
├── logic
├── maintain
├── mutate
├── query
├── StorageAccessExecutor.cpp
├── StorageAccessExecutor.h
└── test

Implementation process

First, the scheduler traverses the entire execution plan from its root node using breadth-first search and builds the message notification mechanism between nodes according to their execution dependencies. During execution, each node is scheduled once it has received the message that all the nodes it depends on have finished. When its own execution completes, it sends a message to the nodes that depend on it, and this continues until the entire plan has been executed.

void AsyncMsgNotifyBasedScheduler::runExecutor(
    std::vector<folly::Future<Status>>&& futures,
    Executor* exe,
    folly::Executor* runner,
    std::vector<folly::Promise<Status>>&& promises) const {
    folly::collect(futures).via(runner).thenTry(
        [exe, pros = std::move(promises), this](auto&& t) mutable {
            if (t.hasException()) {
                return notifyError(pros, Status::Error(t.exception().what()));
            }
            auto status = std::move(t).value();
            auto depStatus = checkStatus(std::move(status));
            if (!depStatus.ok()) {
                return notifyError(pros, depStatus);
            }
            // Execute in current thread.
            std::move(execute(exe)).thenTry(
                [pros = std::move(pros), this](auto&& exeTry) mutable {
                    if (exeTry.hasException()) {
                        return notifyError(pros, Status::Error(exeTry.exception().what()));
                    }
                    auto exeStatus = std::move(exeTry).value();
                    if (!exeStatus.ok()) {
                        return notifyError(pros, exeStatus);
                    }
                    return notifyOK(pros);
                });
        });
}
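
The scheduling idea described above can be illustrated with a much simpler, single-threaded sketch. It is not the Nebula implementation: the real scheduler wires nodes together with folly promises and futures and runs them asynchronously, whereas this version uses plain queues. Node and schedule are hypothetical names introduced only for illustration, and node ids are assumed to equal their index in the vector.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <vector>

// Hypothetical stand-in for a PlanNode: an id plus the ids of the nodes it depends on.
// Assumption: nodes[i].id == i.
struct Node {
    int id;
    std::vector<int> deps;
};

// Returns the order in which the nodes reachable from `root` would be executed.
std::vector<int> schedule(const std::vector<Node>& nodes, int root) {
    const std::size_t n = nodes.size();
    assert(root >= 0 && static_cast<std::size_t>(root) < n);

    std::vector<std::vector<int>> dependents(n);  // whom to notify when a node finishes
    std::vector<std::size_t> pending(n, 0);       // unfinished dependencies per node
    std::vector<bool> visited(n, false);
    std::queue<int> ready;                        // nodes whose dependencies are all done

    // Step 1: breadth-first traversal from the root to wire up the notifications.
    std::queue<int> bfs;
    bfs.push(root);
    visited[root] = true;
    while (!bfs.empty()) {
        int cur = bfs.front();
        bfs.pop();
        pending[cur] = nodes[cur].deps.size();
        if (pending[cur] == 0) {
            ready.push(cur);
        }
        for (int dep : nodes[cur].deps) {
            dependents[dep].push_back(cur);
            if (!visited[dep]) {
                visited[dep] = true;
                bfs.push(dep);
            }
        }
    }

    // Step 2: "execute" ready nodes and notify the nodes that depend on them.
    std::vector<int> order;
    while (!ready.empty()) {
        int cur = ready.front();
        ready.pop();
        order.push_back(cur);
        for (int parent : dependents[cur]) {
            if (--pending[parent] == 0) {
                ready.push(parent);
            }
        }
    }
    return order;
}
```

Because both steps use explicit queues rather than recursion, the depth of the plan does not affect the call stack in this sketch.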

Each Executor will go through four stages of create-open-execute-close:

create

Generate the corresponding Executor according to the node type.
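
As a rough illustration of this step, the following is a minimal factory sketch. The Kind enum, the PlanNode struct, and the executor classes here are simplified, hypothetical stand-ins and do not reproduce the actual Nebula class hierarchy or its creation function.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical, reduced set of plan node kinds.
enum class Kind { kStart, kProject, kFilter };

struct PlanNode {
    Kind kind;
};

struct Executor {
    virtual ~Executor() = default;
    virtual std::string name() const = 0;
};

struct StartExecutor : Executor {
    std::string name() const override { return "StartExecutor"; }
};

struct ProjectExecutor : Executor {
    std::string name() const override { return "ProjectExecutor"; }
};

struct FilterExecutor : Executor {
    std::string name() const override { return "FilterExecutor"; }
};

// Picks the executor type that matches the kind of the plan node.
std::unique_ptr<Executor> makeExecutor(const PlanNode& node) {
    switch (node.kind) {
        case Kind::kStart:
            return std::make_unique<StartExecutor>();
        case Kind::kProject:
            return std::make_unique<ProjectExecutor>();
        case Kind::kFilter:
            return std::make_unique<FilterExecutor>();
    }
    assert(false && "unhandled plan node kind");
    return nullptr;
}
```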

open

Perform initialization before the Executor actually runs, together with the slow-query termination check and the memory watermark check. Nebula supports manually cancelling a running query statement with kill, so before executing, each Executor checks the status of the current execution plan and terminates if the plan has been marked as killed. Before each Query-type Executor runs, it also checks whether the memory used by the system has reached the memory watermark. If it has, execution is terminated, which helps avoid OOM to some extent.

Status Executor::open() {
    if (qctx_->isKilled()) {
        VLOG(1) << "Execution is being killed. session: " << qctx()->rctx()->session()->id()
            << "ep: " << qctx()->plan()->id()
            << "query: " << qctx()->rctx()->query();
        return Status::Error("Execution had been killed");
    }
    auto status = MemInfo::make();
    NG_RETURN_IF_ERROR(status);
    auto mem = std::move(status).value();
    if (node_->isQueryNode() && mem->hitsHighWatermark(FLAGS_system_memory_high_watermark_ratio)) {
        return Status::Error(
            "Used memory(%ldKB) hits the high watermark(%lf) of total system memory(%ldKB).",
            mem->usedInKB(),
            FLAGS_system_memory_high_watermark_ratio,
            mem->totalInKB());
    }
    numRows_ = 0;
    execTime_ = 0;
    totalDuration_.reset();
    return Status::OK();
}
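
The two checks in open() can be condensed into a small sketch. MemSnapshot, OpenResult, and checkBeforeRun are hypothetical names, and the comparison used here (used memory greater than total memory times the ratio) is an assumption about how a watermark check might be written, not a copy of MemInfo.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical snapshot of system memory, in KB.
struct MemSnapshot {
    std::uint64_t totalKB;
    std::uint64_t freeKB;

    std::uint64_t usedKB() const {
        assert(freeKB <= totalKB);
        return totalKB - freeKB;
    }

    // Assumed rule: the watermark is hit when used > total * ratio.
    bool hitsHighWatermark(double ratio) const {
        return static_cast<double>(usedKB()) > static_cast<double>(totalKB) * ratio;
    }
};

enum class OpenResult { kOk, kKilled, kMemoryExceeded };

// Mirrors the order of checks in open(): the kill flag first, then the
// memory watermark, which only applies to query nodes.
OpenResult checkBeforeRun(bool killed, bool isQueryNode, const MemSnapshot& mem, double ratio) {
    if (killed) {
        return OpenResult::kKilled;
    }
    if (isQueryNode && mem.hitsHighWatermark(ratio)) {
        return OpenResult::kMemoryExceeded;
    }
    return OpenResult::kOk;
}
```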

execute

The input and output of a Query-type Executor are both tables (DataSet). Execution is based on the iterator model: each call to the iterator's next() method advances to one row of data, the computation is performed on that row, and this repeats until the whole input table has been traversed. The computed results form a new table, which is passed as output to the subsequent Executor.

folly::Future<Status> ProjectExecutor::execute() {
    SCOPED_TIMER(&execTime_);
    auto* project = asNode<Project>(node());
    auto columns = project->columns()->columns();
    auto iter = ectx_->getResult(project->inputVar()).iter();
    DCHECK(!!iter);
    QueryExpressionContext ctx(ectx_);

    VLOG(1) << "input: " << project->inputVar();
    DataSet ds;
    ds.colNames = project->colNames();
    ds.rows.reserve(iter->size());
    for (; iter->valid(); iter->next()) {
        Row row;
        for (auto& col : columns) {
            Value val = col->expr()->eval(ctx(iter.get()));
            row.values.emplace_back(std::move(val));
        }
        ds.rows.emplace_back(std::move(row));
    }
    VLOG(1) << node()->outputVar() << ":" << ds;
    return finish(ResultBuilder().value(Value(std::move(ds))).finish());
}
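
The same valid()/next() loop can be shown without any Nebula types. In this sketch, Table, RowIter, ColumnExpr, and project are hypothetical simplifications: rows hold plain integers, and a column expression is just a function of the current row.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

using Row = std::vector<int>;

// Hypothetical, simplified table: column names plus rows of integers.
struct Table {
    std::vector<std::string> colNames;
    std::vector<Row> rows;
};

// Minimal row iterator with the valid()/next() shape used in the loop above.
class RowIter {
public:
    explicit RowIter(const Table& table) : table_(table) {}
    bool valid() const { return pos_ < table_.rows.size(); }
    void next() { ++pos_; }
    const Row& row() const {
        assert(valid());
        return table_.rows[pos_];
    }

private:
    const Table& table_;
    std::size_t pos_ = 0;
};

// A column expression computes one output value from the current input row.
using ColumnExpr = std::function<int(const Row&)>;

// Evaluates every expression against every input row and collects a new table.
Table project(const Table& input,
              const std::vector<std::string>& outNames,
              const std::vector<ColumnExpr>& exprs) {
    Table out;
    out.colNames = outNames;
    out.rows.reserve(input.rows.size());
    for (RowIter it(input); it.valid(); it.next()) {
        Row result;
        result.reserve(exprs.size());
        for (const auto& expr : exprs) {
            result.push_back(expr(it.row()));
        }
        out.rows.push_back(std::move(result));
    }
    return out;
}
```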

If the input table of the current Executor is not used as input by any other Executor, the memory it occupies is released during the execution phase to reduce memory usage.

void Executor::drop() {
    for (const auto &inputVar : node()->inputVars()) {
        if (inputVar != nullptr) {
            // Make sure use the variable happened-before decrement count
            if (inputVar->userCount.fetch_sub(1, std::memory_order_release) == 1) {
                // Make sure drop happened-after count decrement
                CHECK_EQ(inputVar->userCount.load(std::memory_order_acquire), 0);
                ectx_->dropResult(inputVar->name);
                VLOG(1) << "Drop variable " << inputVar->name;
            }
        }
    }
}
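
The reference-counting idea behind drop() can be sketched in isolation. Variable, ResultStore, and releaseInput are hypothetical names. Unlike the code above, this sketch uses a single acq_rel decrement instead of a release decrement followed by an acquire load, and the store itself is not thread-safe, so it shows only the counting logic.

```cpp
#include <atomic>
#include <cassert>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical variable descriptor: a name plus the number of executors
// that still need to read the result stored under that name.
struct Variable {
    Variable(std::string n, int users) : name(std::move(n)), userCount(users) {}
    std::string name;
    std::atomic<int> userCount;
};

// Hypothetical result store keyed by variable name (not thread-safe).
class ResultStore {
public:
    void set(const std::string& name, std::vector<int> rows) {
        results_[name] = std::move(rows);
    }
    bool has(const std::string& name) const { return results_.count(name) != 0; }
    void drop(const std::string& name) { results_.erase(name); }

private:
    std::unordered_map<std::string, std::vector<int>> results_;
};

// Called by a consumer once it has finished reading the variable.
// Returns true if this call was the last user and released the result.
bool releaseInput(Variable& var, ResultStore& store) {
    int before = var.userCount.fetch_sub(1, std::memory_order_acq_rel);
    assert(before >= 1);
    if (before == 1) {
        store.drop(var.name);
        return true;
    }
    return false;
}
```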

close

After the Executor finishes, the execution information it collected, such as the execution time and the number of rows in the output table, is added to the profiling stats. Users can view these statistics in the execution plan displayed after a PROFILE statement.


Execution Plan (optimize time 141 us)

-----+------------------+--------------+-----------------------------------------------------+--------------------------------------
| id | name             | dependencies | profiling data                                      | operator info                       |
-----+------------------+--------------+-----------------------------------------------------+--------------------------------------
|  2 | Project          | 3            | ver: 0, rows: 56, execTime: 147us, totalTime: 160us | outputVar: [                        |
|    |                  |              |                                                     |   {                                 |
|    |                  |              |                                                     |     "colNames": [                   |
|    |                  |              |                                                     |       "VertexID",                   |
|    |                  |              |                                                     |       "player.age"                  |
|    |                  |              |                                                     |     ],                              |
|    |                  |              |                                                     |     "name": "__Project_2",          |
|    |                  |              |                                                     |     "type": "DATASET"               |
|    |                  |              |                                                     |   }                                 |
|    |                  |              |                                                     | ]                                   |
|    |                  |              |                                                     | inputVar: __TagIndexFullScan_1      |
|    |                  |              |                                                     | columns: [                          |
|    |                  |              |                                                     |   "$-.VertexID AS VertexID",        |
|    |                  |              |                                                     |   "player.age"                      |
|    |                  |              |                                                     | ]                                   |
-----+------------------+--------------+-----------------------------------------------------+--------------------------------------
|  3 | TagIndexFullScan | 0            | ver: 0, rows: 56, execTime: 0us, totalTime: 6863us  | outputVar: [                        |
|    |                  |              |                                                     |   {                                 |
|    |                  |              |                                                     |     "colNames": [                   |
|    |                  |              |                                                     |       "VertexID",                   |
|    |                  |              |                                                     |       "player.age"                  |
|    |                  |              |                                                     |     ],                              |
|    |                  |              |                                                     |     "name": "__TagIndexFullScan_1", |
|    |                  |              |                                                     |     "type": "DATASET"               |
|    |                  |              |                                                     |   }                                 |
|    |                  |              |                                                     | ]                                   |
|    |                  |              |                                                     | inputVar:                           |
|    |                  |              |                                                     | space: 318                          |
|    |                  |              |                                                     | dedup: false                        |
|    |                  |              |                                                     | limit: 9223372036854775807          |
|    |                  |              |                                                     | filter:                             |
|    |                  |              |                                                     | orderBy: []                         |
|    |                  |              |                                                     | schemaId: 319                       |
|    |                  |              |                                                     | isEdge: false                       |
|    |                  |              |                                                     | returnCols: [                       |
|    |                  |              |                                                     |   "_vid",                           |
|    |                  |              |                                                     |   "age"                             |
|    |                  |              |                                                     | ]                                   |
|    |                  |              |                                                     | indexCtx: [                         |
|    |                  |              |                                                     |   {                                 |
|    |                  |              |                                                     |     "columnHints": [],              |
|    |                  |              |                                                     |     "index_id": 325,                |
|    |                  |              |                                                     |     "filter": ""                    |
|    |                  |              |                                                     |   }                                 |
|    |                  |              |                                                     | ]                                   |
-----+------------------+--------------+-----------------------------------------------------+--------------------------------------
|  0 | Start            |              | ver: 0, rows: 0, execTime: 1us, totalTime: 19us     | outputVar: [                        |
|    |                  |              |                                                     |   {                                 |
|    |                  |              |                                                     |     "colNames": [],                 |
|    |                  |              |                                                     |     "type": "DATASET",              |
|    |                  |              |                                                     |     "name": "__Start_0"             |
|    |                  |              |                                                     |   }                                 |
|    |                  |              |                                                     | ]                                   |
-----+------------------+--------------+-----------------------------------------------------+--------------------------------------  
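
How such statistics might be gathered can be sketched with a scope-based timer, similar in spirit to the SCOPED_TIMER macro seen in ProjectExecutor::execute(). ProfilingStats, ScopedTimer, and describe are hypothetical names, and the output format only imitates the "profiling data" column above.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <string>

// Hypothetical per-executor statistics, loosely mirroring the plan output.
struct ProfilingStats {
    std::int64_t rows = 0;
    std::uint64_t execTimeUs = 0;
    std::uint64_t totalTimeUs = 0;
};

// Adds the elapsed time of the enclosing scope to *target when destroyed.
class ScopedTimer {
public:
    explicit ScopedTimer(std::uint64_t* target)
        : target_(target), start_(std::chrono::steady_clock::now()) {
        assert(target != nullptr);
    }
    ~ScopedTimer() {
        auto elapsed = std::chrono::steady_clock::now() - start_;
        *target_ += static_cast<std::uint64_t>(
            std::chrono::duration_cast<std::chrono::microseconds>(elapsed).count());
    }
    ScopedTimer(const ScopedTimer&) = delete;
    ScopedTimer& operator=(const ScopedTimer&) = delete;

private:
    std::uint64_t* target_;
    std::chrono::steady_clock::time_point start_;
};

// Formats the statistics in a style similar to the profiling data column.
std::string describe(const ProfilingStats& s) {
    return "rows: " + std::to_string(s.rows) +
           ", execTime: " + std::to_string(s.execTimeUs) +
           "us, totalTime: " + std::to_string(s.totalTimeUs) + "us";
}
```

Placing a ScopedTimer at the top of a function, for example `ScopedTimer timer(&stats.execTimeUs);`, accumulates the time spent in that function each time it runs.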

This concludes the source code walkthrough of the Query Engine modules. Some individual features will be explained in later articles.

