How does openGemini develop a string operator?

For background on the material in this article, see the article: openGemini analysis: query engine framework

This article shows how to add a built-in function (operator) to the openGemini kernel. We use a hello operator as the example: for each field value it returns "hello, $field_value".

I hope it helps you become familiar with the operator development process and lays a foundation for reading the kernel source code.

Develop the hello operator

Here we implement a HELLO() function that takes a single string parameter. To follow along, clone the openGemini/openGemini repository (see https://docs.opengemini.org/zh/dev-guide/get_started/build_source_code.html).

Currently, a query that uses the hello operator reports the following error:

> select hello("name") from mst
ERR: undefined function hello()

Now let's start modifying the source code:

The first step is to add the name of the defined function, "hello", in open_src/influx/query/compile.go:

func isStringFunction(call *influxql.Call) bool {
  switch call.Name {
  case "str", "strlen", "substr", "hello":
    return true
  }
  ......
}

The next step is to modify engine/executor/schema.go and add "hello"

func (qs *QuerySchema) isStringFunction(call *influxql.Call) bool {
  switch call.Name {
  case "str", "strlen", "substr", "hello":
    return true
  }
  ......
}

Then modify open_src/influx/query/functions.go and add the CallType of "hello"

func (m StringFunctionTypeMapper) CallType(name string, _ []influxql.DataType) (influxql.DataType, error) {
  switch name {
  case "str":
    return influxql.Boolean, nil
  case "strlen":
    return influxql.Integer, nil
  case "substr":
    return influxql.String, nil
  case "hello":
    return influxql.String, nil
  ......
}

Finally, we add the actual implementation of the hello function. The code is in engine/executor/string_functions.go

func (v StringValuer) Call(name string, args []interface{}) (interface{}, bool) {
  switch name {
  case "strlen":
    ......
  case "hello":
    // hello takes exactly one argument
    if len(args) != 1 {
      return nil, false
    }
    if arg0, ok := args[0].(string); ok {
      return HelloFunc(arg0), true
    }
    // non-string argument: the call is recognized but yields no value
    return nil, true
  default:
    return nil, false
  }
}

func HelloFunc(srcStr string) string {
  // Uncomment the lines below when testing performance optimization
  // var h []byte
  // h = make([]byte, 200*1024*1024)
  // fmt.Println(h)
  return "hello, " + srcStr
}
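The changes above can be pictured with a toy, self-contained model. This is only a sketch and not openGemini code: the lowercase names isStringFunction, callType and call are hypothetical stand-ins for the functions edited above, and plain strings stand in for influxql.DataType values. It shows the three things a new operator needs: a registered name, a declared return type, and a dispatch branch that validates its arguments.

```go
package main

import (
	"errors"
	"fmt"
)

// isStringFunction is a stand-in for the name check edited in
// compile.go and schema.go (hypothetical, simplified).
func isStringFunction(name string) bool {
	switch name {
	case "str", "strlen", "substr", "hello":
		return true
	}
	return false
}

// callType is a stand-in for StringFunctionTypeMapper.CallType.
// Strings replace influxql.DataType values in this sketch.
func callType(name string) (string, error) {
	switch name {
	case "strlen":
		return "integer", nil
	case "substr", "hello":
		return "string", nil
	}
	return "", errors.New("undefined function " + name + "()")
}

// call is a stand-in for StringValuer.Call, covering only "hello".
func call(name string, args []interface{}) (interface{}, bool) {
	switch name {
	case "hello":
		if len(args) != 1 {
			return nil, false // wrong number of arguments
		}
		if s, ok := args[0].(string); ok {
			return "hello, " + s, true
		}
		return nil, true // recognized call, but not a string argument
	default:
		return nil, false // unknown function name
	}
}

func main() {
	fmt.Println(isStringFunction("hello"))
	fmt.Println(callType("hello"))
	fmt.Println(call("hello", []interface{}{"Tom"}))
	fmt.Println(call("hello", []interface{}{42}))
	fmt.Println(call("hello", nil))
}
```

Note the two different failure modes in call: a wrong argument count reports false, while a non-string argument reports true with a nil value, mirroring the branch structure of the snippet above.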

Now rebuild openGemini and try out the newly added function. (https://docs.opengemini.org/zh/dev-guide/get_started/build_source_code.html)

Final Results:

> insert mst name="Tom"
> SELECT HELLO(name) from mst
+----------------------+------------------+
| time                 | hello            |
+----------------------+------------------+
| 2021-08-16T16:00:00Z | hello, Tom       |
+----------------------+------------------+

Unit Testing

We need to test that HelloFunc in engine/executor/string_functions.go behaves as expected: each result is the input prefixed with "hello, ".

In the engine/executor/string_function_test.go file, add the following test:

func TestStringFunctionHello(t *testing.T) {
  stringValuer := executor.StringValuer{}
  inputName := "hello"
  inputArgs := []interface{}{"Alice", "Bob", "Carry"}
  expects := []interface{}{"hello, Alice", "hello, Bob", "hello, Carry"}
  outputs := make([]interface{}, 0, len(expects))
  // call hello once per input and collect the successful results
  for _, arg := range inputArgs {
    if out, ok := stringValuer.Call(inputName, []interface{}{arg}); ok {
      outputs = append(outputs, out)
    }
  }
  // testify convention: expected value first, actual value second
  assert.Equal(t, expects, outputs)
}

Integration Testing

If you need to add integration tests (https://docs.opengemini.org/zh/dev-guide/get_started/test_tutorials.html), please add the following test function in the tests/server_test.go file:

func TestServer_Query_Aggregate_For_Hello_Functions(t *testing.T) {
  t.Parallel()
  s := OpenServer(NewParseConfig(testCfgPath))
  defer s.Close()

  if err := s.CreateDatabaseAndRetentionPolicy("db0", NewRetentionPolicySpec("rp0", 1, 0), true); err != nil {
    t.Fatal(err)
  }

  writes := []string{
    fmt.Sprintf(`mst,country=china,name=azhu age=12.3,height=70i,address="shenzhen",alive=TRUE 1629129600000000000`),
    fmt.Sprintf(`mst,country=american,name=alan age=20.5,height=80i,address="shanghai",alive=FALSE 1629129601000000000`),
    fmt.Sprintf(`mst,country=germany,name=alang age=3.4,height=90i,address="beijin",alive=TRUE 1629129602000000000`),
    fmt.Sprintf(`mst,country=japan,name=ahui age=30,height=121i,address="guangzhou",alive=FALSE 1629129603000000000`),
    fmt.Sprintf(`mst,country=canada,name=aqiu age=35,height=138i,address="chengdu",alive=TRUE 1629129604000000000`),
    fmt.Sprintf(`mst,country=china,name=agang age=48.8,height=149i,address="wuhan" 1629129605000000000`),
    fmt.Sprintf(`mst,country=american,name=agan age=52.7,height=153i,alive=TRUE 1629129606000000000`),
    fmt.Sprintf(`mst,country=germany,name=alin age=28.3,address="anhui",alive=FALSE 1629129607000000000`),
    fmt.Sprintf(`mst,country=japan,name=ali height=179i,address="xian",alive=TRUE 1629129608000000000`),
    fmt.Sprintf(`mst,country=canada age=60.8,height=180i,address="hangzhou",alive=FALSE 1629129609000000000`),
    fmt.Sprintf(`mst,name=ahuang age=102,height=191i,address="nanjin",alive=TRUE 1629129610000000000`),
    fmt.Sprintf(`mst,country=china,name=ayin age=123,height=203i,address="zhengzhou",alive=FALSE 1629129611000000000`),
  }
  test := NewTest("db0", "rp0")
  test.writes = Writes{
    &Write{data: strings.Join(writes, "\n")},
  }

  test.addQueries([]*Query{
    &Query{
      name:    "SELECT hello(address)",
      command: `SELECT hello("address") FROM db0.rp0.mst`,
      exp:     `{"results":[{"statement_id":0,"series":[{"name":"mst","columns":["time","hello"],"values":[["2021-08-16T16:00:00Z","hello, shenzhen"],["2021-08-16T16:00:01Z","hello, shanghai"],["2021-08-16T16:00:02Z","hello, beijin"],["2021-08-16T16:00:03Z","hello, guangzhou"],["2021-08-16T16:00:04Z","hello, chengdu"],["2021-08-16T16:00:05Z","hello, wuhan"],["2021-08-16T16:00:07Z","hello, anhui"],["2021-08-16T16:00:08Z","hello, xian"],["2021-08-16T16:00:09Z","hello, hangzhou"],["2021-08-16T16:00:10Z","hello, nanjin"],["2021-08-16T16:00:11Z","hello, zhengzhou"]]}]}]}`,
    },
  }...)

  for i, query := range test.queries {
    t.Run(query.name, func(t *testing.T) {
      if i == 0 {
        if err := test.init(s); err != nil {
          t.Fatalf("test init failed: %s", err)
        }
      }
      if query.skip {
        t.Skipf("SKIP:: %s", query.name)
      }
      if err := query.Execute(s); err != nil {
        t.Error(query.Error(err))
      } else if !query.success() {
        t.Error(query.failureMessage())
      }
    })
  }
}

Performance analysis (profiling) and optimization

(1) Performance analysis (profiling)

As with any database system, performance is always important. If you want to know where the performance bottlenecks are, you can use a powerful Go profiling tool called pprof.

Before starting the process, modify the configuration file to enable pprof on the SQL side. The pprof port is 6061.

[http]
pprof-enabled = true

Collect runtime analysis information via HTTP endpoint

When the openGemini server is running with pprof enabled, it serves CPU profiles over HTTP at http://127.0.0.1:6061/debug/pprof/profile . You can get the profile results by running the following commands:

curl -G "http://127.0.0.1:6061/debug/pprof/profile?seconds=45" > profile.profile
go tool pprof -http 127.0.0.1:4001 profile.profile

These commands capture 45 seconds of profiling information. Then open 127.0.0.1:4001 in your browser to see a web view of the CPU profiling results. This view contains a flame graph of the execution and more views that can help you diagnose performance bottlenecks. ( https://www.brendangregg.com/flamegraphs.html )

You can also collect additional runtime information through this endpoint. For example:

  • goroutine

curl -G "http://127.0.0.1:6061/debug/pprof/goroutine" > goroutine.profile
go tool pprof -http 127.0.0.1:4001 goroutine.profile

  • trace (call chain)

curl -G "http://127.0.0.1:6061/debug/pprof/trace?seconds=3" > trace.profile
go tool trace -http 127.0.0.1:4001 trace.profile

  • heap (memory)

curl -G "http://127.0.0.1:6061/debug/pprof/heap" > heap.profile
go tool pprof -http 127.0.0.1:4001 heap.profile

To learn how to analyze runtime information, see Go's diagnostic documentation. (https://golang.org/doc/diagnostics)
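For readers unfamiliar with where these /debug/pprof/ endpoints come from, the sketch below shows the standard-library mechanism that Go programs commonly use. It assumes nothing about how openGemini wires up its own HTTP service: importing net/http/pprof registers the handlers on http.DefaultServeMux, and the hypothetical helper pprofStatus calls that mux in-process (no network port is opened) to confirm that a named profile is served.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	_ "net/http/pprof" // side effect: registers /debug/pprof/* on http.DefaultServeMux
)

// pprofStatus issues an in-process GET for one named profile and
// returns the HTTP status code and the size of the response body.
func pprofStatus(profile string) (int, int) {
	req := httptest.NewRequest(http.MethodGet, "/debug/pprof/"+profile, nil)
	rec := httptest.NewRecorder()
	http.DefaultServeMux.ServeHTTP(rec, req)
	return rec.Code, rec.Body.Len()
}

func main() {
	// The same profile names used with curl above.
	for _, p := range []string{"goroutine", "heap"} {
		code, size := pprofStatus(p)
		fmt.Printf("%s: status=%d bytes=%d\n", p, code, size)
	}
}
```

In a real service the mux would be served with http.ListenAndServe on a dedicated address, which is what makes the curl commands above possible.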

 

Flame graph of memory allocations:

HelloFunc allocated 845 MB of memory.

The flame graph after optimization is as follows:

No notable memory consumption is visible any more; the total allocated is only about 8 MB.
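The same before/after difference can be checked without a flame graph. The sketch below is an illustration under stated assumptions, not the measurement used for the figures above: helloWasteful, hello and allocatedBytes are hypothetical names, and a 1 MiB buffer stands in for the 200 MB allocation that is commented out in HelloFunc. It compares the cumulative heap bytes reported by runtime.ReadMemStats around each variant.

```go
package main

import (
	"fmt"
	"runtime"
)

// sink keeps the buffer reachable so the allocation is not optimized away.
var sink []byte

// helloWasteful allocates a large scratch buffer on every call,
// like HelloFunc with the commented-out lines enabled (scaled down).
func helloWasteful(s string) string {
	sink = make([]byte, 1<<20)
	return "hello, " + s
}

// hello is the lean variant with no extra buffer.
func hello(s string) string {
	return "hello, " + s
}

// allocatedBytes reports how many heap bytes were allocated while f ran,
// using the cumulative TotalAlloc counter.
func allocatedBytes(f func()) uint64 {
	var before, after runtime.MemStats
	runtime.ReadMemStats(&before)
	f()
	runtime.ReadMemStats(&after)
	return after.TotalAlloc - before.TotalAlloc
}

func main() {
	wasteful := allocatedBytes(func() {
		for i := 0; i < 10; i++ {
			helloWasteful("Tom")
		}
	})
	lean := allocatedBytes(func() {
		for i := 0; i < 10; i++ {
			_ = hello("Tom")
		}
	})
	fmt.Println("wasteful bytes:", wasteful)
	fmt.Println("lean bytes:", lean)
}
```

TotalAlloc only ever grows, so the difference is not affected by garbage collection running in between; exact numbers vary by Go version and platform.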

(2) Performance optimization

Common performance optimization methods include:

1. Reduce GC pressure by allocating fewer small objects

2. Remove memory allocations for objects that are never used

3. Allocate enough space for cached content in one go and reuse it where appropriate

4. Use a goroutine pool for high-concurrency task processing

5. Reduce conversions between []byte and string; prefer processing strings as []byte
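Items 3 and 5 in the list above can be sketched as follows. This is a minimal illustration, not openGemini code: helloAppend, helloPooled and bufPool are hypothetical names. For a function as small as HelloFunc, plain concatenation is already cheap; the pattern pays off when many values are processed in a loop or when the result can stay a []byte.

```go
package main

import (
	"fmt"
	"sync"
)

// helloAppend appends "hello, " + s to dst and returns the extended slice.
// Working on []byte lets the caller reuse one buffer across many values.
func helloAppend(dst []byte, s string) []byte {
	dst = append(dst, "hello, "...)
	return append(dst, s...)
}

// bufPool hands out reusable buffers with some capacity already allocated.
// Pointers to slices are stored so that Put does not allocate.
var bufPool = sync.Pool{
	New: func() interface{} {
		b := make([]byte, 0, 64)
		return &b
	},
}

// helloPooled builds the greeting in a pooled buffer and converts it to a
// string once at the end (the only []byte-to-string conversion).
func helloPooled(s string) string {
	bp := bufPool.Get().(*[]byte)
	buf := helloAppend((*bp)[:0], s)
	out := string(buf)
	*bp = buf // keep any capacity growth for the next user
	bufPool.Put(bp)
	return out
}

func main() {
	// One buffer, allocated once and reused for every input value.
	buf := make([]byte, 0, 64)
	for _, name := range []string{"Alice", "Bob"} {
		buf = helloAppend(buf[:0], name)
		fmt.Printf("%s\n", buf)
	}
	fmt.Println(helloPooled("Tom"))
}
```

The slice returned by helloAppend aliases the reused buffer, so a caller that needs to keep a result past the next iteration has to copy it first.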

 

Summary

We have successfully added the hello function to the openGemini kernel and completed unit testing, integration testing, performance analysis, and optimization. Together these steps make up a complete development process, and most enterprise-level projects follow the same rhythm. I hope this article can serve as an introduction to developing database projects and help developers implement more complex functions in the future.


openGemini official website: http://www.openGemini.org

openGemini open source address: https://github.com/openGemini

Follow the openGemini public account. We sincerely invite you to join the openGemini community to build, govern, and share the future together!


Origin my.oschina.net/u/3234792/blog/10117580