Write your own database: constructing and executing the query tree for a select statement

First we need to patch the existing code. Initializing the SelectScan structure requires an object that implements the UpdateScan interface, but in many cases all we have is a Scan object, so a conversion is needed: when a Scan object is passed in while initializing SelectScan, we wrap it in an object that satisfies UpdateScan. Add a file named updatescan_wrapper.go to the query directory with the following content:

package query

import (
	"record_manager"
)

type UpdateScanWrapper struct {
	scan Scan
}

func NewUpdateScanWrapper(s Scan) *UpdateScanWrapper {
	return &UpdateScanWrapper{
		scan: s,
	}
}

func (u *UpdateScanWrapper) GetScan() Scan {
	return u.scan
}

func (u *UpdateScanWrapper) SetInt(fldName string, val int) {
	// DO NOTHING
}

func (u *UpdateScanWrapper) SetString(fldName string, val string) {
	// DO NOTHING
}

func (u *UpdateScanWrapper) SetVal(fldName string, val *Constant) {
	// DO NOTHING
}

func (u *UpdateScanWrapper) Insert() {
	// DO NOTHING
}

func (u *UpdateScanWrapper) Delete() {
	// DO NOTHING
}

func (u *UpdateScanWrapper) GetRid() *record_manager.RID {
	return nil
}

func (u *UpdateScanWrapper) MoveToRid(rid *record_manager.RID) {
	// DO NOTHING
}

The logic of the above code is simple. Callers that need the Scan interface reach the wrapped Scan object through GetScan, while every method that belongs to UpdateScan does nothing. After completing the above code, we make some modifications in select_plan.go:

func (s *SelectPlan) Open() interface{} {
	scan := s.p.Open()
	updateScan, ok := scan.(query.UpdateScan)
	if !ok {
		updateScanWrapper := query.NewUpdateScanWrapper(scan.(query.Scan))
		return query.NewSelectionScan(updateScanWrapper, s.pred)
	}
	return query.NewSelectionScan(updateScan, s.pred)
}

When the code above creates a SelectScan object, it first checks whether the object returned by s.p.Open can be type-asserted to UpdateScan. If not, s.p.Open returned a plain Scan object, so we wrap it with the UpdateScanWrapper from the previous listing and use the wrapper to create the SelectScan object. With these modifications done, we can get to the main topic.
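
The fallback in Open rests on Go's comma-ok type assertion. The following is a minimal, self-contained sketch of that pattern, not the project's code: the Scan and UpdateScan interfaces are cut down to one method each, and readOnlyScan, tableScan, wrapper, and asUpdateScan are hypothetical names introduced only for illustration. Unlike UpdateScanWrapper, the sketch embeds the Scan so that read calls are forwarded automatically.

```go
package main

import "fmt"

// Scan is a hypothetical read-only interface, reduced to one method for this sketch.
type Scan interface {
	Next() bool
}

// UpdateScan is a hypothetical interface that adds write access on top of Scan.
type UpdateScan interface {
	Scan
	Insert()
}

// readOnlyScan implements only Scan.
type readOnlyScan struct{}

func (readOnlyScan) Next() bool { return false }

// tableScan implements the full UpdateScan interface.
type tableScan struct{}

func (tableScan) Next() bool { return false }
func (tableScan) Insert()    {}

// wrapper adapts a plain Scan to UpdateScan; its write method is a no-op.
type wrapper struct{ Scan }

func (wrapper) Insert() {}

// asUpdateScan returns v itself when it already satisfies UpdateScan and
// wraps it otherwise. The second result reports whether wrapping happened.
// Like the Open method above, it panics if v is not even a Scan.
func asUpdateScan(v interface{}) (UpdateScan, bool) {
	if us, ok := v.(UpdateScan); ok {
		return us, false
	}
	return wrapper{v.(Scan)}, true
}

func main() {
	_, wrapped := asUpdateScan(tableScan{})
	fmt.Println("tableScan wrapped:", wrapped) // false
	_, wrapped = asUpdateScan(readOnlyScan{})
	fmt.Println("readOnlyScan wrapped:", wrapped) // true
}
```

Trying the richer interface first and wrapping only on failure means scans that can already be updated keep that ability when a selection is placed on top of them.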

Previously, once we had implemented the SQL parser, parsing a query statement produced a QueryData object. In this section we look at how to build a suitable query plan (Plan) from that object. We proceed from simple to complex: first we construct the plan directly from the information in QueryData, without considering whether the resulting query tree is well optimized. Later we will gradually improve the construction algorithm until it can build a sufficiently optimized query tree.

Let's look at a specific example first. Suppose we now have two tables STUDENT and EXAM. The first table contains two fields: student id and name:

id name
1 Tom
2 Jim
3 John

The second table contains the student ID, the subject name, and the exam grade:

student_id exam grad
1 math A
1 algorithm B
2 writing C
2 physics C
3 chemical B
3 english C

Now we use a SQL statement to query all students who got an A in an exam:

select name from STUDENT, EXAM where id = student_id and grad='A'

When the SQL parser reads the above statement, it creates a QueryData structure that contains the names of the two tables, STUDENT and EXAM. Neither table is a view, so in the CreatePlan code shown later in this section the check viewDef != nil fails and execution enters the else branch: the code creates a TablePlan object for each table and then performs a Product on the two tables. The Product operation combines each row of the left table with every row of the right table to form a row of the new table. The rows of the Product of STUDENT and EXAM that involve the first student, Tom, are as follows (the rows for Jim and John follow the same pattern):

id name student_id exam grad
1 Tom 1 math A
1 Tom 1 algorithm B
1 Tom 2 writing C
1 Tom 2 physics C
1 Tom 3 chemical B
1 Tom 3 english C
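
To make the Product concrete, here is a small stand-alone sketch that builds it in memory with nested loops. The student, exam, and row types and the product function are assumptions made for illustration; the database itself produces these rows one at a time through scans rather than materializing a slice.

```go
package main

import "fmt"

type student struct {
	id   int
	name string
}

type exam struct {
	studentID int
	subject   string
	grad      string
}

// row is one row of the Product: a student row joined with an exam row.
type row struct {
	student
	exam
}

// product pairs every student row with every exam row.
func product(students []student, exams []exam) []row {
	rows := make([]row, 0, len(students)*len(exams))
	for _, s := range students {
		for _, e := range exams {
			rows = append(rows, row{s, e})
		}
	}
	return rows
}

func main() {
	students := []student{{1, "Tom"}, {2, "Jim"}, {3, "John"}}
	exams := []exam{
		{1, "math", "A"}, {1, "algorithm", "B"},
		{2, "writing", "C"}, {2, "physics", "C"},
		{3, "chemical", "B"}, {3, "english", "C"},
	}
	rows := product(students, exams)
	fmt.Println(len(rows)) // 18: 3 students x 6 exams
	// The first six rows pair Tom with each exam row.
	for _, r := range rows[:6] {
		fmt.Println(r.id, r.name, r.studentID, r.subject, r.grad)
	}
}
```

The row count is the product of the two table sizes, which is why the order and size of the tables matter so much for query cost.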

Next, the code creates a SelectScan object over the Product table. It fetches each row and checks whether the row's id field equals its student_id field. If so, it then checks the grad field, and if that field is 'A', the row's name field is output.
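
The row-by-row check described above can be sketched as a chain of pull-based scans. This is a simplified illustration under stated assumptions, not the project's implementation: the Scan interface here has only four methods, all values are strings, and sliceScan, productScan, and selectScan are hypothetical in-memory stand-ins for the real table, product, and selection scans.

```go
package main

import "fmt"

// Scan is a hypothetical, minimal pull-based row iterator.
type Scan interface {
	BeforeFirst()
	Next() bool
	Has(field string) bool
	Get(field string) string
}

// sliceScan serves rows from memory and stands in for a table scan.
type sliceScan struct {
	rows []map[string]string
	pos  int
}

func (s *sliceScan) BeforeFirst() { s.pos = -1 }
func (s *sliceScan) Next() bool   { s.pos++; return s.pos < len(s.rows) }
func (s *sliceScan) Has(f string) bool {
	if len(s.rows) == 0 {
		return false
	}
	_, ok := s.rows[0][f]
	return ok
}
func (s *sliceScan) Get(f string) string { return s.rows[s.pos][f] }

// productScan pairs the current left row with each right row in turn.
type productScan struct {
	left, right Scan
	hasLeft     bool
}

func (p *productScan) BeforeFirst() {
	p.left.BeforeFirst()
	p.hasLeft = p.left.Next()
	p.right.BeforeFirst()
}

func (p *productScan) Next() bool {
	for p.hasLeft {
		if p.right.Next() {
			return true
		}
		// Right side exhausted: rewind it and advance the left side.
		p.right.BeforeFirst()
		p.hasLeft = p.left.Next()
	}
	return false
}
func (p *productScan) Has(f string) bool { return p.left.Has(f) || p.right.Has(f) }
func (p *productScan) Get(f string) string {
	if p.left.Has(f) {
		return p.left.Get(f)
	}
	return p.right.Get(f)
}

// selectScan skips rows of its child that fail the predicate.
type selectScan struct {
	child Scan
	pred  func(Scan) bool
}

func (s *selectScan) BeforeFirst() { s.child.BeforeFirst() }
func (s *selectScan) Next() bool {
	for s.child.Next() {
		if s.pred(s.child) {
			return true
		}
	}
	return false
}
func (s *selectScan) Has(f string) bool   { return s.child.Has(f) }
func (s *selectScan) Get(f string) string { return s.child.Get(f) }

func main() {
	students := &sliceScan{rows: []map[string]string{
		{"id": "1", "name": "Tom"}, {"id": "2", "name": "Jim"}, {"id": "3", "name": "John"},
	}}
	exams := &sliceScan{rows: []map[string]string{
		{"student_id": "1", "exam": "math", "grad": "A"},
		{"student_id": "1", "exam": "algorithm", "grad": "B"},
		{"student_id": "2", "exam": "writing", "grad": "C"},
		{"student_id": "2", "exam": "physics", "grad": "C"},
		{"student_id": "3", "exam": "chemical", "grad": "B"},
		{"student_id": "3", "exam": "english", "grad": "C"},
	}}
	top := &selectScan{
		child: &productScan{left: students, right: exams},
		pred: func(s Scan) bool {
			return s.Get("id") == s.Get("student_id") && s.Get("grad") == "A"
		},
	}
	top.BeforeFirst()
	for top.Next() {
		fmt.Printf("name: %s\n", top.Get("name")) // prints "name: Tom"
	}
}
```

Each call to Next on the outer scan pulls rows from the scan beneath it until one passes the predicate, so the 18 product rows are never held in memory at once.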

Let's see how to use code to implement the process described above. First, we define the interface and add the following content to the interface.go file in the Planner directory:

type QueryPlanner interface {
	// The signature must match BasicQueryPlanner.CreatePlan below
	CreatePlan(data *parser.QueryData, tx *tx.Transaction) Plan
}

Then create the file query_planner.go in the Planner directory and enter the following code. Its implementation logic is explained after the listing:

package planner

import (
	"metadata_management"
	"parser"
	"tx"
)

type BasicQueryPlanner struct {
	mdm *metadata_management.MetaDataManager
}

func CreateBasicQueryPlanner(mdm *metadata_management.MetaDataManager) QueryPlanner {
	return &BasicQueryPlanner{
		mdm: mdm,
	}
}

func (b *BasicQueryPlanner) CreatePlan(data *parser.QueryData, tx *tx.Transaction) Plan {
	// 1. Create a plan for each table named in the QueryData object
	plans := make([]Plan, 0)
	tables := data.Tables()
	for _, tblname := range tables {
		// Get the SQL code of the view that corresponds to this table, if any
		viewDef := b.mdm.GetViewDef(tblname, tx)
		if viewDef != nil {
			// The table is a view: parse the view's definition
			// (the variable is named viewParser so it does not shadow the parser package)
			viewParser := parser.NewSQLParser(viewDef)
			viewData := viewParser.Query()
			// Recursively create the plan for the view
			plans = append(plans, b.CreatePlan(viewData, tx))
		} else {
			plans = append(plans, NewTablePlan(tx, tblname, b.mdm))
		}
	}

	// Perform the Product operation over all tables. The order of the tables has a major
	// impact on query efficiency, but here we ignore it and simply apply Product to the
	// tables in the order given. We will optimize this later.
	p := plans[0]
	plans = plans[1:]

	for _, nextPlan := range plans {
		p = NewProductPlan(p, nextPlan)
	}

	p = NewSelectPlan(p, data.Pred())

	return NewProjectPlan(p, data.Fields())
}

In the above code, QueryData is the object the parser produces after parsing the select statement. Its Tables method returns the tables the select statement queries, so CreatePlan first obtains those tables from the QueryData object, creates a plan for each one, and then traverses the plans, chaining them together with NewProductPlan. Next it creates a SelectPlan on top of the Product, which amounts to using the conditions of the where clause to pick the qualifying rows from the Product result. Finally it creates a ProjectPlan that keeps only the required fields of the selected rows.
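
The shape of the tree that CreatePlan builds can be seen in a stand-alone sketch. Everything here is a simplified assumption for illustration: the Plan interface is reduced to a single hypothetical Describe method, predicates and fields are plain strings, and buildPlan only mirrors the order of the steps (left-deep products, then select, then project) without views or transactions.

```go
package main

import (
	"fmt"
	"strings"
)

// Plan is a hypothetical, stripped-down stand-in for the planner's Plan interface.
type Plan interface {
	Describe() string
}

type tablePlan struct{ name string }

type productPlan struct{ left, right Plan }

type selectPlan struct {
	child Plan
	pred  string
}

type projectPlan struct {
	child  Plan
	fields []string
}

func (t tablePlan) Describe() string { return "Table(" + t.name + ")" }

func (p productPlan) Describe() string {
	return "Product(" + p.left.Describe() + ", " + p.right.Describe() + ")"
}

func (s selectPlan) Describe() string {
	return "Select[" + s.pred + "](" + s.child.Describe() + ")"
}

func (p projectPlan) Describe() string {
	return "Project[" + strings.Join(p.fields, ",") + "](" + p.child.Describe() + ")"
}

// buildPlan mirrors the order used by CreatePlan: a left-deep chain of
// products over the tables in the given order, then select, then project.
// It assumes tables is non-empty.
func buildPlan(tables []string, pred string, fields []string) Plan {
	var p Plan = tablePlan{tables[0]}
	for _, name := range tables[1:] {
		p = productPlan{p, tablePlan{name}}
	}
	return projectPlan{selectPlan{p, pred}, fields}
}

func main() {
	plan := buildPlan([]string{"student", "exam"}, "id=stuid and grad='A'", []string{"name"})
	fmt.Println(plan.Describe())
	// Project[name](Select[id=stuid and grad='A'](Product(Table(student), Table(exam))))
}
```

Reading the printed expression from the inside out gives the execution order: scan the tables, form their Product, filter the rows, then keep the requested fields.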

Let's test the effect of the above code. First, in main.go, we construct the two tables student and exam. The code is as follows:

func createStudentTable() (*tx.Transation, *metadata_manager.MetaDataManager) {
	file_manager, _ := fm.NewFileManager("student", 2048)
	log_manager, _ := lm.NewLogManager(file_manager, "logfile.log")
	buffer_manager := bmg.NewBufferManager(file_manager, log_manager, 3)
	tx := tx.NewTransation(file_manager, log_manager, buffer_manager)
	sch := record_manager.NewSchema()
	mdm := metadata_manager.NewMetaDataManager(false, tx)

	sch.AddStringField("name", 16)
	sch.AddIntField("id")
	layout := record_manager.NewLayoutWithSchema(sch)

	ts := query.NewTableScan(tx, "student", layout)
	ts.BeforeFirst()
	for i := 1; i <= 3; i++ {
		ts.Insert() // move to an available slot
		ts.SetInt("id", i)
		if i == 1 {
			ts.SetString("name", "Tom")
		}
		if i == 2 {
			ts.SetString("name", "Jim")
		}
		if i == 3 {
			ts.SetString("name", "John")
		}
	}

	mdm.CreateTable("student", sch, tx)

	exam_sch := record_manager.NewSchema()

	exam_sch.AddIntField("stuid")
	exam_sch.AddStringField("exam", 16)
	exam_sch.AddStringField("grad", 16)
	exam_layout := record_manager.NewLayoutWithSchema(exam_sch)

	ts = query.NewTableScan(tx, "exam", exam_layout)
	ts.BeforeFirst()

	ts.Insert() // move to an available slot
	ts.SetInt("stuid", 1)
	ts.SetString("exam", "math")
	ts.SetString("grad", "A")

	ts.Insert() // move to an available slot
	ts.SetInt("stuid", 1)
	ts.SetString("exam", "algorithm")
	ts.SetString("grad", "B")

	ts.Insert() // move to an available slot
	ts.SetInt("stuid", 2)
	ts.SetString("exam", "writing")
	ts.SetString("grad", "C")

	ts.Insert() // move to an available slot
	ts.SetInt("stuid", 2)
	ts.SetString("exam", "physics")
	ts.SetString("grad", "C")

	ts.Insert() // move to an available slot
	ts.SetInt("stuid", 3)
	ts.SetString("exam", "chemical")
	ts.SetString("grad", "B")

	ts.Insert() // move to an available slot
	ts.SetInt("stuid", 3)
	ts.SetString("exam", "english")
	ts.SetString("grad", "C")

	mdm.CreateTable("exam", exam_sch, tx)

	return tx, mdm
}

Then we use the parser to parse the select statement into a QueryData object, use BasicQueryPlanner to create the execution tree and the corresponding Scan interface object, and finally call the Next method of the Scan object to fetch the requested field. The code is as follows:

func main() {
	// Build the student and exam tables
	tx, mdm := createStudentTable()
	queryStr := "select name from student, exam where id = stuid and grad=\"A\""
	p := parser.NewSQLParser(queryStr)
	queryData := p.Query()
	test_planner := planner.CreateBasicQueryPlanner(mdm)
	test_plan := test_planner.CreatePlan(queryData, tx)
	test_interface := (test_plan.Open())
	test_scan, _ := test_interface.(query.Scan)
	for test_scan.Next() {
		fmt.Printf("name: %s\n", test_scan.GetString("name"))
	}
}

The results obtained after running the above code are as follows:
[Image: screenshot of the program output]
From the output, we can see that the code successfully executed the SQL statement and returned the required field. Interested readers can search for "coding Disney" on Bilibili to watch my video walkthrough of the debugging process, which should give a better understanding of the design of the code. Code download:
Link: https://pan.baidu.com/s/16ftSp46cU5NLisScq-ftZg Extraction code: js99

Origin blog.csdn.net/tyler_download/article/details/134971351