Analysis of a bug introduced by adding new sp_instr to GreatSQL stored procedures
1. Problem discovery
During development, a stored procedure (sp) feature required new sp_instr classes. After several sp_instr were added, executing the new sp caused a core dump.
Note: GreatSQL 8.0.32-25 is used this time.
1. Register 10 new sp_instr in init_sp_psi_keys() in sp_head.cc:
void init_sp_psi_keys() {
mysql_statement_register(category, &sp_instr_stmt1::psi_info, 1);
mysql_statement_register(category, &sp_instr_stmt2::psi_info, 1);
mysql_statement_register(category, &sp_instr_stmt3::psi_info, 1);
......
mysql_statement_register(category, &sp_instr_stmt10::psi_info, 1);
}
2. Add the implementation of the new sp_instr_stmt classes to sp_instr.cc, and add the corresponding new syntax to sql_yacc.yy and sql_lex.cc.
3. In sp_rcontext.h, add several new member variables to class sp_rcontext. The following code is only an illustration and has no practical use.
Field *m_return_value_fld_tmp{m_return_value_fld};
Field *m_return_value_fld_tmp1{m_return_value_fld};
Field *m_return_value_fld_tmp2{m_return_value_fld};
4. Create a new sp that uses the new sp_instr_stmt, then call it. The call crashes because the elements of a list have been cleared. The relevant stack is shown below. Because some of the code is confidential, only the frames from the open source part are included.
#0 0x0000555558f3f3d9 in base_list_iterator::next_fast (this=0x7fffe01e9de0)
at /sql/sql_list.h:371
#1 0x0000555558fc59b7 in List_iterator_fast<Create_field>::operator++ (this=0x7fffe01e9de0)
at /sql/sql_list.h:605
#2 0x0000555559753ea2 in create_tmp_table_from_fields (thd=0x7fff20001050, field_list=...,
is_virtual=false, select_options=0, alias=0x0)
at /sql/sql_tmp_table.cc:2131
#3 0x0000555559084a09 in Item_xx::val_str (this=0x7fff20b673c8)
at /sql/item_func.cc:10796
#4 0x0000555558fa408b in Item::save_in_field_inner (this=0x7fff20b673c8, field=0x7fff20b9b1a8,
no_conversions=false) at /sql/item.cc:8202
#5 0x0000555558fa3c43 in Item::save_in_field (this=0x7fff20b673c8, field=0x7fff20b9b1a8,
no_conversions=false) at /sql/item.cc:8144
#6 0x0000555559400322 in sp_eval_expr (thd=0x7fff20001050, result_field=0x7fff20b9b1a8,
expr_item_ptr=0x7fff20b67620) at /sql/sp.cc:3613
#7 0x000055555943b1d1 in sp_rcontext::set_variable (this=0x7fff20b85d80, thd=0x7fff20001050,
field=0x7fff20b9b1a8, value=0x7fff20b67620)
at /sql/sp_rcontext.cc:1023
#8 0x0000555558fc3a8e in sp_rcontext::set_variable (this=0x7fff20b85d80, thd=0x7fff20001050,
var_idx=1, value=0x7fff20b67620)
at /sql/sp_rcontext.h:176
Printing the state at the crash site shows that the contents of the list have been cleared.
(gdb) p tmp
$1 = (list_node *) 0x0
2. Problem investigation process
1. A careful review of the code shows no problem with its logic. The list elements are assigned successfully, yet at run time the list turns out to have been cleared. This points to memory corruption elsewhere, such as an out-of-bounds write, that overwrites or frees the list's elements. With a different sp body, the crash occurs only some of the time. The trigger is unclear, and it is not obvious which line of code corrupts the memory.
2. So I went back to the place where I first added code. I suspected the 10 new sp_instr_stmt: if a related array or memory region had not been enlarged to match, a memory overflow was very likely.
3. With the suspect area located, the next step is to examine the code involved in adding an sp_instr.
The code that registers an sp_instr is as follows:
mysql_statement_register(category, &sp_instr_stmt1::psi_info, 1);
Following the implementation of mysql_statement_register further down, we find that it does indeed use statement_class_max:
PFS_statement_key register_statement_class(const char *name, uint name_length,
PSI_statement_info *info) {
/* See comments in register_mutex_class */
uint32 index;
PFS_statement_class *entry;
REGISTER_CLASS_BODY_PART(index, statement_class_array, statement_class_max,
name, name_length)
Next, look at where statement_class_max is assigned:
int init_statement_class(uint statement_class_sizing) {
int result = 0;
statement_class_dirty_count = statement_class_allocated_count = 0;
statement_class_max = statement_class_sizing;
Searching the code for the parameter configuration related to statement_class_sizing turns up a macro, SP_PSI_STATEMENT_INFO_COUNT, whose value is tied to the number of sp_instr.
static Sys_var_ulong Sys_pfs_max_statement_classes(
"performance_schema_max_statement_classes",
"Maximum number of statement instruments.",
READ_ONLY GLOBAL_VAR(pfs_param.m_statement_class_sizing),
CMD_LINE(REQUIRED_ARG), VALID_RANGE(0, 256),
DEFAULT((ulong)SQLCOM_END + (ulong)COM_END + 5 +
SP_PSI_STATEMENT_INFO_COUNT + CLONE_PSI_STATEMENT_COUNT),
BLOCK_SIZE(1), PFS_TRAILING_PROPERTIES);
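A further full-text search shows that the macro is defined in sp_head.h with a value of 16. Counting the existing sp_instr gives exactly 16. This is the root cause: I added 10 sp_instr but did not increase the macro to match, which led to a memory overflow.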
#define SP_PSI_STATEMENT_INFO_COUNT 16
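The counting mismatch can be illustrated with a small model. This is only a sketch under stated assumptions: StatementRegistry and register_all are hypothetical names, not the performance schema code, and this model rejects extra registrations instead of writing past the table, so it shows the sizing problem rather than the crash itself.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a class table whose capacity is fixed at startup.
// It is NOT the real performance schema implementation.
class StatementRegistry {
 public:
  explicit StatementRegistry(std::size_t max_classes) : max_(max_classes) {
    names_.reserve(max_classes);
  }

  // Returns a 1-based key on success, or 0 when the table is already full.
  std::size_t register_class(const std::string &name) {
    if (names_.size() >= max_) {
      ++lost_;  // count registrations that did not fit
      return 0;
    }
    names_.push_back(name);
    return names_.size();
  }

  std::size_t registered() const { return names_.size(); }
  std::size_t lost() const { return lost_; }

 private:
  std::size_t max_;
  std::size_t lost_ = 0;
  std::vector<std::string> names_;
};

// Registers `count` instruments named sp_instr_1 .. sp_instr_<count> and
// returns how many of them did not fit into the table.
std::size_t register_all(StatementRegistry &reg, std::size_t count) {
  for (std::size_t i = 1; i <= count; ++i) {
    reg.register_class("sp_instr_" + std::to_string(i));
  }
  return reg.lost();
}
```

With a capacity of 16 and 26 registrations, 10 instruments have no slot in this model; with a capacity of 26, all of them fit.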
3. Problem solutions
Based on the analysis above, make the following changes. After recompiling, the problem is solved.
In sp_head.h, change the SP_PSI_STATEMENT_INFO_COUNT macro definition:
#define SP_PSI_STATEMENT_INFO_COUNT 26
Because the default value of Sys_pfs_max_statement_classes has increased, the valid range of the setting must grow with it, so enlarge the range accordingly.
static Sys_var_ulong Sys_pfs_max_statement_classes(
"performance_schema_max_statement_classes",
"Maximum number of statement instruments.",
READ_ONLY GLOBAL_VAR(pfs_param.m_statement_class_sizing),
CMD_LINE(REQUIRED_ARG), VALID_RANGE(0, 256 * 2),
DEFAULT((ulong)SQLCOM_END + (ulong)COM_END + 5 +
SP_PSI_STATEMENT_INFO_COUNT + CLONE_PSI_STATEMENT_COUNT),
BLOCK_SIZE(1), PFS_TRAILING_PROPERTIES);
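The relationship between the DEFAULT expression and VALID_RANGE can be checked with a little arithmetic. The sketch below assumes two hypothetical helpers, default_statement_classes and fits_range. The input values in the usage note are made up for illustration; the real ones come from SQLCOM_END, COM_END and CLONE_PSI_STATEMENT_COUNT in the server source.

```cpp
#include <cstddef>

// Mirrors the shape of the DEFAULT(...) expression:
// SQLCOM_END + COM_END + 5 + SP_PSI_STATEMENT_INFO_COUNT + CLONE_PSI_STATEMENT_COUNT.
// All arguments are supplied by the caller; none are the real server values.
constexpr unsigned long default_statement_classes(unsigned long sqlcom_end,
                                                  unsigned long com_end,
                                                  unsigned long sp_count,
                                                  unsigned long clone_count) {
  return sqlcom_end + com_end + 5 + sp_count + clone_count;
}

// True when the computed default lies inside VALID_RANGE(0, range_max).
constexpr bool fits_range(unsigned long value, unsigned long range_max) {
  return value <= range_max;
}
```

For example, with the made-up inputs 200, 35, 26 and 5, the default is 271, which does not fit a range of 256 but does fit 256 * 2.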
4. Problem summary
When adding a new sp_instr to GreatSQL's sp, increase the corresponding parameter values as well to prevent memory overflow. Before making similar changes to other features, first check carefully whether any related parameter configurations or macro definitions are involved. Otherwise you will run into hard-to-explain problems that take a long time to investigate.
The problem found here is a bug introduced by newly added functionality. Watch for similar problems in everyday development and use, because they are easy to miss.
The above problems also exist in MySQL/Percona.
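One way to catch this kind of mismatch at compile time is a static_assert that ties the macro to a single list of instruments. This is a minimal sketch, assuming a hypothetical central list; StatementInfo, kInstruments, DEMO_PSI_STATEMENT_INFO_COUNT and instrument_count are illustrative names and do not exist in the GreatSQL source.

```cpp
#include <cstddef>

// Hypothetical stand-in for PSI_statement_info; the real struct has more fields.
struct StatementInfo {
  const char *name;
};

// Hypothetical stand-in for SP_PSI_STATEMENT_INFO_COUNT, kept small for the demo.
#define DEMO_PSI_STATEMENT_INFO_COUNT 3

// Hypothetical central list of every instrument that gets registered.
constexpr StatementInfo kInstruments[] = {
    {"stmt"},
    {"set"},
    {"jump"},
};

constexpr std::size_t kInstrumentCount =
    sizeof(kInstruments) / sizeof(kInstruments[0]);

// Compilation fails if an instrument is added without updating the macro.
static_assert(kInstrumentCount == DEMO_PSI_STATEMENT_INFO_COUNT,
              "update the count macro when adding an instrument");

constexpr std::size_t instrument_count() { return kInstrumentCount; }
```

Adding a fourth entry to kInstruments without changing the macro turns the mismatch into a build error instead of a run-time crash.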
Enjoy GreatSQL :)
About GreatSQL
GreatSQL is an independent open source database developed in China and suited to financial-grade applications. Its core features include high performance, high reliability, ease of use, and strong security. It can serve as a replacement for MySQL or Percona Server in online production environments, is completely free, and is compatible with MySQL and Percona Server.