What is the schema in OceanBase?

Li Boyang

R&D engineer of OceanBase Technology Department.

In the OceanBase open source community, we often see questions similar to "What is schema?":

picture

Many people mistakenly think that in OceanBase, schema is just a synonym for database. Here I share a short discussion of what schema actually means.

Let's start with the conclusion. Schema means different things in MySQL (OceanBase's MySQL mode), in Oracle (OceanBase's Oracle mode), and in OceanBase's metadata management module.

The concept of schema in OceanBase MySQL mode

Schema is a synonym for database. In SQL, you can use the SCHEMA keyword in place of the DATABASE keyword, for example CREATE SCHEMA instead of CREATE DATABASE.

The concept of schema in OceanBase Oracle mode

In OceanBase's Oracle mode, a schema is the collection of database objects owned by a user, and it is used for permission management and namespace isolation. I personally think of it as a "user space". Schema objects are database objects that live in a particular schema, such as tables, views, and indexes; non-schema objects are database objects that do not belong to any schema, such as users, roles, and tablespaces.

A user gets a default schema when it is created, and the schema name is the same as the user name. With the right permissions, a user can also access and use other schemas. When a statement refers to an object without specifying its schema, the system automatically qualifies the object with the default schema name.

If the current user has permission to access or modify objects in another schema, the user can switch to that schema with alter session set current_schema = other_schema_name; and then operate there.
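The name-resolution rule described above can be sketched in a few lines of Python. This is only an illustration of the rule, not OceanBase's resolver: the function name qualify_object_name and the sample schema names are hypothetical, and quoted identifiers that contain dots are ignored for simplicity.

```python
def qualify_object_name(name: str, current_schema: str) -> str:
    """Return a schema-qualified name for an object reference.

    A name that already has the form schema.object is kept as-is;
    an unqualified name is prefixed with the session's current schema.
    """
    if "." in name:
        return name
    return f"{current_schema}.{name}"


# A user created as SCOTT starts with SCOTT as the default schema.
print(qualify_object_name("T1", "SCOTT"))     # SCOTT.T1
print(qualify_object_name("HR.T1", "SCOTT"))  # HR.T1
# After "alter session set current_schema = HR", unqualified names resolve to HR.
print(qualify_object_name("T1", "HR"))        # HR.T1
```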

The concept of schema in the OceanBase metadata management module

picture

In the OceanBase metadata management module, schema refers to all database object metadata that must be synchronized within the cluster, including but not limited to the metadata of tables, databases, and users. In addition, OceanBase's schema is multi-versioned, and the schema information held in memory is eventually consistent across the cluster.

What's in the schema?

Once it is clear what schema is, people in the community go on to ask: if schema is metadata, what does that metadata contain?

picture

There is actually a small omission in the answer above. The metadata of a database object is changed only by DDL. The "estimated number of rows" is a statistic that is changed by DML, not by DDL, so it is not table metadata and is not recorded in the table schema.

For details about what the metadata includes, refer to the code under the src/share/schema path. For example, to see what table metadata is recorded in the table schema, look at the class members of ObTableSchema and its parent class in ob_table_schema.h.

picture

DDL execution process

The above answers what schema is and what it contains. Because schema is modified only through DDL, the DDL execution process is briefly described here to help with troubleshooting when you run into DDL-related problems.

DDL is not processed by the optimizer. Instead, it is sent as a command to RootService (hereinafter referred to as rs) and processed there. The execution process in OceanBase is as follows:

picture

Take the most common table creation statement as an example:

For a create table command, the observer (obs) that receives the statement resolves it, stores the table creation information in create_table_arg, and sends create_table_arg to rs over RPC. rs then performs the following steps:

  • Check whether the schema version that obs used during resolution is the latest. This is an optimistic-locking approach: if the version is not the latest, the whole DDL is retried.

  • Obtain a new table id, monotonically increasing within the tenant, from __all_sys_stat.

  • Insert the information provided in create_table_arg into internal tables such as __all_table_history for persistence.

  • Record the DDL change log in __all_ddl_operation (for scenarios such as incremental refresh).

  • Publish the schema (notify each node to refresh the schema in memory).
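The five steps above can be sketched as a small in-memory model. This is a teaching sketch under stated assumptions, not the RootService implementation: the class RootServiceSketch, the exception StaleSchemaVersion, the starting table id, and the use of small integers as schema versions are all hypothetical, and plain Python lists stand in for the internal tables.

```python
from dataclasses import dataclass, field


class StaleSchemaVersion(Exception):
    """The statement was resolved against a schema version that is no longer the latest."""


@dataclass
class RootServiceSketch:
    schema_version: int = 1
    next_table_id: int = 500001  # hypothetical starting value
    all_table_history: list = field(default_factory=list)  # stands in for __all_table_history
    all_ddl_operation: list = field(default_factory=list)  # stands in for __all_ddl_operation
    published_versions: list = field(default_factory=list)

    def create_table(self, create_table_arg: dict, resolved_version: int) -> int:
        # Step 1: optimistic check; on failure the caller retries the whole DDL.
        if resolved_version != self.schema_version:
            raise StaleSchemaVersion(
                f"resolved at {resolved_version}, latest is {self.schema_version}")
        # Step 2: allocate a monotonically increasing table id.
        table_id = self.next_table_id
        self.next_table_id += 1
        # Step 3: persist the table metadata under a new schema version.
        new_version = self.schema_version + 1
        self.all_table_history.append(
            {"table_id": table_id, "schema_version": new_version, **create_table_arg})
        # Step 4: record the DDL change log used for incremental refresh.
        self.all_ddl_operation.append(
            {"schema_version": new_version, "op": "CREATE_TABLE", "table_id": table_id})
        # Step 5: publish the new schema version.
        self.schema_version = new_version
        self.published_versions.append(new_version)
        return table_id


rs = RootServiceSketch()
table_id = rs.create_table({"table_name": "t1"}, resolved_version=1)
print(table_id, rs.schema_version)  # 500001 2
```

A second create_table call that still passes resolved_version=1 raises StaleSchemaVersion, which models the "retry the DDL as a whole" branch of the first step.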

picture

After the other observers receive the publish schema command sent by rs, they incrementally load the changes recorded in the internal tables into memory (the schema cache). This is what people mean by "refreshing the schema".

The ddl_service on rs calls publish_schema() to broadcast the new schema version number to all obs. What actually happens?

The obs where rs is located directly calls refresh_schema.

Send the switch_schema command to each alive obs, and the parameter is the latest schema_version.

After each obs receives the command, it generates an asynchronous refresh task, ObSchemaRefreshTask, and refreshes its schema to the latest version through this task.
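The three steps above can be modeled roughly as follows. The method names switch_schema and refresh_schema and the function publish_schema follow the names used in the text, but everything else (the class ObServerSketch, the tuple layout of the DDL log, the dictionary used as the schema cache) is a hypothetical simplification rather than the real code.

```python
from collections import deque


class ObServerSketch:
    def __init__(self, name: str):
        self.name = name
        self.received_schema_version = 0
        self.refreshed_schema_version = 0
        self.schema_cache = {}        # table_id -> table_name, held in memory
        self.refresh_tasks = deque()  # queued asynchronous refresh tasks

    def switch_schema(self, schema_version: int) -> None:
        """Handle the command from rs: record the target and queue a task."""
        self.received_schema_version = max(self.received_schema_version, schema_version)
        self.refresh_tasks.append(schema_version)

    def refresh_schema(self, ddl_log: list, target: int) -> None:
        """Incrementally apply DDL log entries newer than the cached version."""
        for version, table_id, table_name in ddl_log:
            if self.refreshed_schema_version < version <= target:
                self.schema_cache[table_id] = table_name
        self.refreshed_schema_version = max(self.refreshed_schema_version, target)

    def run_refresh_tasks(self, ddl_log: list) -> None:
        """Drain the queue, as a background worker would."""
        while self.refresh_tasks:
            self.refresh_schema(ddl_log, self.refresh_tasks.popleft())


def publish_schema(rs_server, other_servers, ddl_log, schema_version):
    # The obs hosting rs refreshes directly.
    rs_server.received_schema_version = schema_version
    rs_server.refresh_schema(ddl_log, schema_version)
    # Every other alive obs only receives the target version and refreshes later.
    for server in other_servers:
        server.switch_schema(schema_version)


ddl_log = [(2, 500001, "t1"), (3, 500002, "t2")]
rs_node, obs1 = ObServerSketch("rs"), ObServerSketch("obs1")
publish_schema(rs_node, [obs1], ddl_log, schema_version=3)
print(obs1.received_schema_version, obs1.refreshed_schema_version)  # 3 0
obs1.run_refresh_tasks(ddl_log)
print(obs1.refreshed_schema_version, obs1.schema_cache)  # 3 {500001: 't1', 500002: 't2'}
```

The window in which obs1 has received version 3 but still serves version 0 is the reason the in-memory schema is only eventually consistent across the cluster.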

picture

Attached is another picture:

  • The upper part of the figure shows DDL execution. The DDL service of rs is responsible for writing the internal tables and notifying each observer node to load the metadata changes into the in-memory schema cache.

  • The lower part shows query execution, which for the most part only reads metadata from the in-memory schema cache.

picture

The GV$OB_SERVER_SCHEMA_INFO view from the community question at the beginning shows, for each ObServer and each tenant, the latest schema version that has been refreshed. The columns users care about most are REFRESHED_SCHEMA_VERSION, SCHEMA_COUNT, and SCHEMA_SIZE. Their meanings, together with that of RECEIVED_SCHEMA_VERSION, are as follows:

  • REFRESHED_SCHEMA_VERSION: The schema version that the corresponding tenant has refreshed to on the corresponding machine.

  • RECEIVED_SCHEMA_VERSION: The schema version of the latest refresh task sent by RS that the corresponding tenant has received on the corresponding machine.

  • SCHEMA_COUNT: The total number of schema objects of all types under the corresponding schema version (number of tables + number of databases + ...).

  • SCHEMA_SIZE: The total memory size, in bytes, occupied by all schema objects under the corresponding schema version.

obclient> select * from oceanbase.GV$OB_SERVER_SCHEMA_INFO\G
*************************** 1. row ***************************
                    SVR_IP: 11.158.31.20
                  SVR_PORT: 22602
                 TENANT_ID: 1002
  REFRESHED_SCHEMA_VERSION: 1690109029768968
   RECEIVED_SCHEMA_VERSION: 1690113309637344
              SCHEMA_COUNT: 1583
               SCHEMA_SIZE: 1537240
MIN_SSTABLE_SCHEMA_VERSION: -1
1 row in set (0.01 sec)
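Going by the column definitions above, a row whose REFRESHED_SCHEMA_VERSION is lower than its RECEIVED_SCHEMA_VERSION describes a tenant on a server that has not yet caught up with the newest refresh task it received. The following sketch shows one possible way to flag such rows once they have been fetched. The function name lagging_servers and the dictionary layout are assumptions for illustration; the sample row reuses the values from the output above.

```python
def lagging_servers(rows):
    """Return (svr_ip, svr_port, tenant_id, gap) for rows whose refresh lags behind."""
    result = []
    for row in rows:
        gap = row["RECEIVED_SCHEMA_VERSION"] - row["REFRESHED_SCHEMA_VERSION"]
        if gap > 0:
            result.append((row["SVR_IP"], row["SVR_PORT"], row["TENANT_ID"], gap))
    return result


# One row copied from the sample output of GV$OB_SERVER_SCHEMA_INFO.
rows = [
    {
        "SVR_IP": "11.158.31.20",
        "SVR_PORT": 22602,
        "TENANT_ID": 1002,
        "REFRESHED_SCHEMA_VERSION": 1690109029768968,
        "RECEIVED_SCHEMA_VERSION": 1690113309637344,
    },
]

for entry in lagging_servers(rows):
    print(entry)
```

A short-lived gap is expected because the refresh is asynchronous; a gap that never closes is a hint to look at the schema refresh on that server.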

Troubleshooting methods for DDL and schema

With that background, let's look at some of the more common DDL and schema problems. Readers are welcome to contribute better troubleshooting methods.

A syntax error is reported when executing DDL. How should I change the syntax?

Customers often try to move the metadata of the database they are currently using to OceanBase Community Edition. For example, a few days ago I saw a customer who took a partitioned table definition from PostgreSQL and executed it in an OceanBase MySQL mode tenant. When the statement reported an error, the customer concluded that OceanBase does not support partitioned tables.

 
 

CREATE TABLE value_stream_dashboard_counts (
  id bigint NOT NULL,
  namespace_id bigint NOT NULL,
  count bigint NOT NULL,
  metric smallint NOT NULL
)
PARTITION BY RANGE (id);

picture

How should we look up the corresponding syntax in OceanBase MySQL mode when we hit this kind of problem? Most people check the OceanBase syntax documentation, but the syntax changes quickly as compatibility improves, so there is no guarantee that the documentation is strongly consistent with the syntax actually supported (even eventual consistency cannot be guaranteed). It reminds me of what a senior colleague once told me: "Documents love to lie, but code never lies." All the syntax supported by OceanBase Community Edition is written in a yacc file called sql_parser_mysql_mode.y.

After reading the syntax rules in this file, we can easily change the above SQL into a SQL that can be executed successfully in OceanBase MySQL mode.


 
 

CREATE TABLE value_stream_dashboard_counts (
  id bigint NOT NULL,
  namespace_id bigint NOT NULL,
  count bigint NOT NULL,
  metric smallint NOT NULL
)
PARTITION BY RANGE (id) (
  PARTITION p0 VALUES LESS THAN (100),
  PARTITION p1 VALUES LESS THAN (200),
  PARTITION p2 VALUES LESS THAN (300),
  PARTITION p3 VALUES LESS THAN MAXVALUE
);

An unclear error is reported when executing DDL. How should I troubleshoot the cause of the failure?

For example, I executed a DDL and it reported an error saying that my check constraint contains an expression that is not allowed in a check constraint. But which expression exactly is not allowed? Is it c1, is it =, is it sysdate(), or is it c1 = sysdate() as a whole?


 
 

obclient> create table t1(c1 int, check (c1 = sysdate()));
ERROR 3814 (HY000): An expression of a check constraint contains disallowed function.

First check the trace_id of the error statement.


 
 

obclient> select last_trace_id();
+------------------------------------+
| last_trace_id()                    |
+------------------------------------+
| Y584A0B9E1F14-00060127094761A8-0-0 |
+------------------------------------+
1 row in set (0.00 sec)

Then we can search the observer's log with grep Y584A0B9E1F14-00060127094761A8-0-0 observer.log*.

picture

The first warning log for this trace says: deterministic expr is wrongly specified in CHECK constraint. (The message itself is worded incorrectly; what it means to say is "non-deterministic expr is wrongly specified in CHECK constraint".) In other words, a non-deterministic expression is not allowed in a check constraint.

So what expressions are non-deterministic expressions? This requires taking a look at the code based on the file and line number ob_raw_expr_util.cpp:1856 given in the log. You can jump directly to the definition of a specific function on the web page, such as ObRawExpr::is_non_pure_sys_func_expr.

All expressions that are not deterministic are listed here, including the sysdate we used.

picture

So we can roughly know that the expression in the check constraint needs to ensure that the same result can be obtained multiple times. An expression like sysdate that outputs the current time is executed multiple times at different times, and the results must be different, so it is not allowed to appear in the check constraint. Here we can also take the opportunity to understand which other expressions are not deterministic.
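To make the idea concrete, the kind of check described above can be sketched as a walk over an expression tree that collects calls to non-deterministic functions. This is not the logic of ObRawExpr::is_non_pure_sys_func_expr: the function find_disallowed_funcs, the tuple-based tree format, and the function list are assumptions made for illustration. Only sysdate comes from the example in this article; the other names are commonly cited non-deterministic functions in MySQL-style databases.

```python
# Illustrative subset only; the authoritative list is in the OceanBase source code.
NON_DETERMINISTIC_FUNCS = {"sysdate", "now", "rand", "uuid"}


def find_disallowed_funcs(expr):
    """Walk a tiny expression tree and collect non-deterministic function names.

    expr is a nested tuple in one of these forms:
      ("col", name) | ("const", value) | ("func", name, *args) | (operator, left, right)
    """
    kind = expr[0]
    if kind in ("col", "const"):
        return []
    if kind == "func":
        found = [expr[1]] if expr[1].lower() in NON_DETERMINISTIC_FUNCS else []
        for arg in expr[2:]:
            found += find_disallowed_funcs(arg)
        return found
    # Anything else is treated as a binary operator such as "=" or ">".
    return find_disallowed_funcs(expr[1]) + find_disallowed_funcs(expr[2])


# check (c1 = sysdate())
print(find_disallowed_funcs(("=", ("col", "c1"), ("func", "sysdate"))))  # ['sysdate']
# check (c1 > 0)
print(find_disallowed_funcs((">", ("col", "c1"), ("const", 0))))         # []
```

In this model the column c1 and the operator = are harmless on their own; the constraint is rejected only because the tree contains the sysdate call.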

What should I do if I can’t find any useful logs when executing DDL?

For example, I executed a DDL to create a database, and an error was reported.


 
 

obclient> create database xiaofeng_db;
ERROR 4016 (HY000): Internal error
obclient> select last_trace_id();
+------------------------------------+
| last_trace_id()                    |
+------------------------------------+
| Y584A0B9E1F14-00060127094761B4-0-0 |
+------------------------------------+
1 row in set (0.00 sec)

Searching the log with the trace id, grep Y584A0B9E1F14-00060127094761B4-0-0 observer.log*, shows only an RPC error.

picture

Recall the DDL execution process described earlier. The DDL arg is sent to rs for execution, so in this case it is very likely that something went wrong during execution on rs. We therefore also need to search the rs log with grep Y584A0B9E1F14-00060127094761B4-0-0 rootservice.log* | vi - and then, based on the error code -4016, look for the earliest occurrence of ret=-4016 in the output.
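The same search can be scripted. The sketch below filters log lines by trace id and returns the first one that carries the given ret code. The function name first_line_with_ret is hypothetical, and the sample lines are made up in a simplified format for illustration; they are not real OceanBase log output (only the trace id, the file ob_root_service.cpp:2887, and the message text are taken from this article).

```python
import re


def first_line_with_ret(log_lines, trace_id, ret_code):
    """Return the first log line of the given trace that contains ret=<ret_code>."""
    pattern = re.compile(rf"ret=\s*{re.escape(str(ret_code))}\b")
    for line in log_lines:
        if trace_id in line and pattern.search(line):
            return line
    return None


TRACE_ID = "Y584A0B9E1F14-00060127094761B4-0-0"

# Made-up lines in a simplified format, ordered by time.
sample_log = [
    "[10:00:00.000001] WARN (hypothetical_a.cpp:10) [Y000000000000-0000000000000001-0-0] unrelated failure(ret=-4016)",
    f"[10:00:00.000002] INFO (hypothetical_b.cpp:20) [{TRACE_ID}] start create database",
    f"[10:00:00.000003] WARN (ob_root_service.cpp:2887) [{TRACE_ID}] create_database failed, because db_name is forbidden(ret=-4016)",
    f"[10:00:00.000004] WARN (hypothetical_c.cpp:30) [{TRACE_ID}] process failed(ret=-4016)",
]

print(first_line_with_ret(sample_log, TRACE_ID, -4016))
```

The earliest matching line is usually the most informative one, because later lines tend to be callers passing the same error code back up the stack.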

picture

We then find that the error in the log comes from line 2887 of ob_root_service.cpp, and that the reason is: create_database failed, because db_name is forbidden. For this kind of problem, first analyze the cause briefly using the file names and line numbers in the error log. If you still have no clue, ask OceanBase technical support to help with the analysis.

Looking through this file, it turns out that I deliberately added an error code here to create a scenario in which rs reports an error: whenever the database_name in create database is xiaofeng_db, error 4016 (OB_ERR_UNEXPECTED) is reported.

picture

It is very common to ignore the rootservice.log log when troubleshooting DDL and schema problems. I have seen many very experienced OceanBase kernel development experts waste a lot of time troubleshooting a simple bug more than once because of this problem. Please remember that if there is no clue about this kind of problem in the observer.log, you should also check the rootservice.log.

What should I do if the schema refresh hangs?

A schema refresh hangs because some schema validity checks are performed while the data of the internal tables is loaded into memory. If a check fails, there is a problem with the metadata persisted in the internal tables. At that point the observer hangs and cannot do anything, because once the metadata is wrong, any DDL, DML, or query executed on top of it compounds the error and can easily cause a large number of correctness problems. The probability of this happening is extremely low, but the consequences are very serious.

If a DDL hangs during execution, and messages similar to "Trying so hard to die" and "schema rebuild meta is still not consistent after, need fixing" appear in the rs log, then recovering the environment requires manually correcting the erroneous data in OceanBase's internal tables, which is high risk. It is recommended to contact OceanBase technical support promptly to help find the root cause and assist in restoring the environment.


Origin blog.csdn.net/OceanBaseGFBK/article/details/132672236