Greenplum's system tables

Like other relational databases, Greenplum maintains a set of metadata tables that manage the internal objects and relationships in the database; these are known as Greenplum system tables. The Greenplum kernel is developed from the PostgreSQL database, so many Greenplum system tables are inherited from PostgreSQL.

Greenplum's system tables can be roughly divided into the following categories:

1. Metadata of internal objects in the database, such as pg_database, pg_namespace, pg_class, pg_attribute, pg_type, and pg_exttable.

These system tables cover both global object definitions and the object definitions within each individual database. Their metadata is not stored in a distributed manner; instead, every instance (whether master or segment) holds a complete copy. There are exceptions, however: the gp_distribution_policy table (distribution key definitions) has metadata only on the master.

For these system tables, it is critical that the metadata be consistent across all instances.
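One quick way to spot-check that consistency is to compare catalog row counts between the master and the segments. A minimal sketch, assuming a database named testdb (an illustrative name); gp_dist_random() is the standard Greenplum trick for reading a catalog table from every segment instead of the master:

```shell
# Count pg_class rows on the master ...
psql -d testdb -c "SELECT count(*) FROM pg_class;"

# ... and on every segment. The per-segment counts should normally match
# the master's; large discrepancies hint at catalog inconsistency.
psql -d testdb -c "
  SELECT gp_segment_id, count(*)
  FROM   gp_dist_random('pg_class')
  GROUP  BY gp_segment_id
  ORDER  BY gp_segment_id;"
```

This is only a coarse screen; gpcheckcat (introduced below) performs the authoritative cross-checks.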

2. Metadata that tracks the state of the Greenplum cluster, such as gp_segment_configuration, gp_configuration_history, and pg_stat_replication.

These system tables are maintained mainly by the master instance. For example, the data in gp_segment_configuration and gp_configuration_history, which manage segment instance state, is maintained by the master's dedicated FTS (fault tolerance service) process.
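These state tables can be inspected directly from the master with ordinary queries; a sketch (the postgres database is used here only as a convenient connection target):

```shell
# Role, mode, status and location of every instance as seen by the master.
psql -d postgres -c "
  SELECT dbid, content, role, preferred_role, mode, status, hostname, port
  FROM   gp_segment_configuration
  ORDER  BY content, role;"

# Recent state changes recorded by the master's FTS process.
psql -d postgres -c "
  SELECT * FROM gp_configuration_history ORDER BY time DESC LIMIT 10;"
```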

3. Persistent tables, such as gp_persistent_database_node, gp_persistent_filespace_node, gp_persistent_relation_node, and gp_persistent_tablespace_node.

These system tables also exist in every instance. Within each instance, the persistent tables have strict primary/foreign key relationships with system tables such as pg_class, gp_relation_node, and pg_database. They are also important reference data for the synchronization between a primary instance and its mirror.

When a Greenplum cluster fails, the system table data may be damaged. Damaged system tables can cause many kinds of failures: database objects become unavailable, instance recovery fails, instances fail to start, and so on. To locate such problems, combine the log information from each instance with the results of a system table check. This article introduces some methods and techniques for locating, analyzing, and resolving these problems.

Check tool

Greenplum provides a system table checking tool, gpcheckcat, located in the $GPHOME/bin/lib directory. The check is most accurate when the database is idle; if many tasks are running, the results will contain noise that makes problems harder to pinpoint. It is therefore recommended to start the database in restricted mode before running gpcheckcat, so that no application workload interferes with the check.
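A typical invocation looks like the following sketch (the port and database name are illustrative; gpstart -R starts the cluster in restricted mode so only superusers can connect):

```shell
# Restart the cluster in restricted mode to keep application sessions out.
gpstop -a
gpstart -a -R

# Run the catalog check against one database.
$GPHOME/bin/lib/gpcheckcat -p 5432 testdb

# Return to normal operation afterwards.
gpstop -a -r
```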

Analysis Methods and Handling Techniques

1. Temporary schema problems. A problem temporary schema, named pg_temp_XXXXX, can simply be dropped. After a gpcheckcat run, a repair script for the temporary schemas is generated automatically. Because temporary schema problems interfere with the other check results, run gpcheckcat again after they have been handled.

2. If the metadata of an individual table object is inconsistent, it usually affects only the use of that object, not the whole cluster. If the problem exists on only one instance, connect to that instance in utility mode to handle it. The guiding principle is to avoid changing system table data directly and instead to resolve the issue through normal database operations, such as DROP TABLE or ALTER TABLE.
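Connecting to a single instance in utility mode looks like this sketch (the host, port, database, and table names are illustrative; take the real host and port from gp_segment_configuration):

```shell
# Open a session directly against one segment instance, bypassing the master.
PGOPTIONS='-c gp_session_role=utility' psql -h sdw5 -p 40000 -d testdb

# Inside that session, repair the object with regular SQL rather than
# editing catalog tables by hand, for example:
#   ALTER TABLE bad_table SET SCHEMA public;
#   DROP TABLE bad_table;
```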

3. Persistent table problems. These are often the hardest to handle and have the largest impact. Based on the gpcheckcat results, fix all problems other than the persistent table problems first, then deal with the persistent tables.

For persistent tables, here are handling techniques for several typical problems:

(1) The log of the failing segment instance contains messages similar to:

"FATAL","XX000","Number of freeTIDs 191, do not match maximum free order number 258, for 'gp_persistent_relation_node'

This error can cause instance startup failure, instance recovery failure, and so on. First, set the parameter gp_persistent_skip_free_list=true in the postgresql.conf of the affected instance so that the instance can start. Then run gpcheckcat again; its results should contain problems similar to:

 

[INFO]:-[FAIL] gp_persistent_relation_node <=> gp_relation_node
[ERROR]:-gp_persistent_relation_node <=> gp_relation_node found 442 issue(s)
……
[ERROR]:-sdw5:40000:/data1/primary/gpseg24
[ERROR]:-  relfilenode | ctid | persistent_tid
[ERROR]:-  13795767 | (205,18) | None
[ERROR]:-  13795768 | (205,17) | None
[ERROR]:-  13795769 | (7444,134) | None
[ERROR]:-  13799293 | (89,226) | None
……
[INFO]:-[FAIL] gp_persistent_relation_node <=> pg_class
[ERROR]:-gp_persistent_relation_node <=> pg_class found 442 issue(s)
……
[ERROR]:-sdw5:40000:/data1/primary/gpseg24
[ERROR]:-  relfilenode | nspname | relname | relkind | relstorage
[ERROR]:-  13795741 | None | None | None | None
[ERROR]:-  13795741 | None | None | None | None
[ERROR]:-  13795741 | None | None | None | None
[ERROR]:-  13795741 | None | None | None | None
……

The check results above show that part of the persistent table data no longer corresponds correctly to the other system tables. The fix is to repair the persistent table data.

(2) The log of the failing instance contains messages similar to:

FATAL: Global sequence number 1131954 less than maximum value 1131958 found in scan ('gp_persistent_relation_node')

This problem can cause instance startup to fail. Setting the parameter gp_persistent_repair_global_sequence=true in the postgresql.conf of the affected instance repairs the problem and lets the instance start normally.
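The edit is a one-line change to that instance's configuration file; a minimal sketch (the data directory path is illustrative, and the same pattern applies to the other repair GUCs discussed in this article):

```shell
# On the host of the failing instance, append the repair GUC to that
# instance's postgresql.conf, then restart so it takes effect.
echo "gp_persistent_repair_global_sequence=true" >> /data1/primary/gpseg24/postgresql.conf
gpstop -a && gpstart -a
```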

(3) The log of the failing instance contains messages similar to:

Persistent 1663/17226/21248339, segment file #1, current new EOF is greater than update new EOF for Append-Only mirror resync EOFs recovery (current new EOF 1419080, update new EOF 1416416), persistent serial num 5725616 at TID (3353,1)

This problem occurs with append-only (AO) tables and indicates that data files on individual instances have been corrupted. It can cause a process PANIC and instance startup failure. First, set the parameter gp_crash_recovery_suppress_ao_eof=true in the postgresql.conf of the affected instance so that it can start, then run gpcheckcat to identify and repair the problems. The affected AO table is usually already damaged, so it is recommended to rename or drop it.
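Once the instance is up, the damaged AO table can be set aside or removed with ordinary SQL; a sketch with illustrative database and table names:

```shell
# Preserve the damaged table under a new name for later inspection ...
psql -d testdb -c "ALTER TABLE damaged_ao_table RENAME TO damaged_ao_table_bak;"

# ... or drop it outright if its data can be reloaded from elsewhere.
# psql -d testdb -c "DROP TABLE damaged_ao_table;"
```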

(4) The gpcheckcat results contain messages such as:

 

[INFO]:-[FAIL] gp_persistent_relation_node <=> filesystem
[ERROR]:-gp_persistent_relation_node <=> filesystem found 52 issue(s)
……
[ERROR]:-sdw1:40000:/data1/primary/gpseg0
[ERROR]:-  tablespace_oid | database_oid | relfilenode_oid | segment_file_num | filesystem | persistent | relkind | relstorage
[ERROR]:-  17001 | 272379 | 121899432 | 0 | t | f | None | None
[ERROR]:-  17012 | 272379 | 121692973 | 1 | t | f | r | c
[ERROR]:-  17012 | 272379 | 121693149 | 1 | t | f | r | c
[ERROR]:-  17012 | 272379 | 121694359 | 1 | t | f | r | c
……

These results indicate that some data files in the filesystem have no corresponding entries in the system tables; in other words, the filesystem contains orphaned data files. This situation does not affect the normal operation of the Greenplum cluster and can be ignored for the time being.

Persistent table problems must not be fixed by hand; use only the repair tool that Greenplum provides, gppersistentrebuild. The tool includes a backup function, and a backup must be taken before any repair is attempted. Then run gppersistentrebuild against the contentid of the instance to be repaired.
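A sketch of the workflow follows. The exact option names and the tool's location vary by Greenplum version (it may live under $GPHOME/sbin or $GPHOME/bin/lib), so the flags below are assumptions to verify against the tool's own help output before use:

```shell
# Confirm the options your version supports, including its backup mode.
$GPHOME/sbin/gppersistentrebuild --help

# Take the tool's backup first, then rebuild the persistent tables for the
# instance with the problem, selected by contentid (24 is illustrative).
$GPHOME/sbin/gppersistentrebuild -c 24
```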

 

In addition, if a primary instance and its mirror are in changetracking state when the primary undergoes a persistent table repair, the pair can afterwards only be recovered with a full recovery (gprecoverseg -F).
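The full recovery and its follow-up check are standard utilities; a sketch (-a merely suppresses the confirmation prompt):

```shell
# Force a full copy from primary to mirror for the down segments.
gprecoverseg -F -a

# Watch resynchronization progress until all pairs report Synchronized.
gpstate -m
```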

The GUC parameters introduced above are temporary settings used only while repairing system tables. Once the cluster is back to normal, restore every modified GUC parameter to its original default value.

We hope the system table maintenance and repair techniques introduced in this article help Greenplum DBAs analyze and locate problems more effectively when failures occur. That said, once a failure does occur, seek the support of after-sales and professional services engineers as soon as possible.

 
