postgresql.conf parameters for PostgreSQL database operation and maintenance

The following is a record of some of the system parameters I use in PostgreSQL database operation and maintenance (constantly updated).

Master-slave

wal_keep_segments

The default value is 0.

Specifies the minimum number of past log file segments kept in the pg_wal directory, in case a standby server needs to fetch them for streaming replication. Each segment is normally 16 megabytes. If a standby server connected to the sending server falls behind by more than wal_keep_segments segments, the sending server might remove a WAL segment the standby still needs, in which case the replication connection will be terminated. Downstream connections will also eventually fail as a result (however, if WAL archiving is in use, the standby can recover by fetching the segment from the archive).

This sets only the minimum number of segments retained in pg_wal; the system may need to retain more segments for WAL archiving or to recover from a checkpoint. If wal_keep_segments is zero (the default), the system does not keep any extra segments for standby purposes, so the number of old WAL segments available to standby servers is a function of the location of the previous checkpoint and the status of WAL archiving. This parameter can only be set in the postgresql.conf file or on the server command line.
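
For example, reserving roughly 1GB of WAL for standbys might look like the following (a minimal sketch; the value 64 is purely illustrative, and note that wal_keep_segments was replaced by wal_keep_size in PostgreSQL 13):

    -- keep at least 64 past segments (64 * 16MB = ~1GB) for standbys
    ALTER SYSTEM SET wal_keep_segments = 64;
    -- this parameter can be changed with a reload; no restart is needed
    SELECT pg_reload_conf();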

max_wal_size

The default is 1GB

The maximum size that the WAL is allowed to grow to between automatic WAL checkpoints. This is a soft limit; WAL size can exceed max_wal_size under special circumstances, such as heavy load, a failing archive_command, or a high wal_keep_segments setting. The default is 1 GB. Increasing this parameter can increase the time needed for crash recovery. This parameter can only be set in postgresql.conf or on the server command line.

min_wal_size

As long as WAL disk usage stays below this setting, old WAL files are always recycled for future use at a checkpoint, rather than removed. This can be used to ensure that enough WAL space is reserved to handle spikes in WAL usage, for example when running large batch jobs. The default is 80 MB. This parameter can only be set in postgresql.conf or on the server command line.
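
As a sketch, sizing both limits on a write-heavy server could look like this (the values are illustrative assumptions, not recommendations):

    -- allow more WAL between checkpoints, and keep more recycled files on hand
    ALTER SYSTEM SET max_wal_size = '4GB';
    ALTER SYSTEM SET min_wal_size = '1GB';
    SELECT pg_reload_conf();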

Log

logging_collector

The default is off; it is recommended to set it to on.

This parameter enables the log collector, a background process that captures log messages sent to stderr and redirects them into log files. This approach is often more useful than logging to syslog, since some types of messages might not appear in syslog output (a common example is dynamic-linker failure messages; another is error messages produced by scripts such as archive_command). This parameter can only be set at server start.

It is possible to log to stderr without using the log collector; the log messages will just go wherever the server's stderr is directed. However, that method is only suitable for low log volumes, since it provides no convenient way to rotate log files. Also, on some platforms, not using the log collector can result in lost or garbled log output, because multiple processes writing concurrently to the same log file can overwrite each other's output.

The log collector is designed to never lose messages. This means that under extremely high load, server processes can be blocked while trying to send additional log messages when the collector has fallen behind. In contrast, syslog prefers to drop messages if it cannot write them, which means it may fail to log some messages in such cases but will not block the rest of the system.
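
A minimal sketch for turning the collector on, together with basic file naming (log_directory and log_filename here are optional illustrations; logging_collector itself only takes effect after a server restart):

    ALTER SYSTEM SET logging_collector = on;              -- takes effect at next restart
    ALTER SYSTEM SET log_directory = 'log';               -- relative to the data directory
    ALTER SYSTEM SET log_filename = 'postgresql-%a.log';  -- one file per weekday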

log_destination

If you need to do auditing and analysis, it is recommended to change this to csvlog; otherwise, the default is fine.

PostgreSQL supports several methods for logging server messages, including stderr, csvlog, and syslog. On Windows, eventlog is also supported. Set this parameter to a comma-separated list of the desired log destinations. The default is to log to stderr only. This parameter can only be set in the postgresql.conf file or on the server command line.
If csvlog is included in log_destination, log entries are output in comma-separated value (CSV) format, which makes it easy to load the log into programs. logging_collector must be enabled to generate CSV-format log output.

When either stderr or csvlog is included, the file current_logfiles is created to record the location of the log file(s) currently in use by the log collector, along with the associated logging destination. This provides a convenient way to find the logs currently in use by the instance. Here is an example of the file's content:

stderr log/postgresql.log
csvlog log/postgresql.csv

current_logfiles is recreated when a new log file is created as an effect of rotation and when log_destination is reloaded. It is removed when neither stderr nor csvlog is included in log_destination and when the log collector is disabled.
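
For example, enabling CSV logging and then locating the active log files might look like this (a sketch assuming PostgreSQL 10 or later, where pg_current_logfile() is available):

    ALTER SYSTEM SET log_destination = 'stderr,csvlog';   -- logging_collector must be on
    SELECT pg_reload_conf();
    -- read current_logfiles through the built-in function:
    SELECT pg_current_logfile();           -- first destination listed in the file
    SELECT pg_current_logfile('csvlog');   -- the CSV log specifically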

log_statement

If you are not doing auditing, ddl at most is usually enough.

Controls which SQL statements are logged. Valid values are none (off), ddl, mod, and all (all statements). ddl logs all data definition statements, such as CREATE, ALTER, and DROP statements. mod logs all ddl statements, plus data-modifying statements such as INSERT, UPDATE, DELETE, TRUNCATE, and COPY FROM. PREPARE, EXECUTE, and EXPLAIN ANALYZE statements are also logged if their contained command is of an appropriate type. For clients using extended query protocol, logging occurs when an Execute message is received, and the values of the Bind parameters are included (with any embedded single quotes doubled).

The default value is none. Only superusers can change this setting.

Statements that contain simple syntax errors are not logged even when log_statement = all, because the log message is emitted only after basic parsing has been done and the statement type determined. In the case of extended query protocol, statements that fail before the execute phase (i.e., during parse analysis or planning) are likewise not logged. Set log_min_error_statement to ERROR (or lower) to log such statements.
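
For instance, logging DDL only (a minimal sketch; ddl is often a reasonable middle ground when full auditing is not required):

    ALTER SYSTEM SET log_statement = 'ddl';
    SELECT pg_reload_conf();
    SHOW log_statement;   -- verify the active value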

log_min_duration_statement

Queries whose run time exceeds this parameter are recorded in the log.

Causes the duration of each completed statement to be logged if the statement ran for at least the specified number of milliseconds. Setting this to zero prints all statement durations; setting it to -1 (the default) disables duration logging. For example, if you set it to 250ms, then all SQL statements that run 250ms or longer will be logged. Enabling this parameter can be helpful for tracking down unoptimized queries in your applications. Only superusers can change this setting.

For clients using extended query protocol, the durations of the Parse, Bind, and Execute steps are logged independently.

When this option is used together with log_statement, the text of statements that are logged because of log_statement is not repeated in the duration log message. If you are not using syslog, it is recommended that you log the PID or session ID using log_line_prefix, so that you can link the statement message to the later duration message by process ID or session ID.
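
A sketch combining the threshold with a log_line_prefix that includes the PID, as suggested above (the 250ms threshold and the prefix format are illustrative):

    ALTER SYSTEM SET log_min_duration_statement = '250ms';
    -- %m = timestamp with milliseconds, %p = process ID, %u/%d = user/database
    ALTER SYSTEM SET log_line_prefix = '%m [%p] %u@%d ';
    SELECT pg_reload_conf();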

log_lock_waits

Controls whether a log message is produced when a session waits longer than deadlock_timeout to acquire a lock. This is useful in determining whether lock waits are causing poor performance. The default is off. Only superusers can change this setting.

deadlock_timeout

This is the amount of time, in milliseconds, to wait on a lock before checking for a deadlock condition. The deadlock check is relatively expensive, so the server does not run it every time it waits for a lock; we optimistically assume that deadlocks are not common in production applications, and just wait on the lock for a while before checking. Increasing this value reduces the time wasted on needless deadlock checks, but slows down reporting of real deadlock errors. The default is one second (1s), which is probably about the smallest value you would want in practice. On a heavily loaded server you might want to raise it. Ideally, the setting should exceed your typical transaction time, to reduce the chance that a deadlock check is run before the lock is released. Only superusers can change this setting.
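
A sketch enabling lock-wait logging with a slightly raised detection timeout (the 2s value is an illustrative assumption for a loaded server):

    ALTER SYSTEM SET log_lock_waits = on;
    ALTER SYSTEM SET deadlock_timeout = '2s';  -- also the threshold for log_lock_waits messages
    SELECT pg_reload_conf();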

log_duration

Causes the duration of each completed statement to be logged. The default is off. Only superusers can change this setting.

For clients using extended query protocol, the durations of the Parse, Bind, and Execute steps are logged independently.

The difference between enabling this option and setting log_min_duration_statement to zero is that exceeding log_min_duration_statement forces the text of the query to be logged, whereas this option does not. Therefore, if log_duration is on and log_min_duration_statement has a positive value, all durations are logged, but the query text is included only for statements exceeding the threshold. This behavior can be useful for gathering statistics in high-load installations.
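
For example, the combination described above could be configured like this (a sketch; the 500ms threshold is illustrative):

    ALTER SYSTEM SET log_duration = on;                     -- durations for every statement
    ALTER SYSTEM SET log_min_duration_statement = '500ms';  -- full query text only above this
    SELECT pg_reload_conf();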

Performance

max_worker_processes

I suggest adjusting it to 200 or above.

Sets the maximum number of background processes that the system can support. This parameter can only be set at server start. The default is 8.
When running a standby server, you must set this parameter to the same value as or a higher value than on the master server. Otherwise, queries may not be allowed on the standby server.

When changing this value, consider also adjusting max_parallel_workers, max_parallel_maintenance_workers, and max_parallel_workers_per_gather.
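
A sketch adjusting the worker pool and the related parallel limits together (the values are illustrative, not recommendations; max_worker_processes itself requires a restart):

    ALTER SYSTEM SET max_worker_processes = 16;            -- takes effect at next restart
    ALTER SYSTEM SET max_parallel_workers = 16;            -- drawn from the pool above
    ALTER SYSTEM SET max_parallel_workers_per_gather = 4;
    SELECT pg_reload_conf();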

max_parallel_workers

Sets the maximum number of workers that the system can support for parallel operations. The default value is 8. When increasing or decreasing this value, consider also adjusting max_parallel_maintenance_workers and max_parallel_workers_per_gather. Also, note that a setting for this value which is higher than max_worker_processes will have no effect, since parallel workers are taken from the pool of worker processes established by max_worker_processes.

max_parallel_workers_per_gather

Sets the maximum number of workers that a single Gather or Gather Merge node can start. Parallel workers are taken from the pool of processes established by max_worker_processes, limited by max_parallel_workers. Note that the requested number of workers may not actually be available at run time; if this occurs, the plan will run with fewer workers than expected, which may be inefficient. The default value is 2. Setting this value to 0 disables parallel query execution.

Note that parallel queries may consume substantially more resources than non-parallel queries, because each worker process is a completely separate process which has roughly the same impact on the system as an additional user session. This should be taken into account when choosing a value for this setting, as well as when configuring other settings that control resource utilization, such as work_mem. Resource limits such as work_mem are applied individually to each worker, which means the total utilization across all processes may be much higher than for a single process. For example, a parallel query using 4 workers may use up to 5 times as much CPU time, memory, and I/O bandwidth as a query that uses no workers at all.
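
To see the effect, you can inspect a query plan; this sketch assumes a hypothetical table big_table large enough for the planner to consider parallelism:

    SET max_parallel_workers_per_gather = 4;   -- per-session override
    EXPLAIN SELECT count(*) FROM big_table;
    -- look for a Gather node with "Workers Planned: ..." in the output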

max_parallel_maintenance_workers

Sets the maximum number of parallel workers that can be started by a single utility command. Currently, the only utility command that supports the use of parallel workers is CREATE INDEX, and only when building a B-tree index. Parallel workers are taken from the pool of processes established by max_worker_processes, limited by max_parallel_workers. Note that the requested number of workers may not actually be available at run time; if this occurs, the utility operation will run with fewer workers than expected. The default value is 2. Setting this value to 0 disables the use of parallel workers by utility commands.

Note that parallel utility commands should not consume substantially more memory than equivalent non-parallel operations. This strategy differs from that of parallel query, where resource limits generally apply per worker process: a parallel utility command treats the resource limit maintenance_work_mem as a limit on the entire command, regardless of how many parallel worker processes it uses. However, parallel utility commands may still consume substantially more CPU resources and I/O bandwidth.
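
A sketch of a parallel B-tree build, assuming a hypothetical table big_table with a column id (note that maintenance_work_mem is one budget for the whole command, per the paragraph above):

    SET max_parallel_maintenance_workers = 4;
    SET maintenance_work_mem = '1GB';          -- shared across the command's workers
    CREATE INDEX big_table_id_idx ON big_table (id);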

shared_buffers

Sets the amount of memory the database server uses for shared memory buffers. The default is typically 128 megabytes (128MB), but it may be less if your kernel settings do not support it (as determined during initdb). This setting must be at least 128 kilobytes (a non-default value of BLCKSZ changes the minimum). However, settings significantly higher than the minimum are usually needed for good performance.

If you have a dedicated database server with 1GB or more of RAM, a reasonable starting value for shared_buffers is 25% of the system memory. There are some workloads where even larger settings for shared_buffers are effective, but because PostgreSQL also relies on the operating system cache, allocating more than 40% of RAM to shared_buffers is unlikely to work better than a smaller amount. Larger settings for shared_buffers usually require a corresponding increase in max_wal_size, in order to spread out the writing of large quantities of new or changed data over a longer period of time.

On systems with less than 1GB of RAM, a smaller percentage of RAM is appropriate, so as to leave adequate space for the operating system.
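
For instance, on a dedicated server with 16GB of RAM, the 25% rule of thumb would suggest something like the following (the values are illustrative; shared_buffers only takes effect after a restart):

    ALTER SYSTEM SET shared_buffers = '4GB';   -- ~25% of a 16GB server; restart required
    ALTER SYSTEM SET max_wal_size = '4GB';     -- often raised together with shared_buffers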

Origin: blog.csdn.net/yang_z_1/article/details/112554959