[PostgreSQL Basics] Creating a new database cluster directory: setting up your own dedicated data area

Initialize the cluster

Motto: As heaven moves with vigor, a gentleman strives ceaselessly for self-improvement; as the earth is broad and receptive, a gentleman carries all things with great virtue.

1. Introduction

PostgreSQL is a general-purpose relational database. Among open source databases it is comparable to commercial products, and it is becoming more and more popular in the industry.

Because it is open source, its source code is public, and it comes with many use cases and useful extensions, so it has gradually become a pioneer and a reference point among databases. Studying PostgreSQL is a good way to understand databases all the way from usage to principles.

If you are learning programming, it also offers a wealth of programming knowledge, data structures, and programming techniques, along with many elegant architectural designs, layering ideas, and examples of flexible customization.

This column mainly introduces how to get started with PostgreSQL, how to use it, and how to maintain and manage a database. Along the way you will come to understand database principles, what kind of database PostgreSQL is, what it can do, and how to run it as a reliable service. Most importantly, this knowledge is a must-have for interviews.

2. Overview

This article introduces the PostgreSQL database cluster: how to initialize it, how to configure it, and what its physical structure looks like.

If you compiled and installed PostgreSQL from source, the first step is to initialize a cluster. If you installed from a package, a default database cluster has already been initialized, and you can run SQL commands as soon as the service is started.

When deploying applications, different applications often use different database cluster directories so that they are physically isolated from each other. In that case we need to initialize a separate database cluster for each application.

3. Principle

A database exists mainly to store and manage data so that it is convenient to retrieve. In PostgreSQL, the directory where the database stores its data is called the database cluster directory, also referred to as the data directory.

The cluster directory holds user data, such as the tables, users, and indexes that have been created. It also holds the database's organizational data, such as the mapping between databases and tables, the number of databases, and the structure definitions of tables; we call this the data dictionary.
In addition, there is a third kind of data: auxiliary data that keeps the database service stable and fast, such as cached data of the system dictionary and free space management data.

When the database service runs, it loads its data from this directory, so the directory is very important. In real projects a dedicated system user operates on it, and it must be protected with strategies such as cold backup and hot standby.

Oops, I seem to have just given away the classic recipe for "deleting the database and running away". Pretend I said nothing...

3. Command introduction

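The bin directory under the installation directory contains the initdb command, which initializes a database cluster. Let's take a look at its help text.

# After installation, the directory has roughly the following four subdirectories:
# bin holds the commands; include holds the header files needed for development;
# lib holds the libraries needed for development; share contains template configuration files, extensions, etc.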
[senllang@hatch postgres]$ ll
total 16
drwxr-xr-x. 2 senllang develops 4096 Aug  2 09:26 bin
drwxr-xr-x. 6 senllang develops 4096 Aug  2 09:24 include
drwxr-xr-x. 4 senllang develops 4096 Aug  2 09:26 lib
drwxr-xr-x. 7 senllang develops 4096 Aug  2 09:26 share
[senllang@hatch postgres]$ cd bin/

View the help

[senllang@hatch bin]$ ./initdb --help
initdb initializes a PostgreSQL database cluster.

Usage:
  initdb [OPTION]... [DATADIR]

Options:
  -A, --auth=METHOD         default authentication method for local connections
      --auth-host=METHOD    default authentication method for local TCP/IP connections
      --auth-local=METHOD   default authentication method for local-socket connections
 [-D, --pgdata=]DATADIR     location for this database cluster
  -E, --encoding=ENCODING   set default encoding for new databases
  -g, --allow-group-access  allow group read/execute on data directory
      --icu-locale=LOCALE   set ICU locale ID for new databases
      --icu-rules=RULES     set additional ICU collation rules for new databases
  -k, --data-checksums      use data page checksums
      --locale=LOCALE       set default locale for new databases
      --lc-collate=, --lc-ctype=, --lc-messages=LOCALE
      --lc-monetary=, --lc-numeric=, --lc-time=LOCALE
                            set default locale in the respective category for
                            new databases (default taken from environment)
      --no-locale           equivalent to --locale=C
      --locale-provider={libc|icu}
                            set default locale provider for new databases
      --pwfile=FILE         read password for the new superuser from file
  -T, --text-search-config=CFG
                            default text search configuration
  -U, --username=NAME       database superuser name
  -W, --pwprompt            prompt for a password for the new superuser
  -X, --waldir=WALDIR       location for the write-ahead log directory
      --wal-segsize=SIZE    size of WAL segments, in megabytes

Less commonly used options:
  -c, --set NAME=VALUE      override default setting for server parameter
  -d, --debug               generate lots of debugging output
      --discard-caches      set debug_discard_caches=1
  -L DIRECTORY              where to find the input files
  -n, --no-clean            do not clean up after errors
  -N, --no-sync             do not wait for changes to be written safely to disk
      --no-instructions     do not print instructions for next steps
  -s, --show                show internal settings
  -S, --sync-only           only sync database files to disk, then exit

Other options:
  -V, --version             output version information, then exit
  -?, --help                show this help, then exit

If the data directory is not specified, the environment variable PGDATA
is used.

Report bugs to <[email protected]>.
PostgreSQL home page: <https://www.postgresql.org/>

Parameter Description

The most commonly used parameters are:

  • -D specifies the path of the cluster directory, for example ~/testdemo, in which case testdemo is the cluster directory. The data directory must be specified one way or another: either with -D or through the environment variable PGDATA.
  • -U specifies the name of the database superuser, which has all permissions. It is optional; if omitted, the superuser gets the same name as the current operating system user.
  • -W prompts for a password, which becomes the login password of the superuser. It is also optional; if omitted, no password is set for the superuser.

Other parameters can be learned in subsequent studies.
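
As a small illustration of the options above, here is a sketch of a non-interactive run that takes the data directory from PGDATA instead of -D and reads the password from a file with --pwfile (listed in the help output) instead of prompting with -W. The temporary paths and the placeholder password are assumptions made for this example only; the initdb call is skipped when the command is not on the PATH.

```shell
# Throwaway location for illustration; in practice PGDATA would be a
# permanent path such as ~/testdemo.
PGDATA=$(mktemp -d)
export PGDATA                     # initdb falls back to PGDATA when -D is absent

# Keep the superuser password in a file only the owner can read, so that
# --pwfile can replace the interactive -W prompt.
pwfile=$(mktemp)
chmod 600 "$pwfile"
printf '%s\n' 'change-me' > "$pwfile"   # placeholder password

if command -v initdb >/dev/null 2>&1; then
    # No -D here: the data directory comes from PGDATA.
    initdb -U postgres --pwfile="$pwfile"
else
    echo "initdb not on PATH; skipped (PGDATA=$PGDATA)"
fi

rm -f "$pwfile"
```

Note that setting a password does not by itself force clients to use it: unless an authentication method is chosen with -A (or --auth-local and --auth-host), initdb enables "trust" authentication for local connections and prints a warning saying so.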

4. Initialize the cluster

Now that we understand the command for initializing a cluster, let's initialize a database cluster of our own.

We create a database cluster directory named testdemo in the current directory and name the database superuser postgres:

[senllang@hatch bin]$ ./initdb -D testdemo -W -U postgres
The files belonging to this database system will be owned by user "senllang".
This user must also own the server process.

Using default ICU locale "en_US".
Using language tag "en-US" for ICU locale "en_US".
The database cluster will be initialized with this locale configuration:
  provider:    icu
  ICU locale:  en-US
  LC_COLLATE:  en_US.UTF-8
  LC_CTYPE:    en_US.UTF-8
  LC_MESSAGES: en_US.UTF-8
  LC_MONETARY: en_US.UTF-8
  LC_NUMERIC:  en_US.UTF-8
  LC_TIME:     en_US.UTF-8
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".

Data page checksums are disabled.

Enter new superuser password:
Enter it again:

creating directory testdemo ... ok
creating subdirectories ... ok
selecting dynamic shared memory implementation ... posix
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting default time zone ... Asia/Shanghai
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... ok
syncing data to disk ... ok

initdb: warning: enabling "trust" authentication for local connections
initdb: hint: You can change this by editing pg_hba.conf or using the option -A, or --auth-local and --auth-host, the next time you run initdb.

Success. You can now start the database server using:

    pg_ctl -D testdemo -l logfile start

This creates a testdemo directory in the current directory, containing many subdirectories and files. Let's get to know them below.
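
Before exploring the directory, the start command that initdb printed can be tried out. The following is a minimal sketch, assuming the testdemo cluster from above, the postgres superuser, and the default local "trust" authentication; DATADIR, LOGFILE, and smoke_test are names chosen for this example. It starts the server, checks its status, runs one query, and stops it again.

```shell
# Example names; adjust them to your layout.
DATADIR=${DATADIR:-testdemo}
LOGFILE=${LOGFILE:-logfile}

smoke_test() {
    # Refuse to continue if the directory was never initialized by initdb.
    if [ ! -f "$DATADIR/PG_VERSION" ]; then
        echo "no cluster found in $DATADIR" >&2
        return 1
    fi
    # -w waits until the server is ready to accept connections.
    pg_ctl -D "$DATADIR" -l "$LOGFILE" -w start || return 1
    pg_ctl -D "$DATADIR" status
    psql -U postgres -d postgres -c 'SELECT version();'
    pg_ctl -D "$DATADIR" stop
}

smoke_test || echo "smoke test skipped or failed"
```

Run it from the directory that contains testdemo, with the bin directory on the PATH.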

5. Cluster directory

[senllang@hatch bin]$ cd testdemo/
[senllang@hatch testdemo]$ ll
total 56
drwx------. 5 senllang develops    33 Sep  2 14:42 base
drwx------. 2 senllang develops  4096 Sep  2 14:42 global
drwx------. 2 senllang develops     6 Sep  2 14:42 pg_commit_ts
drwx------. 2 senllang develops     6 Sep  2 14:42 pg_dynshmem
-rw-------. 1 senllang develops  5711 Sep  2 14:42 pg_hba.conf
-rw-------. 1 senllang develops  2640 Sep  2 14:42 pg_ident.conf
drwx------. 4 senllang develops    68 Sep  2 14:42 pg_logical
drwx------. 4 senllang develops    36 Sep  2 14:42 pg_multixact
drwx------. 2 senllang develops     6 Sep  2 14:42 pg_notify
drwx------. 2 senllang develops     6 Sep  2 14:42 pg_replslot
drwx------. 2 senllang develops     6 Sep  2 14:42 pg_serial
drwx------. 2 senllang develops     6 Sep  2 14:42 pg_snapshots
drwx------. 2 senllang develops    25 Sep  2 14:42 pg_stat
drwx------. 2 senllang develops     6 Sep  2 14:42 pg_stat_tmp
drwx------. 2 senllang develops    18 Sep  2 14:42 pg_subtrans
drwx------. 2 senllang develops     6 Sep  2 14:42 pg_tblspc
drwx------. 2 senllang develops     6 Sep  2 14:42 pg_twophase
-rw-------. 1 senllang develops     3 Sep  2 14:42 PG_VERSION
drwx------. 3 senllang develops    60 Sep  2 14:42 pg_wal
drwx------. 2 senllang develops    18 Sep  2 14:42 pg_xact
-rw-------. 1 senllang develops    88 Sep  2 14:42 postgresql.auto.conf
-rw-------. 1 senllang develops 29708 Sep  2 14:42 postgresql.conf

First, let's look at a few important directories and files:

  • The base directory stores the data of each database, including tables, indexes, sequences, views, and so on.
  • The pg_wal directory stores the write-ahead log (WAL), the redo log we often talk about.
  • The log directory holds the run logs of the database service. It does not appear in the listing above because writing logs to files is not enabled by default.
  • postgresql.conf is the configuration file of the database service. It holds a large number of settings, such as the IP address and port the service listens on.
  • pg_hba.conf is the configuration file for host access control. By default, only the local machine can log in; if remote access is required, a rule allowing it must be added.

We will get to know the other files gradually later on.
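
The layout above also gives a quick way to tell whether a path is a cluster directory. The sketch below checks for the PG_VERSION file and three of the subdirectories from the listing; is_cluster_dir is a helper name invented here, and the demo runs against a mock directory that only recreates the names, not a real cluster.

```shell
# Return 0 if "$1" has the core layout that initdb creates, 1 otherwise.
is_cluster_dir() {
    dir=$1
    [ -f "$dir/PG_VERSION" ] || return 1
    for sub in base global pg_wal; do
        [ -d "$dir/$sub" ] || return 1
    done
    return 0
}

# Demo on a mock directory (not a real cluster; only the names are recreated).
demo=$(mktemp -d)
mkdir "$demo/base" "$demo/global" "$demo/pg_wal"
echo 16 > "$demo/PG_VERSION"    # mock major version number

if is_cluster_dir "$demo"; then
    echo "looks like a cluster directory, major version $(cat "$demo/PG_VERSION")"
fi
rm -rf "$demo"
```

Against the real cluster, the call would simply be is_cluster_dir testdemo.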

6. Configuration file

We will focus on the database configuration file and the host access control configuration. We start with a few entry-level parameters that solve the problems people typically hit when they first start using the database; other parameters will be covered together with the features they belong to.

Database configuration

Parameters that are left at their default values appear in the file as lines commented out with a leading #. To change one, remove the leading # and edit the value. The parameters introduced below all take effect only after the database service is restarted.
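
Removing the # by hand works fine; for scripted setups, the same edit can be sketched with grep and sed. set_param is a helper invented for this example, not a PostgreSQL tool. It assumes simple values that contain no |, & or backslash characters, and it discards any trailing comment on the line it rewrites. The demo runs on a temporary two-line file standing in for postgresql.conf.

```shell
# set_param FILE NAME VALUE
# Turn every "NAME = ..." line (commented out or not) into an active setting;
# append the setting if the file has no such line.
set_param() {
    file=$1 name=$2 value=$3
    if grep -Eq "^[[:space:]]*#?[[:space:]]*$name[[:space:]]*=" "$file"; then
        # Go through a temp file instead of sed -i, whose syntax differs
        # between GNU and BSD sed; cat back to keep the file's permissions.
        sed -E "s|^[[:space:]]*#?[[:space:]]*$name[[:space:]]*=.*|$name = $value|" "$file" > "$file.tmp" &&
            cat "$file.tmp" > "$file" &&
            rm -f "$file.tmp"
    else
        printf '%s = %s\n' "$name" "$value" >> "$file"
    fi
}

# Demo on a stand-in file.
conf=$(mktemp)
printf '%s\n' "#listen_addresses = 'localhost'" "#port = 5432" > "$conf"
set_param "$conf" listen_addresses "'*'"
cat "$conf"
rm -f "$conf"
```

After the call, the first line of the demo file is active and reads listen_addresses = '*', while the port line is still commented out.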

Listen configuration

The defaults are localhost and port 5432. To allow remote clients to connect, change listen_addresses to a specific IP address or to * (all network interfaces).

listen_addresses = '*'
# port = 5432

Number of database connections

When the database service starts, the amount of memory and the number of locks it allocates depend on the maximum number of connections, so this number cannot be unlimited. The default is 100, and you can change it to suit your needs. The larger the number, the more memory and system resources are used, so set it just high enough.

max_connections = 100

Database cache size

When data is selected or inserted, the database does not operate on the files on disk directly; it works on data in the cache, and only goes to disk when the data it needs is not cached. A larger cache therefore generally helps, and performance is best when the frequently used data fits in it.

The default is 128MB. The right value depends on the amount of data and the memory of the machine. If you are just experimenting, the default is enough; if you find that disk I/O is high, increase it appropriately.

shared_buffers = 128MB 
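
As a rough way to pick a value, the PostgreSQL documentation suggests 25% of system memory as a reasonable starting point on a dedicated database server with 1GB or more of RAM. The arithmetic can be sketched as below; suggest_shared_buffers_mb is a helper made up for this example, and keeping the 128MB default as a lower bound is an assumption of the sketch, not an official rule.

```shell
# suggest_shared_buffers_mb TOTAL_RAM_MB: print a starting value in MB.
suggest_shared_buffers_mb() {
    total_mb=$1
    suggestion=$((total_mb / 4))          # 25% of total memory
    # Assumption of this sketch: never go below the 128MB default.
    if [ "$suggestion" -lt 128 ]; then
        suggestion=128
    fi
    echo "$suggestion"
}

suggest_shared_buffers_mb 16384   # a machine with 16GB of RAM: prints 4096
```

For the 16GB machine in the example, this corresponds to shared_buffers = 4096MB in postgresql.conf. Treat the result as a starting point to be checked against the actual workload.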

Database operation log

Anyone who has debugged a program knows how important run logs are. Writing logs to files is turned off by default. It is recommended to turn it on, so that when something goes wrong the log files can be analyzed.

logging_collector = on 

Host access control

The database has strict access control, which can be as fine-grained as a particular database, user, or network segment. The format of a rule is as follows:

# local         DATABASE  USER  METHOD  [OPTIONS]
# host          DATABASE  USER  ADDRESS  METHOD  [OPTIONS]
# hostssl       DATABASE  USER  ADDRESS  METHOD  [OPTIONS]
# hostnossl     DATABASE  USER  ADDRESS  METHOD  [OPTIONS]
# hostgssenc    DATABASE  USER  ADDRESS  METHOD  [OPTIONS]
# hostnogssenc  DATABASE  USER  ADDRESS  METHOD  [OPTIONS]

To allow remote access, for example from a graphical client, you can add a line that lets all IP addresses access all databases while requiring password authentication. For remote clients to reach the server at all, listen_addresses must also be set as described above.

host    all             all             0.0.0.0/0            md5
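
Opening access to 0.0.0.0/0 is convenient for trying things out; where the clients are known, a narrower network segment is the safer choice. The sketch below appends such a rule only if it is not already present. add_hba_rule is a helper invented here, the 192.168.1.0/24 subnet is a made-up example, and the demo writes to a temporary file standing in for testdemo/pg_hba.conf.

```shell
# add_hba_rule FILE RULE: append RULE unless an identical line already exists.
add_hba_rule() {
    grep -Fxq -- "$2" "$1" || printf '%s\n' "$2" >> "$1"
}

hba=$(mktemp)   # stand-in for testdemo/pg_hba.conf
rule='host    all             all             192.168.1.0/24          md5'
add_hba_rule "$hba" "$rule"
add_hba_rule "$hba" "$rule"     # second call changes nothing
cat "$hba"
rm -f "$hba"

# On a real cluster, the running server re-reads pg_hba.conf on reload:
#   pg_ctl -D testdemo reload
```

Unlike the postgresql.conf parameters discussed earlier, changes to pg_hba.conf do not need a restart; a reload is enough.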

7. Summary

In this article I introduced PostgreSQL's data storage directory (the database cluster directory), showed how to initialize a database cluster, and covered some entry-level configuration parameters. I look forward to your feedback and participation.

end

Thank you very much for your support. Don’t forget to leave your valuable comments while browsing. If you think it is worthy of encouragement, please like and bookmark, I will work harder!

Author email: [email protected]
If there are any mistakes or omissions, please point them out and learn from each other.

Note: Do not reprint without consent!

Origin blog.csdn.net/senllang/article/details/132638951