ClickHouse Configuration File Instructions

This article introduces the configuration files of ClickHouse. ClickHouse configuration falls into two categories: server-side configuration and user-side configuration. Server-side configuration is generally placed in the config.xml file, and user-side configuration is generally placed in the users.xml file. It is also possible to put both in config.xml, but by convention they are kept in two separate files. The features described below apply equally to config.xml and users.xml, so the rest of this article does not distinguish between the two.

Multiple Configuration Files

For greater flexibility, ClickHouse also supports splitting the configuration across multiple files: the configuration for different functions can be placed in separate files, and their contents are merged to form the final configuration. The advantage is that configuration can be classified and managed by topic. For example, the configuration of the ClickHouse cluster topology can live in its own file (usually named clusters.xml), and the macro-related configuration can likewise live in its own file (usually named macros.xml).

To use multiple configuration files, you need to understand how ClickHouse loads them. The default main configuration path of ClickHouse is /etc/clickhouse-server/config.xml (you can specify a different path when starting ClickHouse Server with --config-file=/etc/config/config.xml). If the folder containing config.xml also contains a config.d directory, ClickHouse traverses all the files in that directory and merges their contents to produce the final configuration. These steps are executed every time ClickHouse is restarted.
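As a rough illustration of the lookup described above, the following Python sketch lists the files that would be merged for a given main configuration path. It only models the behaviour described in this article and is not ClickHouse's implementation. The function name merge_candidates is made up for this example, and deriving the directory name from the file name (config.xml -> config.d) is an assumption that matches the config.xml case.

```python
from pathlib import Path


def merge_candidates(main_config):
    """Return the files next to `main_config` that would be merged into it.

    Assumption of this sketch: the directory is named after the main file,
    so config.xml -> config.d (the case described in the article).
    """
    main = Path(main_config)
    conf_dir = main.parent / (main.stem + ".d")
    if not conf_dir.is_dir():
        return []  # no config.d directory: only the main file is used
    # Sort by file name so the order is stable from run to run.
    files = [p for p in conf_dir.iterdir() if p.is_file()]
    return sorted(files, key=lambda p: p.name)
```

Calling merge_candidates("/etc/clickhouse-server/config.xml") on a server would return the files under /etc/clickhouse-server/config.d, if that directory exists.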

How does ClickHouse handle different values set for the same configuration parameter? Let's take macros as an example. The experiment covers two cases: in the first, the macro is configured twice within config.xml; in the second, the macro is configured in two different configuration files.

Single file duplicate configuration

We add the following configuration to config.xml, then start ClickHouse Server and check the value of a.

<clickhouse>
    ......
    <macros>
        <a>1</a>
    </macros>
    <macros>
        <a>2</a>
    </macros>
</clickhouse>

Querying with the following statement shows that the value of a is 1.

SELECT * FROM system.macros

┌─macro─┬─substitution─┐
│ a     │ 1            │
└───────┴──────────────┘

This means that in the same configuration file, if the same configuration parameter is configured with different values, ClickHouse will use the value that appears first.

Multi-file duplicate configuration

Create two files, a.xml and b.xml, in the config.d directory, and configure the macro in both.

a.xml:
<clickhouse>
    <macros>
        <a>1</a>
    </macros>
</clickhouse>

b.xml:
<clickhouse>
    <macros>
        <a>2</a>
    </macros>
</clickhouse>

This time, the same statement shows that the value of a is 2.

SELECT * FROM system.macros

┌─macro─┬─substitution─┐
│ a     │ 2            │
└───────┴──────────────┘

Why does it take the value 2 instead of 1 this time? We can observe the logs of ClickHouse Server:

2023.01.07 11:26:51.092322 [ 25669095 ] {} <Debug> ConfigReloader: Loading config 'config.xml'
Processing configuration file 'config.xml'.
Merging configuration file 'config.d/a.xml'.
Merging configuration file 'config.d/b.xml'.

ClickHouse loads config.xml first, then traverses the config.d directory and loads all the configuration files in alphabetical order of file name. If different files contain the same configuration parameter, the value loaded later overwrites the earlier one. If you rename a.xml to c.xml, SELECT * FROM system.macros will return 1; you can try it yourself.
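The "later file wins" rule can be modelled in a few lines of Python. This is a simplified sketch of the behaviour observed in the experiment above, not ClickHouse's actual merge code; the helper names merge_into and load_merged are invented for the illustration, and the files are passed as in-memory strings.

```python
import xml.etree.ElementTree as ET


def merge_into(base, override):
    """Merge `override` into `base`: same-named sections are merged
    recursively, and a leaf value from `override` replaces the one in `base`."""
    for child in override:
        existing = base.find(child.tag)
        if existing is None:
            base.append(child)            # section only exists in the later file
        elif len(child) == 0:
            existing.text = child.text    # leaf value: the later file wins
        else:
            merge_into(existing, child)   # nested section: merge its children


def load_merged(main_xml, overrides):
    """`overrides` maps file name -> xml text; files are applied in name order."""
    root = ET.fromstring(main_xml)
    for name in sorted(overrides):
        merge_into(root, ET.fromstring(overrides[name]))
    return root


A_XML = "<clickhouse><macros><a>1</a></macros></clickhouse>"
B_XML = "<clickhouse><macros><a>2</a></macros></clickhouse>"

merged = load_merged("<clickhouse/>", {"a.xml": A_XML, "b.xml": B_XML})
print(merged.findtext("macros/a"))  # prints 2: b.xml is merged after a.xml
```

Passing the first file under the name c.xml instead of a.xml makes it sort after b.xml, so the same call would yield 1, which mirrors the renaming experiment.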

Summary

Although we now understand how ClickHouse loads configuration files, it is still best to avoid configuring the same parameter more than once.

Configuration replacement function

ClickHouse supports replacing configuration values with environment variables, xml sections, and zookeeper node values.

Use environment variable substitution

ClickHouse supports the from_env="xxx" attribute on an xml section to replace the current configuration value with the value of an environment variable. The usage is as follows:

<clickhouse>
    <macros>
        <replica from_env="REPLICA" />
    </macros>
</clickhouse>

The environment variable can be set with export REPLICA=0, after which SELECT * FROM system.macros WHERE macro = 'replica' returns 0. This is equivalent to configuring <replica>0</replica>.
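To make the substitution concrete, here is a minimal Python sketch that resolves from_env attributes against a dictionary of variables. It only mimics the effect described above. The function name apply_from_env is hypothetical, and treating a missing variable as an error is an assumption of the sketch rather than a statement about ClickHouse.

```python
import xml.etree.ElementTree as ET


def apply_from_env(root, env):
    """Fill every element carrying from_env="NAME" with env["NAME"]."""
    for elem in root.iter():
        name = elem.attrib.pop("from_env", None)
        if name is None:
            continue
        if name not in env:
            # Assumption of this sketch: a missing variable is an error.
            raise KeyError(f"environment variable {name!r} is not set")
        elem.text = env[name]
    return root


CONFIG = '<clickhouse><macros><replica from_env="REPLICA" /></macros></clickhouse>'
root = apply_from_env(ET.fromstring(CONFIG), {"REPLICA": "0"})
print(root.findtext("macros/replica"))  # prints 0
```

In a real script you would pass os.environ instead of the literal dictionary.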

Replace with xml section

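ClickHouse supports the incl="xxx" attribute on an xml section to name another xml section whose content replaces the current one. The named section is looked up in a separate substitutions file (/etc/metrika.xml by default; the path can be changed with the include_from setting). In the example below, the first block is the main configuration and the second block is the section that supplies the replacement: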

<clickhouse>
    <zookeeper incl="zookeeper-servers" optional="true">
        <node>
            <host>host1</host>
            <port>2181</port>
        </node>
    </zookeeper>  
</clickhouse>

<clickhouse>
    <zookeeper-servers>
        <node>
            <host>host2</host>
            <port>2182</port>
        </node>
    </zookeeper-servers>
</clickhouse>

The content of <zookeeper> will be replaced by the content of <zookeeper-servers>. This is equivalent to the following configuration:

<clickhouse>
    <zookeeper>
        <node>
            <host>host2</host>
            <port>2182</port>
        </node>
    </zookeeper>
</clickhouse>

The optional="true" attribute prevents an error from being reported when the specified xml section does not exist. If it is true and <zookeeper-servers> does not exist, the <zookeeper> configuration with host1 and port 2181 is used. In general, configuring optional="true" is not recommended: this is key configuration information, and if something is wrong it is better to get an error early than to run with the wrong configuration.
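The interaction between incl and optional can be sketched as follows. This Python model follows the behaviour as described in this article (the included section replaces the inline content, and optional="true" falls back to the inline content). It is not ClickHouse's implementation, and the name apply_incl and the LookupError are choices made for the example.

```python
import copy
import xml.etree.ElementTree as ET


def apply_incl(config_root, substitutions_root):
    """Replace the children of each incl="NAME" element with the children
    of the <NAME> section found in `substitutions_root`."""
    for elem in list(config_root.iter()):
        name = elem.attrib.pop("incl", None)
        if name is None:
            continue
        optional = elem.attrib.pop("optional", "false") == "true"
        source = substitutions_root.find(name)
        if source is None:
            if optional:
                continue  # keep the inline fallback (host1:2181 in the article)
            raise LookupError(f"include section {name!r} not found")
        for child in list(elem):
            elem.remove(child)
        elem.extend([copy.deepcopy(child) for child in source])
    return config_root
```

With the two blocks from the example above parsed by ET.fromstring, apply_incl(config, substitutions) leaves a single node with host2 and port 2182 under <zookeeper>. If <zookeeper-servers> is absent, the host1 node is kept because of optional="true"; without that attribute the sketch raises an error instead.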

Replace with zookeeper node value

ClickHouse supports the from_zk="xxx" attribute on an xml section to replace the current section with the value of a zookeeper node (the value must be in xml form). The usage is as follows:

<clickhouse>
    <remote_servers from_zk="/clickhouse/remote_servers">
        <default>
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>host1</host>
                    <port>9000</port>
                </replica>
                <replica>
                    <host>host2</host>
                    <port>9000</port>
                </replica>
            </shard>
        </default>
    </remote_servers>
</clickhouse>

<!-- The content of the zookeeper node /clickhouse/remote_servers is as follows -->
        <default>
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>host3</host>
                    <port>9000</port>
                </replica>
                <replica>
                    <host>host4</host>
                    <port>9000</port>
                </replica>
            </shard>
        </default>

In this way, the cluster configuration named default is replaced by the value of the /clickhouse/remote_servers node in zookeeper. Note that the indentation inside the zookeeper node value is carried into the configuration as-is. If that matters to you, keep the indentation in the zookeeper node.
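The same idea can be sketched in Python without a real zookeeper connection. In the sketch below, read_node is a hypothetical callable (path -> string) that stands in for a zookeeper client, and apply_from_zk is a name invented for the example. It models the replacement described above and is not ClickHouse's code.

```python
import xml.etree.ElementTree as ET


def apply_from_zk(root, read_node):
    """Replace the children of each from_zk="PATH" element with the parsed
    value of that node. `read_node` is a stand-in for a zookeeper client."""
    for elem in list(root.iter()):
        path = elem.attrib.pop("from_zk", None)
        if path is None:
            continue
        # The node value is an xml fragment that may contain several
        # top-level elements, so wrap it in a dummy root before parsing.
        fragment = ET.fromstring("<wrapper>" + read_node(path) + "</wrapper>")
        for child in list(elem):
            elem.remove(child)
        elem.text = fragment.text  # leading indentation is carried over as-is
        elem.extend(list(fragment))
    return root
```

For a quick local test, a dictionary can play the role of zookeeper: apply_from_zk(root, nodes.__getitem__), where nodes maps "/clickhouse/remote_servers" to the xml fragment shown above.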

Configuration supports yaml format

ClickHouse also supports configuration files in yaml format. For specific examples, refer to config.yaml.example. ClickHouse also supports mixing yaml and xml files, but you cannot use yaml and xml in the same file. This section does not go into much detail, because yaml is not very intuitive when expressing the attributes of an xml section (such as from_env="xxx"), and production environments generally still use xml as the default configuration format. It is enough to know that the option exists; using it is not recommended.




Origin blog.csdn.net/weixin_39992480/article/details/129208658