高效实施ClickHouse数据备份

本文介绍ClickHouse数据备份,包括手动实施,利用第三方工具(clickhouse-backup)实现。主要包括创建用户并授权,手动备份、自动备份。

创建用户

在进行备份之前,先确认用户是否拥有相应权限。下面首先看如何创建只读用户:

ClickHouse的系统设置策略(settings profile)是一组设置的集合。缺省提供了readonly策略用于查询,下面是/etc/clickhouse-server/users.xml中定义readonly的内容:

<yandex>
    <!-- Profiles of settings. -->
    <profiles>
        ...
        <!-- Profile that allows only read queries. -->
        <readonly>
            <readonly>1</readonly>
        </readonly>
    </profiles>
...

所以创建只读用户bkuser01,仅需要执行下面SQL:

CREATE USER bkuser01 IDENTIFIED WITH plaintext_password BY 'password0' SETTINGS PROFILE 'readonly';
GRANT SHOW TABLES, SELECT ON database.* TO bkuser01;
// GRANT ALL ON backup.* TO bkuser01 ; 全部权限

使用default用执行上述SQL ,可能会提供没有权限错误。因为默认default用户没有访问管理权限。

增加default管理权限

修改 /etc/clickhouse-server/users.xml , 去掉 default的访问权限注释:

<access_management>1</access_management>

无需重启ClickHouse服务,再次执行上面授权SQL,应该提示成功。后面自动备份示例需要用户拥有特定数据库全部权限。

手动实现

ClickHouse内置通过ALTER TABLE ... FREEZE特性支持瞬时备份功能。该方法在shadow目录为所有分区创建硬链接,我们知道硬链接只是增加原文件的别名,从而避免双倍占用空间。

执行手动备份之后,需要移动到其他位置,为了以后可以进行恢复:

  • 从其他位置恢复备份
  • 为每个表拷贝分区备份至detached目录
  • 使用ALTER TABLE ... ATTACH PARTITION语句恢复每个分区

自动备份

利用第三方工具clickhouse-backup实现自动备份。假设您已经安装了该工具并配置了远程存储。它提供了一些命令,可以轻松实现数据备份,官方文档内容参考:

NAME:
clickhouse-backup - Tool for easy backup of ClickHouse with cloud support

USAGE:
clickhouse-backup [-t, --tables=.

] <backup_name>

VERSION:
2.2.0

DESCRIPTION:
Run as ‘root’ or ‘clickhouse’ user

COMMANDS:
tables List list of tables, exclude skip_tables
create Create new backup
create_remote Create and upload
upload Upload backup to remote storage
list List list of backups
download Download backup from remote storage
restore Create schema and restore data from backup
restore_remote Download and restore
delete Delete specific backup
default-config List default config
print-config List current config
clean Remove data in ‘shadow’ folder from all path folders available from system.disks
clean_remote_broken Remove all broken remote backups
watch Run infinite loop which create full + incremental backup sequence to allow efficient backup sequences
server Run API server
help, h Shows a list of commands or help for one command

GLOBAL OPTIONS:
–config FILE, -c FILE Config FILE name. (default: “/etc/clickhouse-backup/config.yml”) [$CLICKHOUSE_BACKUP_CONFIG]
–help, -h show help
–version, -v print the version

为了演示,创建backup数据库、test表:

:) CREATE DATABASE IF NOT EXISTS backup

:) CREATE TABLE IF NOT EXISTS backup.test
(
  `date` Datetime,
  `user_id` String,
  `pageviews` Int32
)
ENGINE = MergeTree()
PARTITION BY toStartOfHour(date)
ORDER BY (date)

插入示例数据:

INSERT INTO backup.test VALUES
('2021-01-13 07:00:00', 'user 1', 7),
('2021-01-13 08:00:00', 'user 2', 3),
('2021-01-13 09:00:00', 'user 3', 1),
('2021-01-13 10:00:00', 'user 4', 12)

利用clickhouse-backup工具查看备份情况:

# clickhouse-backup list
Local backups:
no backups found
Remote backups:
no backups found

查看数据表:

# clickhouse-backup tables
backup.test

创建本地备份:

# clickhouse-backup create
2023/03/23 13:14:43 Create backup '2023-03-23T13-12-43'
2023/03/23 13:14:43 Freeze 'backup.test'
2023/03/23 13:14:43 Copy part hashes
2023/03/23 13:14:43 Writing part hashes
2023/03/23 13:14:43 Copy metadata
2023/03/23 13:14:43   Done.
2023/03/23 13:14:43 Move shadow
2023/03/23 13:14:43   Done.

# clickhouse-backup list
Local backups:
- '2023/03/23T13-12-43' (created at 23-03-2023 13:12:43)
Remote backups:
no backups found

更多内容参考:https://dev.to/jv/backup-and-restore-with-clickhouse-backup-dn9

clickhouse-backup地址:https://github.com/AlexAkulov/clickhouse-backup

猜你喜欢

转载自blog.csdn.net/neweastsun/article/details/129743245
今日推荐