Recovering a ClickHouse Cluster After ZooKeeper Data Loss

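A test server hosting our database recently lost power. The outage wiped out the ZooKeeper data, and as a result ClickHouse could no longer start. The error message was: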

Cannot create table from metadata file /var/lib/clickhouse/metadata/xx/info_xxx.sql, error: Coordination::Exception: Can't get data for node /clickhouse/tables/xx/cluster_xxx-01/info_xxxx/metadata: node doesn't exist (No node), stack trace:
0. /usr/bin/clickhouse-server(StackTrace::StackTrace()+0x16) [0x7108056]
1. /usr/bin/clickhouse-server(Coordination::Exception::Exception(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, int, int)+0x31) [0x6c20381]
2. /usr/bin/clickhouse-server(Coordination::Exception::Exception(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, int)+0x1a5) [0x6c21135]

After a round of troubleshooting that led nowhere, I finally had to fall back on the official guidance:

Create a MergeTree table with a different name. Move all the data from the directory with the ReplicatedMergeTree table data to the new table’s data directory. Then delete the ReplicatedMergeTree table and restart the server.

If you want to get rid of a ReplicatedMergeTree table without launching the server:

  • Delete the corresponding .sql file in the metadata directory (/var/lib/clickhouse/metadata/).
  • Delete the corresponding path in ZooKeeper (/path_to_table/replica_name).

After this, you can launch the server, create a MergeTree table, move the data to its directory, and then restart the server.
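To illustrate the "create a MergeTree table" step, here is a minimal Python sketch that derives a plain MergeTree definition from the text of the backed-up .sql metadata file. It rests on assumptions: the table uses the newer engine syntax in which ReplicatedMergeTree takes exactly two string arguments (the ZooKeeper path and the replica name), and the statement begins with ATTACH TABLE or CREATE TABLE followed by a bare table name. The function name to_plain_mergetree and the sample table are hypothetical and only serve the example.

```python
import re

# Matches only the two-argument form:
#   ReplicatedMergeTree('<zookeeper path>', '<replica name>')
# Old-style definitions with extra positional arguments are deliberately not handled.
_ENGINE_RE = re.compile(r"ReplicatedMergeTree\(\s*'[^']*'\s*,\s*'[^']*'\s*\)")

# Matches the statement head, e.g. "ATTACH TABLE info_local".
_HEAD_RE = re.compile(r"^\s*(?:ATTACH|CREATE)\s+TABLE\s+[^\s(]+", re.IGNORECASE)


def to_plain_mergetree(ddl: str, new_name: str) -> str:
    """Rewrite a ReplicatedMergeTree definition into a CREATE TABLE for a plain MergeTree table."""
    ddl, n_engine = _ENGINE_RE.subn("MergeTree()", ddl, count=1)
    if n_engine != 1:
        raise ValueError("no two-argument ReplicatedMergeTree(...) clause found")
    # A callable replacement avoids backslash interpretation in new_name.
    ddl, n_head = _HEAD_RE.subn(lambda m: f"CREATE TABLE {new_name}", ddl, count=1)
    if n_head != 1:
        raise ValueError("statement does not start with ATTACH TABLE or CREATE TABLE")
    return ddl


# Hypothetical metadata content, for illustration only.
sample = """ATTACH TABLE info_local
(
    id UInt64,
    ts DateTime
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/01/info_local', 'replica-01')
PARTITION BY toYYYYMM(ts)
ORDER BY id
"""

print(to_plain_mergetree(sample, "info_local_recovered"))
```

The rewritten statement keeps the column list, PARTITION BY and ORDER BY untouched, so the parts copied in later should match the new table's structure. Review the result by hand before running it against a real server.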

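The rough procedure is as follows (with the database shut down):

1. Back up the SQL files under /var/lib/clickhouse/metadata/, then delete them.
2. Back up the contents of /var/lib/clickhouse/data/, then delete them.
3. Start the database.
4. Create a MergeTree table with the same structure as the original table.
5. Copy the data directory of the original replicated table (from the backup taken in step 2) into the data directory of the new MergeTree table.
6. Restart the database.
7. Recreate the local table with its original structure.
8. Recreate the distributed table with its original structure.
9. INSERT INTO [distributed table] SELECT * FROM [MergeTree table]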
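
The file-system part of the procedure (steps 1, 2 and 5) can be sketched in Python roughly as follows. This is only an illustration under assumptions: the helper names stash and restore_parts are hypothetical, "back up, then delete" is implemented as a single move into a backup directory, and the detached subdirectory is skipped when copying parts back. Run something like this only while the server is stopped, and check file ownership afterwards, since the ClickHouse process must be able to read the copied files.

```python
import shutil
from pathlib import Path
from typing import List


def stash(src: Path, backup_root: Path) -> Path:
    """Move src (a file or directory) under backup_root.

    Moving backs the item up and removes the original in one step.
    """
    backup_root.mkdir(parents=True, exist_ok=True)
    dest = backup_root / src.name
    if dest.exists():
        raise FileExistsError(f"refusing to overwrite existing backup: {dest}")
    shutil.move(str(src), str(dest))
    return dest


def restore_parts(saved_table_dir: Path, new_table_dir: Path) -> List[str]:
    """Copy each part directory from the saved table into the new table's directory.

    Plain files (such as format_version.txt) and the 'detached' directory
    are skipped; the new table already has its own copies of those.
    """
    copied = []
    for part in sorted(saved_table_dir.iterdir()):
        if not part.is_dir() or part.name == "detached":
            continue
        shutil.copytree(part, new_table_dir / part.name)
        copied.append(part.name)
    return copied
```

With hypothetical names, stash(Path("/var/lib/clickhouse/metadata/mydb/info.sql"), Path("/backup/metadata/mydb")) would cover step 1 for one table, stash on the matching directory under /var/lib/clickhouse/data/ would cover step 2, and restore_parts(saved_dir, new_table_dir) would cover step 5 once the MergeTree table exists.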
