Notes on a Huawei Gauss database deployment incident: the pg_perf error


Incident description

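The Gauss database we use for testing suddenly stopped working. It is deployed with Docker, so after logging in to the server I ran docker ps -a|grep gauss and found that the container had already been in the Exited state for a long time.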
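The same check can be scripted. The following is only a minimal sketch: it assumes the docker CLI is on the PATH, and the function names and the "gauss" name filter are illustrative rather than anything from the original setup. It asks docker ps -a for one "name<TAB>status" line per container and picks out the ones whose status starts with "Exited".

```python
import subprocess

def parse_ps_lines(text):
    """Split 'name<TAB>status' lines into (name, status) tuples."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        name, _, status = line.partition("\t")
        rows.append((name.strip(), status.strip()))
    return rows

def exited_containers(text):
    """Return the names of containers whose status starts with 'Exited'."""
    return [name for name, status in parse_ps_lines(text)
            if status.startswith("Exited")]

def list_containers(name_filter="gauss"):
    """Run `docker ps -a` filtered by container name (needs the docker CLI)."""
    cmd = ["docker", "ps", "-a",
           "--filter", f"name={name_filter}",
           "--format", "{{.Names}}\t{{.Status}}"]
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout

# Usage on a machine with Docker installed:
#   print(exited_containers(list_containers("gauss")))
```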

Checking the logs

The key problem found after reading the logs:

2025-04-21 08:41:08.554 [unknown] [unknown] localhost 139652928928192 0[0:0#0]  0 [BACKEND] FATAL:  ERROR: could not create instr log directory "pg_perf": Permission denied

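In other words, after Docker started Gauss, the database did not have sufficient permissions on my host machine and could not create the pg_perf directory.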
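A rough way to test that diagnosis from the host is to compare the owner and mode bits of the mounted data directory with the user the database runs as. The sketch below is an approximation only: it looks at the classic owner/group/other bits and ignores ACLs, SELinux, supplementary groups, and user-namespace remapping. The function name is hypothetical, and the UID and GID of the database user inside the container are values you have to find out yourself; nothing in the log states them.

```python
import os
import stat

def can_create_in(path, uid, gid):
    """Approximate whether uid/gid may create an entry inside directory `path`.

    Creating a subdirectory needs write and search (execute) permission on
    the parent. Only classic mode bits are checked here.
    """
    st = os.stat(path)
    if uid == 0:
        return True  # root bypasses mode bits on ordinary filesystems
    if st.st_uid == uid:
        need = stat.S_IWUSR | stat.S_IXUSR
    elif st.st_gid == gid:
        need = stat.S_IWGRP | stat.S_IXGRP
    else:
        need = stat.S_IWOTH | stat.S_IXOTH
    return (st.st_mode & need) == need

# Usage (path, uid, and gid are placeholders, not values from this incident):
#   can_create_in("/path/to/mounted/data", uid=1000, gid=1000)
```

A True result does not prove the directory is usable, and as the resolution shows, fixing permissions alone was not enough in this case.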
The full log output was as follows:

openGauss Database directory appears to contain a database; Skipping initialization
0 LOG:  [Alarm Module]can not read GAUSS_WARNING_TYPE env.
0 LOG:  [Alarm Module]Host Name: a2aaf89b6607 
0 LOG:  [Alarm Module]Host IP: a2aaf89b6607. Copy hostname directly in case of taking 10s to use 'gethostbyname' when /etc/hosts does not contain <HOST IP>
0 LOG:  [Alarm Module]Get ENV GS_CLUSTER_NAME failed!
0 LOG:  [Alarm Module]Invalid data in AlarmItem file! Read alarm English name failed! line: 57
0 WARNING:  failed to open feature control file, please check whether it exists: FileName=gaussdb.version, Errno=2, Errmessage=No such file or directory.
0 WARNING:  failed to parse feature control file: gaussdb.version.
0 WARNING:  Failed to load the product control file, so gaussdb cannot distinguish product version.
2025-04-21 08:41:08.297 [unknown] [unknown] localhost 139652928928192 0[0:0#0]  0 [BACKEND] LOG:  when starting as multi_standby mode, we couldn't support data replicaton.
2025-04-21 08:41:08.297 [unknown] [unknown] localhost 139652928928192 0[0:0#0]  0 [BACKEND] LOG:  base_page_saved_interval is 400, ori is 400.
2025-04-21 08:41:08.304 [unknown] [unknown] localhost 139652928928192 0[0:0#0]  0 [BACKEND] LOG:  [Alarm Module]can not read GAUSS_WARNING_TYPE env.
2025-04-21 08:41:08.304 [unknown] [unknown] localhost 139652928928192 0[0:0#0]  0 [BACKEND] LOG:  [Alarm Module]Host Name: a2aaf89b6607 
2025-04-21 08:41:08.304 [unknown] [unknown] localhost 139652928928192 0[0:0#0]  0 [BACKEND] LOG:  [Alarm Module]Host IP: a2aaf89b6607. Copy hostname directly in case of taking 10s to use 'gethostbyname' when /etc/hosts does not contain <HOST IP>
2025-04-21 08:41:08.304 [unknown] [unknown] localhost 139652928928192 0[0:0#0]  0 [BACKEND] LOG:  [Alarm Module]Get ENV GS_CLUSTER_NAME failed!
2025-04-21 08:41:08.304 [unknown] [unknown] localhost 139652928928192 0[0:0#0]  0 [BACKEND] LOG:  [Alarm Module]Invalid data in AlarmItem file! Read alarm English name failed! line: 57
2025-04-21 08:41:08.306 [unknown] [unknown] localhost 139652928928192 0[0:0#0]  0 [BACKEND] LOG:  loaded library "security_plugin"
2025-04-21 08:41:08.306 [unknown] [unknown] localhost 139652928928192 0[0:0#0]  0 [BACKEND] WARNING:  could not create any HA TCP/IP sockets
2025-04-21 08:41:08.306 [unknown] [unknown] localhost 139652928928192 0[0:0#0]  0 [BACKEND] WARNING:  could not create any HA TCP/IP sockets
2025-04-21 08:41:08.308 [unknown] [unknown] localhost 139652928928192 0[0:0#0]  0 [BACKEND] LOG:  InitNuma numaNodeNum: 1 numa_distribute_mode: none inheritThreadPool: 0.
2025-04-21 08:41:08.308 [unknown] [unknown] localhost 139652928928192 0[0:0#0]  0 [BACKEND] LOG:  reserved memory for backend threads is: 220 MB
2025-04-21 08:41:08.308 [unknown] [unknown] localhost 139652928928192 0[0:0#0]  0 [BACKEND] LOG:  reserved memory for WAL buffers is: 128 MB
2025-04-21 08:41:08.308 [unknown] [unknown] localhost 139652928928192 0[0:0#0]  0 [BACKEND] LOG:  Set max backend reserve memory is: 348 MB, max dynamic memory is: 8139 MB
2025-04-21 08:41:08.308 [unknown] [unknown] localhost 139652928928192 0[0:0#0]  0 [BACKEND] LOG:  shared memory 3288 Mbytes, memory context 8487 Mbytes, max process memory 12288 Mbytes
2025-04-21 08:41:08.392 [unknown] [unknown] localhost 139652928928192 0[0:0#0]  0 [CACHE] LOG:  set data cache  size(402653184)
2025-04-21 08:41:08.453 [unknown] [unknown] localhost 139652928928192 0[0:0#0]  0 [SEGMENT_PAGE] LOG:  Segment-page constants: DF_MAP_SIZE: 8156, DF_MAP_BIT_CNT: 65248, DF_MAP_GROUP_EXTENTS: 4175872, IPBLOCK_SIZE: 8168, EXTENTS_PER_IPBLOCK: 1021, IPBLOCK_GROUP_SIZE: 4090, BMT_HEADER_LEVEL0_TOTAL_PAGES: 8323072, BktMapEntryNumberPerBlock: 2038, BktMapBlockNumber: 25, BktBitMaxMapCnt: 512
2025-04-21 08:41:08.502 [unknown] [unknown] localhost 139652928928192 0[0:0#0]  0 [BACKEND] LOG:  gaussdb: fsync file "/var/lib/opengauss/data/gaussdb.state.temp" success
2025-04-21 08:41:08.502 [unknown] [unknown] localhost 139652928928192 0[0:0#0]  0 [BACKEND] LOG:  create gaussdb state file success: db state(STARTING_STATE), server mode(Normal), connection index(1)
2025-04-21 08:41:08.514 [unknown] [unknown] localhost 139652928928192 0[0:0#0]  0 [BACKEND] LOG:  max_safe_fds = 982, usable_fds = 1000, already_open = 8
2025-04-21 08:41:08.520 [unknown] [unknown] localhost 139652928928192 0[0:0#0]  0 [BACKEND] LOG:  the configure file /usr/local/opengauss/etc/gscgroup_omm.cfg doesn't exist or the size of configure file has changed. Please create it by root user!
2025-04-21 08:41:08.520 [unknown] [unknown] localhost 139652928928192 0[0:0#0]  0 [BACKEND] LOG:  Failed to parse cgroup config file.
2025-04-21 08:41:08.554 [unknown] [unknown] localhost 139652928928192 0[0:0#0]  0 [EXECUTOR] WARNING:  Failed to obtain environment value $GAUSSLOG!
2025-04-21 08:41:08.554 [unknown] [unknown] localhost 139652928928192 0[0:0#0]  0 [EXECUTOR] DETAIL:  N/A
2025-04-21 08:41:08.554 [unknown] [unknown] localhost 139652928928192 0[0:0#0]  0 [EXECUTOR] CAUSE:  Incorrect environment value.
2025-04-21 08:41:08.554 [unknown] [unknown] localhost 139652928928192 0[0:0#0]  0 [EXECUTOR] ACTION:  Please refer to backend log for more details.
2025-04-21 08:41:08.554 [unknown] [unknown] localhost 139652928928192 0[0:0#0]  0 [BACKEND] FATAL:  ERROR: could not create instr log directory "pg_perf": Permission denied
2025-04-21 08:41:08.594 [unknown] [unknown] localhost 139652928928192 0[0:0#0]  0 [BACKEND] LOG:  FiniNuma allocIndex: 0.
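In a dump this long the one line that matters is easy to miss. As a minimal sketch, the snippet below pulls out only the FATAL entries. The regular expression is an assumption inferred from the timestamped lines above ("date time ... [MODULE] LEVEL:  message"); it is not an official description of the openGauss log format, and lines without a timestamp are simply skipped.

```python
import re

# Assumed line shape, inferred from the log above:
#   "<date> <time> ... [MODULE] LEVEL:  message"
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+) .*?"
    r"\[(?P<module>[A-Z_]+)\] (?P<level>[A-Z]+):\s+(?P<msg>.*)$"
)

def find_entries(log_text, levels=("FATAL",)):
    """Return (timestamp, module, level, message) for lines at the given levels."""
    hits = []
    for line in log_text.splitlines():
        m = LINE_RE.match(line)
        if m and m.group("level") in levels:
            hits.append((m.group("ts"), m.group("module"),
                         m.group("level"), m.group("msg")))
    return hits

# Usage: feed it the captured container log, for example text saved from
# `docker logs <container>`:
#   for entry in find_entries(open("gauss.log").read()):
#       print(entry)
```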

Resolution

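[Failed] Approach 1: adjust the permissions of the deployment directory and the mounted directory. Based on the log, I changed the permissions of the data directory mounted into the database container, and also the permissions of the whole location where the service is deployed. It still failed.
[Failed] Approach 2: delete the existing database and redeploy. No matter how many times I re-ran the deployment, it failed.
[Succeeded] Approach 3: delete the existing service and the corresponding image, pull the image again, and redeploy. The system started and ran successfully.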
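Approach 3 comes down to three docker commands before the redeploy. The sketch below only builds those commands so they can be reviewed before anything is deleted. The container name and image tag are placeholders, not the ones used in this incident, and the final docker run is left out because the original run options are not shown in this post.

```python
import shlex

def rebuild_commands(container, image):
    """Build the docker commands for approach 3, in order."""
    return [
        ["docker", "rm", "-f", container],  # remove the old container
        ["docker", "rmi", image],           # remove the local copy of the image
        ["docker", "pull", image],          # download the image again
    ]

def as_shell(cmds):
    """Render each argv list as a copy-pasteable shell line."""
    return [shlex.join(cmd) for cmd in cmds]

# Usage (placeholder names):
#   for line in as_shell(rebuild_commands("my-gauss", "example/opengauss:latest")):
#       print(line)
```

Note that docker rm -f does not delete a bind-mounted data directory on the host, so whether the old data is reused depends on how the container is started afterwards.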

Notice: 一代明君的小屋 | All rights reserved | Original content unless otherwise noted | This site is licensed under BY-NC-SA
