LDAP cluster monitoring
The LDAP cluster supports dual-master and master-slave modes; we currently run it in master-slave mode.
Monitoring indicators:
- Master-slave service survival
- Master-slave synchronization monitoring
Monitoring ideas
- Zabbix: a monitoring script reports service status and master-slave replication status
- Prometheus: an open-source exporter collects the working status of LDAP [third-party client software; it exposes many metrics but is heavier than we need]
Realization of monitoring
A monitoring script reports the liveness of the master and slave LDAP services and the master-slave synchronization state to Zabbix, which covers both availability and replication monitoring.
Reference: https://github.com/MrCirca/OpenLDAP-Cluster-Zabbix
Service Survival Monitoring Implementation
Principle
The LDAP service listens on port 389. From the slave host we query both the local slave instance and the remote master with ldapsearch. If the query succeeds (exit status 0), the service is considered healthy.
## Query the local slave instance
ldapsearch -Q -LLL -Y EXTERNAL -H ldapi:/// -s base -b "$BASE_DN" contextCSN 2> /dev/null
echo $?
## A non-zero exit status means the slave instance is unhealthy
## Query the master instance from the slave host
ldapsearch -LLL -x -H ldap://"$PROVIDER_URI" -s base -b "$BASE_DN" contextCSN 2> /dev/null
echo $?
## A non-zero exit status means the master instance is unhealthy
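The two checks above can be wrapped in one small helper. This is a minimal sketch, not part of the referenced scripts: the function name ldap_alive, the LDAPSEARCH override variable, and the 5-second network timeout are assumptions for illustration, and it assumes an anonymous simple bind (-x) is allowed to read the base entry.

```shell
#!/bin/bash
# Hypothetical helper; the names here are illustrative, not from the original scripts.
# LDAPSEARCH may be overridden (for example with a stub) to exercise the logic
# without a running LDAP server.
LDAPSEARCH=${LDAPSEARCH:-ldapsearch}

# ldap_alive URI BASE_DN
# Returns 0 when a base-scope search against URI succeeds, non-zero otherwise.
ldap_alive() {
    local uri=$1 base_dn=$2
    # -o nettimeout=5 keeps the check from hanging when the host is unreachable.
    "$LDAPSEARCH" -LLL -x -o nettimeout=5 -H "$uri" -s base -b "$base_dn" contextCSN > /dev/null 2>&1
}
```

It could then be called as ldap_alive "ldap://$PROVIDER_URI" "$BASE_DN" for the master, with the exit status used in the same way as echo $? above.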
Realization of master-slave synchronization monitoring
Principle
Each LDAP instance keeps a contextCSN value on its suffix entry. The value is updated on every write and carried to the slave by replication. When the master instance and the slave instance report the same contextCSN, master-slave synchronization is considered healthy.
# Get the contextCSN value of the slave instance
[root@hn-nameserver02-2-205 ~]# ldapsearch -LLL -x -s base -b "dc=local,dc=cn" contextCSN
dn: dc=local,dc=cn
contextCSN: 20190514075500.340075Z#000000#000#000000
# Get the contextCSN value of the master instance
[root@hn-nameserver02-2-205 ~]# ldapsearch -LLL -x -H ldap://172.16.2.204 -s base -b "dc=local,dc=cn" contextCSN
dn: dc=local,dc=cn
contextCSN: 20190514075500.340075Z#000000#000#000000
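The comparison of the two outputs above can be sketched as a pair of shell functions that work on the LDIF text alone. The function names extract_csn and csn_in_sync are hypothetical. The sketch assumes the ldapsearch output has the shape shown above, and it sorts the values so that an entry carrying several contextCSN values still compares correctly.

```shell
#!/bin/bash
# Hypothetical helpers; names are illustrative only.

# Read LDIF on stdin and print every contextCSN value, one per line, sorted.
extract_csn() {
    awk '$1 == "contextCSN:" { print $2 }' | sort
}

# csn_in_sync PROVIDER_LDIF CONSUMER_LDIF
# Returns 0 only when both inputs contain at least one contextCSN value
# and the two sets of values are identical.
csn_in_sync() {
    local provider_csn consumer_csn
    provider_csn=$(printf '%s\n' "$1" | extract_csn)
    consumer_csn=$(printf '%s\n' "$2" | extract_csn)
    [ -n "$provider_csn" ] && [ "$provider_csn" = "$consumer_csn" ]
}
```

Requiring a non-empty value prevents two failed queries (both empty) from being reported as in sync.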
Monitoring script information:
Status check script
[root@hn-nameserver02-2-205 ~]# cat /etc/zabbix/external_scripts/ldap_check_status.sh
#!/bin/bash
# Usage: ldap_check_status.sh <master host or IP>
# Prints: 1 = slave down, 2 = master down, 3 = both down, 4 = in sync, 100 = out of sync
# Read the suffix (base DN) from the local cn=config database.
BASE_DN=$(ldapsearch -Q -LLL -Y EXTERNAL -H ldapi:/// -b cn=config 2> /dev/null | grep "olcSuffix:" | cut -d " " -f 2)
PROVIDER_URI=$1
# Query contextCSN on the local consumer (slave) and on the provider (master).
LDAP_CSN_CONSUMER_COMMAND=$(ldapsearch -Q -LLL -Y EXTERNAL -H ldapi:/// -s base -b "$BASE_DN" contextCSN 2> /dev/null)
LDAP_CSN_CONSUMER_RC=$?
LDAP_CSN_PROVIDER_COMMAND=$(ldapsearch -LLL -x -H ldap://"$PROVIDER_URI" -s base -b "$BASE_DN" contextCSN 2> /dev/null)
LDAP_CSN_PROVIDER_RC=$?
# Quote the variables so the LDIF keeps its line breaks. Unquoted, the output
# collapses to a single line and cut returns the base DN instead of the CSN.
PROVIDER_CSN=$(echo "$LDAP_CSN_PROVIDER_COMMAND" | grep "^contextCSN:" | cut -d " " -f 2)
CONSUMER_CSN=$(echo "$LDAP_CSN_CONSUMER_COMMAND" | grep "^contextCSN:" | cut -d " " -f 2)
if [[ "$LDAP_CSN_CONSUMER_RC" != "0" ]] && [[ "$LDAP_CSN_PROVIDER_RC" == "0" ]]; then
    echo "1"
elif [[ "$LDAP_CSN_PROVIDER_RC" != "0" ]] && [[ "$LDAP_CSN_CONSUMER_RC" == "0" ]]; then
    echo "2"
elif [[ "$LDAP_CSN_PROVIDER_RC" != "0" ]] && [[ "$LDAP_CSN_CONSUMER_RC" != "0" ]]; then
    echo "3"
elif [[ "$PROVIDER_CSN" == "$CONSUMER_CSN" ]]; then
    echo "4"
else
    echo "100"
fi
Zabbix custom monitoring item
[root@hn-nameserver02-2-205 ~]# tail -n1 /etc/zabbix/zabbix_agentd.conf
UserParameter=ldap.clusterstatus[*],/etc/zabbix/external_scripts/ldap_check_status.sh $1
Monitoring instructions
- Run the check on the slave node and pass the master's IP address as the parameter of the monitoring item
- When the status check script returns 1, the slave instance is unavailable
- When the status check script returns 2, the master instance is unavailable
- When the status check script returns 3, both the master and the slave instances are unavailable
- When the status check script returns 4, the master and slave are in sync
- When the status check script returns 100, the master and slave are out of sync
PS: It is recommended to import the Zabbix template and change the master instance value in it to match your environment
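The numeric codes printed by the status check script can be mapped to readable messages, for example in a wrapper or an alert script. A minimal sketch: the function name describe_status and the message wording are assumptions, while the code-to-condition mapping follows the branches of ldap_check_status.sh shown earlier.

```shell
#!/bin/bash
# Hypothetical helper; the function name and messages are illustrative.
# describe_status CODE
# Prints a human-readable description of a code emitted by ldap_check_status.sh.
describe_status() {
    case "$1" in
        1)   echo "slave (consumer) unavailable" ;;
        2)   echo "master (provider) unavailable" ;;
        3)   echo "master and slave both unavailable" ;;
        4)   echo "master and slave in sync" ;;
        100) echo "master and slave out of sync" ;;
        *)   echo "unknown status: $1" ;;
    esac
}
```

With this mapping, a Zabbix trigger would treat 4 as the only healthy value and alert on anything else.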