2025/12/30 3:41:07
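Preface

In IT operations, the monitoring system is core infrastructure for keeping services running reliably. Zabbix is a powerful open-source monitoring solution that covers servers, network devices, applications, and more. This article walks through building an enterprise-grade Zabbix monitoring platform from scratch.

1. Why Choose Zabbix

1.1 Zabbix vs. Other Monitoring Solutions

Feature | Zabbix | Prometheus | Nagios
--- | --- | --- | ---
Data storage | Relational database | Time-series database | Files
Configuration | Web UI | Config files | Config files
Auto-discovery | ✅ Strong | ⭕ Needs add-ons | ❌
Distributed | ✅ Proxy | ⭕ Federation | ❌
Alert notifications | ✅ Rich | ⭕ Alertmanager | ✅
Learning curve | Moderate | Relatively steep | Relatively gentle
Best suited for | Traditional operations | Cloud native | Small deployments

1.2 Zabbix Architecture

                          ┌─────────────────┐
                          │  Zabbix Server  │
                          │     (core)      │
                          └────────┬────────┘
                                   │
             ┌─────────────────────┼─────────────────────┐
             │                     │                     │
    ┌────────┴────────┐   ┌────────┴────────┐   ┌────────┴────────┐
    │  Zabbix Agent   │   │  Zabbix Proxy   │   │  SNMP devices   │
    │(active/passive) │   │  (distributed)  │   │                 │
    └─────────────────┘   └─────────────────┘   └─────────────────┘

2. Quick Deployment with Docker Compose

2.1 Complete Deployment Configuration

    # docker-compose.yml
    version: "3.8"

    services:
      mysql:
        image: mysql:8.0
        container_name: zabbix-mysql
        command:
          - mysqld
          - --character-set-server=utf8mb4
          - --collation-server=utf8mb4_bin
          - --default-authentication-plugin=mysql_native_password
        environment:
          MYSQL_DATABASE: zabbix
          MYSQL_USER: zabbix
          MYSQL_PASSWORD: zabbix_password
          MYSQL_ROOT_PASSWORD: root_password
        volumes:
          - zabbix_mysql_data:/var/lib/mysql
        networks:
          - zabbix-net
        restart: unless-stopped

      zabbix-server:
        image: zabbix/zabbix-server-mysql:ubuntu-7.0-latest
        container_name: zabbix-server
        environment:
          DB_SERVER_HOST: mysql
          MYSQL_DATABASE: zabbix
          MYSQL_USER: zabbix
          MYSQL_PASSWORD: zabbix_password
          MYSQL_ROOT_PASSWORD: root_password
          ZBX_CACHESIZE: 128M
          ZBX_STARTPOLLERS: 10
          ZBX_STARTPINGERS: 5
        ports:
          - "10051:10051"
        volumes:
          - zabbix_server_alertscripts:/usr/lib/zabbix/alertscripts
          - zabbix_server_externalscripts:/usr/lib/zabbix/externalscripts
        networks:
          - zabbix-net
        depends_on:
          - mysql
        restart: unless-stopped

      zabbix-web:
        image: zabbix/zabbix-web-nginx-mysql:ubuntu-7.0-latest
        container_name: zabbix-web
        environment:
          ZBX_SERVER_HOST: zabbix-server
          DB_SERVER_HOST: mysql
          MYSQL_DATABASE: zabbix
          MYSQL_USER: zabbix
          MYSQL_PASSWORD: zabbix_password
          PHP_TZ: Asia/Shanghai
        ports:
          - "8080:8080"
        networks:
          - zabbix-net
        depends_on:
          - mysql
          - zabbix-server
        restart: unless-stopped

      zabbix-agent:
        image: zabbix/zabbix-agent:ubuntu-7.0-latest
        container_name: zabbix-agent
        environment:
          ZBX_HOSTNAME: Zabbix server
          ZBX_SERVER_HOST: zabbix-server
          ZBX_SERVER_PORT: 10051
        ports:
          - "10050:10050"
        networks:
          - zabbix-net
        depends_on:
          - zabbix-server
        restart: unless-stopped

    volumes:
      zabbix_mysql_data:
      zabbix_server_alertscripts:
      zabbix_server_externalscripts:

    networks:
      zabbix-net:
        driver: bridge

2.2 Start and Access

    # Start the services
    docker-compose up -d

    # Follow the server log
    docker-compose logs -f zabbix-server

    # Open the web UI
    # http://localhost:8080
    # Default credentials: Admin / zabbix

3. Agent Deployment and Configuration

3.1 Installing the Linux Agent

    # Ubuntu/Debian
    wget https://repo.zabbix.com/zabbix/7.0/ubuntu/pool/main/z/zabbix-release/zabbix-release_7.0-1+ubuntu$(lsb_release -rs)_all.deb
    dpkg -i zabbix-release_7.0-1+ubuntu$(lsb_release -rs)_all.deb
    apt update
    apt install zabbix-agent2

    # CentOS/RHEL
    rpm -Uvh https://repo.zabbix.com/zabbix/7.0/rhel/8/x86_64/zabbix-release-7.0-1.el8.noarch.rpm
    dnf install zabbix-agent2

3.2 Agent Configuration

    # /etc/zabbix/zabbix_agent2.conf

    # Zabbix server address (passive checks)
    Server=192.168.1.100

    # Server address for active checks
    ServerActive=192.168.1.100

    # Agent host name (must match the host name configured in the Zabbix web UI)
    Hostname=web-server-01

    # Log level
    DebugLevel=3

    # Allow remote commands (enable with caution).
    # Since Zabbix 5.0 this is controlled with AllowKey/DenyKey;
    # the older EnableRemoteCommands=1 is deprecated.
    # AllowKey=system.run[*]

    # Custom items
    UserParameter=custom.cpu.count,nproc
    UserParameter=custom.memory.available,free -m | awk '/Mem:/ {print $7}'

3.3 Starting the Agent

    systemctl enable zabbix-agent2
    systemctl start zabbix-agent2
    systemctl status zabbix-agent2

3.4 Windows Agent

    # Download and install the Windows agent MSI package
    # https://www.zabbix.com/download_agents

    # Configuration file location
    # C:\Program Files\Zabbix Agent 2\zabbix_agent2.conf

    # Service management
    net start "Zabbix Agent 2"

4. Monitoring Configuration in Practice

4.1 Adding a Host

In Zabbix 7.0 the host list lives under Data collection; in versions before 6.4 the same menu is named Configuration.

    Data collection → Hosts → Create host

    Host name: web-server-01
    Visible name: Web Server 01
    Groups: Linux servers
    Interfaces:
      - Agent: 192.168.1.10:10050
    Templates:
      - Linux by Zabbix agent
      - Nginx by Zabbix agent

4.2 Custom Items

In flexible user parameters (keys ending in [*]), Zabbix replaces $1 to $9 with the item key parameters before the command runs, so awk field references must be written as $$1, $$3, and so on.

    # Active Nginx connections
    UserParameter=nginx.connections.active,curl -s http://localhost/nginx_status | awk '/Active/ {print $3}'

    # Number of running Docker containers (the zabbix user needs access to the Docker socket)
    UserParameter=docker.containers.running,docker ps -q | wc -l

    # Disk I/O for device $1: take the device row of the second iostat report.
    # Matching on the device name avoids picking up the blank line that ends the iostat output.
    UserParameter=disk.io.read[*],iostat -d $1 1 2 | awk '$$1 == "$1" {v = $$3} END {print v}'
    UserParameter=disk.io.write[*],iostat -d $1 1 2 | awk '$$1 == "$1" {v = $$4} END {print v}'

    # Restart the agent to apply the changes
    systemctl restart zabbix-agent2

4.3 Auto-discovery Rules

    # Discover disk partitions
    Key: vfs.fs.discovery
    Filter: {#FSTYPE} matches (ext4|xfs|btrfs)

    # Discover Docker containers
    Key: docker.containers.discovery    (requires a custom script)

Note that Zabbix agent 2 includes a Docker plugin that already provides a docker.containers.discovery key, so the custom script is mainly relevant when using the classic agent or an external check.

Discovery script example:

    #!/usr/bin/env python3
    # /usr/lib/zabbix/externalscripts/docker_discovery.py
    import json
    import subprocess


    def discover_containers():
        # List the names of running containers, one per line
        result = subprocess.run(
            ["docker", "ps", "--format", "{{.Names}}"],
            capture_output=True,
            text=True,
        )
        containers = result.stdout.strip().split("\n")
        data = []
        for container in containers:
            if container:
                # Each discovered entity exposes the {#CONTAINER} LLD macro
                data.append({"{#CONTAINER}": container})
        return json.dumps({"data": data})


    if __name__ == "__main__":
        print(discover_containers())

5. Alerting

5.1 Trigger Expressions

The item keys used in a trigger must exist on the host. vm.memory.utilization comes from the Linux by Zabbix agent template, and vfs.fs.size[/,pused] is the agent's built-in key for the percentage of used space.

    # CPU utilization above 80% for 5 minutes
    avg(/web-server-01/system.cpu.util,5m)>80

    # Memory utilization above 90%
    last(/web-server-01/vm.memory.utilization)>90

    # Less than 10% free space on the root filesystem
    last(/web-server-01/vfs.fs.size[/,pused])>90

    # Agent unreachable
    nodata(/web-server-01/agent.ping,5m)=1

    # Nginx process not running
    last(/web-server-01/proc.num[nginx])=0

5.2 Email Alerts

    # Install a mail client
    apt install mailutils

    # Create the alert script (the quoted 'EOF' keeps $1..$3 from being expanded now)
    cat > /usr/lib/zabbix/alertscripts/send_email.sh << 'EOF'
    #!/bin/bash
    TO="$1"
    SUBJECT="$2"
    BODY="$3"
    echo "$BODY" | mail -s "$SUBJECT" "$TO"
    EOF

    chmod +x /usr/lib/zabbix/alertscripts/send_email.sh

Web configuration (in Zabbix 7.0 media types are under Alerts; before 6.4 they are under Administration):

    Alerts → Media types → Create media type

    Name: Email
    Type: Script
    Script name: send_email.sh
    Script parameters:
      {ALERT.SENDTO}
      {ALERT.SUBJECT}
      {ALERT.MESSAGE}

5.3 WeCom (企业微信) Alerts

    #!/usr/bin/env python3
    # /usr/lib/zabbix/alertscripts/wechat.py
    import sys

    import requests

    # WeCom settings
    CORPID = "your_corpid"
    SECRET = "your_secret"
    AGENTID = 1000002


    def get_token():
        url = (
            "https://qyapi.weixin.qq.com/cgi-bin/gettoken"
            f"?corpid={CORPID}&corpsecret={SECRET}"
        )
        resp = requests.get(url, timeout=10)
        return resp.json()["access_token"]


    def send_message(user, subject, message):
        token = get_token()
        url = f"https://qyapi.weixin.qq.com/cgi-bin/message/send?access_token={token}"
        data = {
            "touser": user,
            "msgtype": "text",
            "agentid": AGENTID,
            "text": {"content": f"【{subject}】\n{message}"},
        }
        resp = requests.post(url, json=data, timeout=10)
        return resp.json()


    if __name__ == "__main__":
        user = sys.argv[1]
        subject = sys.argv[2]
        message = sys.argv[3]
        result = send_message(user, subject, message)
        print(result)

5.4 Alert Escalation

    Alerts → Actions → Trigger actions → Create action

    Conditions:
      - Trigger severity >= High
    Operations:
      Step 1 (0-30min): notify the first-line on-call engineer
      Step 2 (30-60min): notify the second-line owner
      Step 3 (60min+): notify the technical director, plus a phone alert

6. Distributed Monitoring with Proxies

6.1 Proxy Deployment

    # docker-compose-proxy.yml
    version: "3.8"

    services:
      zabbix-proxy:
        image: zabbix/zabbix-proxy-sqlite3:ubuntu-7.0-latest
        container_name: zabbix-proxy
        environment:
          ZBX_PROXYMODE: 0                 # 0 = active mode
          ZBX_HOSTNAME: proxy-beijing
          ZBX_SERVER_HOST: 192.168.1.100   # Zabbix server address
          ZBX_CONFIGFREQUENCY: 300
        ports:
          - "10051:10051"
        volumes:
          - zabbix_proxy_data:/var/lib/zabbix
        restart: unless-stopped

    volumes:
      zabbix_proxy_data:

6.2 Multi-site Monitoring Architecture

               ┌─────────────────────────────────────┐
               │          HQ Zabbix Server           │
               │            192.168.1.100            │
               └──────────────────┬──────────────────┘
                                  │
            ┌─────────────────────┼─────────────────────┐
            │                     │                     │
    ┌───────┴───────┐     ┌───────┴───────┐     ┌───────┴───────┐
    │ Beijing Proxy │     │Shanghai Proxy │     │Guangzhou Proxy│
    │  10.1.0.100   │     │  10.2.0.100   │     │  10.3.0.100   │
    └───────┬───────┘     └───────┬───────┘     └───────┬───────┘
            │                     │                     │
     [Beijing Agent]      [Shanghai Agent]      [Guangzhou Agent]

6.3 Proxy Communication Across Networks

When a branch office has no direct network path to headquarters, the traditional options are:

- Opening ports to the public internet (a security risk)
- A VPN tunnel (complex to configure)

A simpler option is overlay-networking software (for example 星空组网) that joins all sites into one virtual LAN, so each proxy can reach the server directly:

    # Proxy configuration
    # Virtual LAN IP of the server
    ZBX_SERVER_HOST=10.26.0.1

    # All sites sit in the same virtual network:
    # no public IP and no port forwarding required

7. Performance Tuning

7.1 Server Configuration Tuning

    # /etc/zabbix/zabbix_server.conf

    # Caches
    CacheSize=256M
    HistoryCacheSize=64M
    HistoryIndexCacheSize=16M
    TrendCacheSize=16M
    ValueCacheSize=64M

    # Processes
    StartPollers=20
    StartPollersUnreachable=5
    StartPingers=10
    StartDiscoverers=5
    StartHTTPPollers=5

    # Database
    DBHost=localhost
    # Use the Unix socket, which is faster than TCP
    DBSocket=/var/run/mysqld/mysqld.sock

7.2 Database Optimization

    -- Partition the history table by time range.
    -- One partition is shown: p202512 holds everything before 2026-01-01.
    ALTER TABLE history PARTITION BY RANGE (clock) (
        PARTITION p202512 VALUES LESS THAN (UNIX_TIMESTAMP('2026-01-01 00:00:00'))
    );

    -- Periodically purge old data
    DELETE FROM history WHERE clock < UNIX_TIMESTAMP(NOW() - INTERVAL 90 DAY);

    -- Reclaim space and refresh statistics
    OPTIMIZE TABLE history;
    ANALYZE TABLE history;

7.3 Item Optimization

Set sensible collection intervals:

- Key metrics: 30s-1m
- Ordinary metrics: 5m
- Trend data: 30m

Avoid over-monitoring:

- Keep each host to roughly 100-200 items
- Use active mode to reduce load on the server
- Use low-level discovery rules judiciously

8. Common Monitoring Templates

8.1 MySQL Monitoring

    # Agent configuration
    # Match the variable name exactly ($$2) so that, for example, Uptime
    # does not also match Uptime_since_flush_status.
    UserParameter=mysql.status[*],mysqladmin -umonitor -ppassword extended-status 2>/dev/null | awk '$$2 == "$1" {print $$4}'
    UserParameter=mysql.ping,mysqladmin -umonitor -ppassword ping 2>/dev/null | grep -c alive
    # Field 3 holds the version for MySQL 8.0 clients ("mysql  Ver 8.0.x ...");
    # MySQL 5.7 clients print it in field 5.
    UserParameter=mysql.version,mysql -V 2>/dev/null | awk '{print $3}'

8.2 Redis Monitoring

    # Agent configuration
    UserParameter=redis.info[*],redis-cli -h localhost -p 6379 INFO $1 2>/dev/null | grep -w "$2" | cut -d: -f2
    UserParameter=redis.ping,redis-cli -h localhost ping 2>/dev/null | grep -c PONG

8.3 Nginx Monitoring

    # nginx.conf - add a status page
    server {
        location /nginx_status {
            stub_status on;
            access_log off;
            allow 127.0.0.1;
            deny all;
        }
    }

    # Agent configuration
    UserParameter=nginx.active,curl -s http://127.0.0.1/nginx_status | awk '/Active/ {print $3}'
    UserParameter=nginx.accepts,curl -s http://127.0.0.1/nginx_status | awk 'NR==3 {print $1}'
    UserParameter=nginx.handled,curl -s http://127.0.0.1/nginx_status | awk 'NR==3 {print $2}'
    UserParameter=nginx.requests,curl -s http://127.0.0.1/nginx_status | awk 'NR==3 {print $3}'

9. Dashboard Customization

9.1 Creating a Dashboard

    Dashboards → Create dashboard

    Add widgets:
      - System overview: Problems by severity
      - Top hosts: Top hosts by CPU/Memory
      - Trend chart: Graph
      - Geographic distribution: Geomap
      - Service status: Service status

9.2 SLA Reports

    Services → Create service

    Service name: Production Web Service
    SLA: 99.9%
    Status calculation: Most critical of child services
    Children:
      - Web Server 01
      - Web Server 02
      - Database Master

In Zabbix 6.0 and later, the SLA target itself is defined separately under Services → SLA and attached to services through tags.

10. Summary

Zabbix is a full-featured enterprise monitoring platform. This article covered:

Module | Content
--- | ---
Deployment | Quick setup with Docker Compose
Agent | Linux/Windows deployment and configuration
Items | System monitoring, custom items, auto-discovery
Alerting | Email, WeCom, alert escalation
Distributed | Proxy deployment, multi-site monitoring
Optimization | Server tuning, database optimization

Operations advice:

- Start small and expand the monitoring scope step by step
- Set alert thresholds sensibly to avoid alert fatigue
- Review items regularly and remove the ones nobody uses
- Archive and purge historical data

References:

- Zabbix official documentation: https://www.zabbix.com/documentation/7.0/zh
- Zabbix best practices: https://www.zabbix.com/documentation/7.0/zh/manual/installation/requirements/best_practices
- Zabbix template marketplace: https://www.zabbix.com/integrations

This article was first published on CSDN; please credit the source when reposting.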
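As a supplement to the range partitioning in section 7.2: each partition needs an upper bound equal to the start of the following period, which is easy to get wrong by hand. The sketch below generates monthly PARTITION clauses under that assumption. The function names month_partitions and build_alter, and the choice of monthly granularity, are illustrative assumptions rather than anything Zabbix ships; review the generated statement before running it against a real database.

```python
from datetime import date


def month_partitions(start_year: int, start_month: int, count: int) -> list[str]:
    """Return one PARTITION clause per month, starting at the given month.

    Each partition is bounded (exclusively) by the first day of the next month.
    """
    clauses = []
    year, month = start_year, start_month
    for _ in range(count):
        # Work out the month that follows (year, month), rolling over in December.
        next_year, next_month = (year + 1, 1) if month == 12 else (year, month + 1)
        upper = date(next_year, next_month, 1)
        clauses.append(
            f"PARTITION p{year:04d}{month:02d} VALUES LESS THAN "
            f"(UNIX_TIMESTAMP('{upper.isoformat()} 00:00:00'))"
        )
        year, month = next_year, next_month
    return clauses


def build_alter(table: str, clauses: list[str]) -> str:
    """Wrap the partition clauses in an ALTER TABLE ... PARTITION BY RANGE statement."""
    body = ",\n    ".join(clauses)
    return f"ALTER TABLE {table} PARTITION BY RANGE (clock) (\n    {body}\n);"


if __name__ == "__main__":
    # Print a statement covering December 2025 through February 2026.
    print(build_alter("history", month_partitions(2025, 12, 3)))
```

Generating the clauses from a start month keeps partition names and boundaries in step, so a partition named for December always ends at 1 January of the following year.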