11 Sep 2009
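Nagios: batch checks for grouped hosts
I have never much liked Nagios, mainly because its configuration is so complicated. Things I could handle with a few lines of my own script end up requiring changes to a pile of config files, which is a hassle. There are plenty of plugins, but they often need further development of their own, and I really don't have the energy for that.
Here are a few tricks I found while editing the config files today. They save some effort when you monitor a large number of servers.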
1. Add the following to nagios.cfg:
#groupfile
cfg_file=/usr/local/nagios/etc/objects/hostsgroup.cfg
cfg_file=/usr/local/nagios/etc/objects/groupservers.cfg
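If the number of group files keeps growing, listing each one with cfg_file gets tedious. Nagios also has a cfg_dir directive that loads every file ending in .cfg from a directory. A minimal sketch, assuming the group files are moved into a separate directory; the path /usr/local/nagios/etc/groups is hypothetical and not part of the original setup:

```
# Alternative to one cfg_file line per file: load every *.cfg file
# in this directory. The directory name is an assumption made for
# this example only.
cfg_dir=/usr/local/nagios/etc/groups
```

A separate directory is probably safer than pointing cfg_dir at the existing objects directory, because any file that is already listed with cfg_file would then be read twice and its definitions would collide.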
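2. Configure the service definitions for the monitored group. Once this template is written, it can be reused whenever it is needed, which is very convenient.
/nagios/etc/objects/groupservers.cfg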
define service{
use local-service ; Name of service template to use
hostgroup_name Webserver
service_description PING
check_command check_ping!100.0,20%!500.0,60%
}
# Check SSH on every host in the Webserver group.
# Notifications are explicitly enabled for this service.
define service{
use local-service ; Name of service template to use
hostgroup_name Webserver
service_description SSH
check_command check_ssh
notifications_enabled 1
}
# Check HTTP on every host in the Webserver group.
# The notifications_enabled override below is commented out, so the value inherited from local-service applies.
define service{
use local-service ; Name of service template to use
hostgroup_name Webserver
service_description HTTP
check_command check_http
# notifications_enabled 0
}
# Check Squid on every host in the Webserver group
define service{
use local-service ; Name of service template to use
hostgroup_name Webserver
service_description SQUID
check_command check_squid
}
#CPU Load
define service{
use local-service
hostgroup_name Webserver
service_description CPU Load
check_command check_nrpe!check_load
}
#User Login
define service{
use local-service
hostgroup_name Webserver
service_description Current user
check_command check_nrpe!check_users
}
#Total Processes
define service{
use local-service
hostgroup_name Webserver
service_description Total Processes
check_command check_nrpe!check_total_procs
}
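The services above refer to check_nrpe and check_squid, which, as far as I know, are not among the commands in the stock sample commands.cfg, so they have to be defined somewhere before Nagios will accept the configuration. A sketch of what those definitions might look like: the check_nrpe line follows the form given in the NRPE documentation, while the check_squid line is purely an assumption (a TCP connect to Squid's default port 3128) and not necessarily what was used here.

```
# Pass the remote command name (e.g. check_load) to the NRPE daemon
# on the monitored host.
define command{
        command_name    check_nrpe
        command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
        }

# Hypothetical stand-in for check_squid: only verifies that the
# Squid port accepts TCP connections.
define command{
        command_name    check_squid
        command_line    $USER1$/check_tcp -H $HOSTADDRESS$ -p 3128
        }
```

The names after the exclamation mark (check_load, check_users, check_total_procs) must also exist as command[...] entries in nrpe.cfg on each monitored host; the sample nrpe.cfg shipped with NRPE normally includes entries with these names.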
3. Put it to use. Every host listed as a member here must already be defined in hosts.cfg; otherwise Nagios will fail to start.
define hostgroup{
hostgroup_name Webserver
alias Webserver ; name shown in the web interface
members web1
}
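To make the dependency on hosts.cfg concrete, here is a sketch of a matching host definition together with a host group that has more than one member. The address 192.0.2.10 and the host names web2 and web3 are made up for the example, and the linux-server template is assumed to come from the sample templates.cfg.

```
# hosts.cfg: the host must exist before a host group can list it.
define host{
        use             linux-server    ; assumed sample host template
        host_name       web1
        alias           web1
        address         192.0.2.10      ; hypothetical address
        }

# Several members are separated by commas; web2 and web3 would each
# need a define host block like the one above.
define hostgroup{
        hostgroup_name  Webserver
        alias           Webserver
        members         web1,web2,web3
        }
```

With this in place, adding a server to every check in the group comes down to defining the host and appending its name to members. Running nagios -v against nagios.cfg before restarting should report any member that is not yet defined.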