
Windows by Zabbix agent

Macros used

Name | Value
{$AGENT.TIMEOUT} | 3m
{$CPU.INTERRUPT.CRIT.MAX} | 50
{$CPU.PRIV.CRIT.MAX} | 30
{$CPU.QUEUE.CRIT.MAX} | 3
{$CPU.UTIL.CRIT} | 90
{$IF.ERRORS.WARN} | 2
{$IF.UTIL.MAX} | 90
{$IFCONTROL} | 1
{$MEM.PAGE_SEC.CRIT.MAX} | 1000
{$MEM.PAGE_TABLE_CRIT.MIN} | 5000
{$MEMORY.UTIL.MAX} | 90
{$NET.IF.IFALIAS.MATCHES} | .*
{$NET.IF.IFALIAS.NOT_MATCHES} | CHANGE_THIS
{$NET.IF.IFDESCR.MATCHES} | .*
{$NET.IF.IFDESCR.NOT_MATCHES} | CHANGE_THIS
{$NET.IF.IFNAME.MATCHES} | .*
{$NET.IF.IFNAME.NOT_MATCHES} | Miniport|Virtual|Teredo|Kernel|Loopback|Bluetooth|HTTPS|6to4|QoS|Layer
{$SERVICE.NAME.MATCHES} | ^.*$
{$SERVICE.NAME.NOT_MATCHES} | ^(?:RemoteRegistry|MMCSS|gupdate|SysmonLog|clr_optimization_v.+|sppsvc|gpsvc|Pml Driver HPZ12|Net Driver HPZ12|MapsBroker|IntelAudioService|Intel(R) TPM Provisioning Service|dbupdate|DoSvc|CDPUserSvc_.+|WpnUserService_.+|OneSyncSvc_.+|WbioSrvc|BITS|tiledatamodelsvc|GISvc|ShellHWDetection|TrustedInstaller|TabletInputService|CDPSvc|wuauserv)$
{$SERVICE.STARTUPNAME.MATCHES} | ^(?:automatic|automatic delayed)$
{$SERVICE.STARTUPNAME.NOT_MATCHES} | ^(?:manual|disabled)$
{$SWAP.PFREE.MIN.WARN} | 20
{$SYSTEM.FUZZYTIME.MAX} | 60
{$VFS.DEV.DEVNAME.MATCHES} | .*
{$VFS.DEV.DEVNAME.NOT_MATCHES} | _Total
{$VFS.DEV.READ.AWAIT.WARN} | 0.02
{$VFS.DEV.UTIL.MAX.WARN} | 95
{$VFS.DEV.WRITE.AWAIT.WARN} | 0.02
{$VFS.FS.FREE.MIN.CRIT} | 5G
{$VFS.FS.FREE.MIN.WARN} | 10G
{$VFS.FS.FSDRIVETYPE.MATCHES} | fixed
{$VFS.FS.FSDRIVETYPE.NOT_MATCHES} | ^\s$
{$VFS.FS.FSNAME.MATCHES} | .*
{$VFS.FS.FSNAME.NOT_MATCHES} | ^(?:/dev|/sys|/run|/proc|.+/shm$)
{$VFS.FS.FSTYPE.MATCHES} | .*
{$VFS.FS.FSTYPE.NOT_MATCHES} | ^\s$
{$VFS.FS.PUSED.MAX.CRIT} | 90
{$VFS.FS.PUSED.MAX.WARN} | 80
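
The MATCHES / NOT_MATCHES macro pairs in the table above are regular expressions used to filter discovered entities. The following is a minimal Python sketch of how such a pair might be applied, assuming the two conditions are combined with a logical AND and using Python's `re` module as a stand-in for the regex engine Zabbix uses. The function `passes_filter` and the sample interface names are hypothetical and only illustrate the idea.

```python
import re

# Values copied from the macro table above.
IFNAME_MATCHES = r".*"
IFNAME_NOT_MATCHES = r"Miniport|Virtual|Teredo|Kernel|Loopback|Bluetooth|HTTPS|6to4|QoS|Layer"


def passes_filter(value: str, matches: str, not_matches: str) -> bool:
    """Keep a value only if it matches `matches` and does not match `not_matches`.

    Assumption: both conditions must hold (logical AND).
    """
    return (
        re.search(matches, value) is not None
        and re.search(not_matches, value) is None
    )


# Hypothetical interface names, invented for illustration.
names = [
    "Intel(R) Ethernet Connection I219-LM",
    "WAN Miniport (IP)",
    "Hyper-V Virtual Ethernet Adapter",
]

kept = [n for n in names if passes_filter(n, IFNAME_MATCHES, IFNAME_NOT_MATCHES)]
print(kept)
```

With the default macro values, only the first name survives, because the other two contain "Miniport" and "Virtual".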

Items collected

Name | Description | Type | Interval | Key and additional info
Windows: Host name of Zabbix agent running | - | - | 1h | agent.hostname
Windows: Zabbix agent ping | The agent always returns 1 for this item. It can be combined with nodata() for an availability check. | - | - | agent.ping
Windows: Version of Zabbix agent running | - | - | 1h | agent.version
Windows: Cache bytes | Cache Bytes is the sum of the Memory\System Cache Resident Bytes, Memory\System Driver Resident Bytes, Memory\System Code Resident Bytes, and Memory\Pool Paged Resident Bytes counters. This counter displays the last observed value only; it is not an average. | - | - | perf_counter_en["\Memory\Cache Bytes"]
Windows: Free system page table entries | The number of page table entries not currently in use by the system. If the number is less than 5,000, there may be a memory leak or the system may be running out of memory. | - | - | perf_counter_en["\Memory\Free System Page Table Entries"]
Windows: Memory page faults per second | Page Faults/sec is the average number of pages faulted per second. It is measured in pages faulted per second; because only one page is faulted in each fault operation, this also equals the number of page fault operations. The counter includes both hard faults (those that require disk access) and soft faults (where the faulted page is found elsewhere in physical memory). Most processors can handle large numbers of soft faults without significant consequence. Hard faults, however, require disk access and can cause significant delays. | - | - | perf_counter_en["\Memory\Page Faults/sec"]
Windows: Memory pages per second | The rate at which pages are read from or written to disk to resolve hard page faults. A value greater than 1,000, caused by excessive paging, may indicate a memory leak. | - | - | perf_counter_en["\Memory\Pages/sec"]
Windows: Memory pool non-paged | The size, in bytes, of the non-paged pool. This is an area of system memory for objects that cannot be written to disk and must remain in physical memory as long as they are allocated. A value greater than 175MB (or 100MB with the /3GB switch) may indicate a memory leak. In that case, Event ID 2019 is typically recorded in the system event log. | - | - | perf_counter_en["\Memory\Pool Nonpaged Bytes"]
Windows: Used swap space in % | The used space of the swap volume/file in percent. | - | - | perf_counter_en["\Paging file(_Total)\% Usage"]
Windows: CPU DPC time | Processor DPC time is the time that a single processor spent receiving and servicing deferred procedure calls (DPCs). DPCs are interrupts that run at a lower priority than standard interrupts. % DPC Time is a component of % Privileged Time because DPCs are executed in privileged mode. If a high % DPC Time is sustained, there may be a processor bottleneck or an application- or hardware-related issue that can significantly diminish overall system performance. | - | - | perf_counter_en["\Processor Information(_total)\% DPC Time"]
Windows: CPU interrupt time | The Processor Information\% Interrupt Time counter is the time the processor spends receiving and servicing hardware interrupts during sample intervals. This value is an indirect indicator of the activity of devices that generate interrupts, such as the system clock, the mouse, disk drivers, data communication lines, network interface cards, and other peripheral devices. This is an easy way to identify a potential hardware failure. This should never be higher than 20%. | - | - | perf_counter_en["\Processor Information(_total)\% Interrupt Time"]
Windows: CPU privileged time | The Processor Information\% Privileged Time counter shows the percentage of time that the processor spends executing in Kernel (or Privileged) mode. Privileged mode includes servicing interrupts inside Interrupt Service Routines (ISRs), executing Deferred Procedure Calls (DPCs), device driver calls, and other kernel-mode functions of the Windows® operating system. | - | - | perf_counter_en["\Processor Information(_total)\% Privileged Time"]
Windows: CPU user time | The Processor Information\% User Time counter shows the percentage of time that the processor(s) spend executing in User mode. | - | - | perf_counter_en["\Processor Information(_total)\% User Time"]
Windows: Context switches per second | Context Switches/sec is the combined rate at which all processors on the computer are switched from one thread to another. Context switches occur when a running thread voluntarily relinquishes the processor, is preempted by a higher-priority ready thread, or switches between user mode and privileged (kernel) mode to use an Executive or subsystem service. It is the sum of Thread\Context Switches/sec for all threads running on all processors in the computer and is measured in numbers of switches. There are context switch counters on the System and Thread objects. This counter displays the difference between the values observed in the last two samples, divided by the duration of the sample interval. | - | - | perf_counter_en["\System\Context Switches/sec"]
Windows: CPU queue length | The Processor Queue Length shows the number of threads that are observed as delayed in the processor Ready Queue and are waiting to be executed. | - | - | perf_counter_en["\System\Processor Queue Length"]
Windows: Number of threads | The number of threads used by all running processes. | - | - | perf_counter_en["\System\Threads"]
Windows: Number of processes | The number of processes. | - | - | proc.num[]
Windows: CPU utilization | The CPU utilization expressed in %. | - | - | system.cpu.util
Windows: System name | The host name of the system. | - | 1h | system.hostname
Windows: System local time | The local system time of the host. | - | - | system.localtime
Windows: Operating system architecture | The architecture of the operating system. | - | 1h | system.sw.arch
Windows: Operating system | - | - | 1h | system.sw.os
Windows: Free swap space | The free space of the swap volume/file expressed in bytes. | CALCULATED | - | system.swap.free
Windows: Free swap space in % | The free space of the swap volume/file expressed in %. | DEPENDENT | - | system.swap.pfree
Windows: Total swap space | The total space of the swap volume/file expressed in bytes. | - | - | system.swap.size[,total]
Windows: System description | System description of the host. | - | 15m | system.uname
Windows: Uptime | The system uptime expressed in the following format: "N days, hh:mm:ss". | - | 30s | system.uptime
Windows: Get filesystems | The vfs.fs.get key acquires a raw set of information about the file systems. Individual values are later extracted by preprocessing in dependent items. | - | - | vfs.fs.get
Windows: Total memory | The total memory expressed in bytes. | - | - | vm.memory.size[total]
Windows: Used memory | Used memory in bytes. | - | - | vm.memory.size[used]
Windows: Memory utilization | Memory utilization in %. | CALCULATED | - | vm.memory.util
Windows: Network interfaces WMI get | Raw data of win32_networkadapter. | - | - | wmi.getall[root\cimv2,"select Name,Description,NetConnectionID,Speed,AdapterTypeId,NetConnectionStatus,GUID from win32_networkadapter where PhysicalAdapter=True and NetConnectionStatus>0"]
Windows: Number of cores | The number of logical processors available on the computer. | - | - | wmi.get[root/cimv2,"Select NumberOfLogicalProcessors from Win32_ComputerSystem"]
Windows: Zabbix agent availability | Monitoring the availability status of the agent. | INTERNAL | - | zabbix[host,agent,available]
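
The table lists vm.memory.util as a CALCULATED item next to vm.memory.size[used] and vm.memory.size[total], but it does not state the formula. The sketch below assumes the natural definition (used divided by total, times 100); this formula is an assumption for illustration, not something taken from the template itself, and `memory_utilization` is a hypothetical helper name.

```python
def memory_utilization(used_bytes: int, total_bytes: int) -> float:
    """Return memory utilization in percent.

    Assumption: utilization = used / total * 100, computed from the values of
    vm.memory.size[used] and vm.memory.size[total].
    """
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return used_bytes / total_bytes * 100


GIB = 1024 ** 3
# Hypothetical host with 16 GiB of RAM, 12 GiB of it in use.
print(memory_utilization(12 * GIB, 16 * GIB))
```

Under this assumption, the hypothetical host sits at 75%, below the default {$MEMORY.UTIL.MAX} of 90.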

Triggers

Name | Description | Expression | Priority | Dependencies
Windows: Number of free system page table entries is too low | The Memory\Free System Page Table Entries counter has been below {$MEM.PAGE_TABLE_CRIT.MIN} for 5 minutes. If the number is less than 5,000, there may be a memory leak. | max(/Windows by Zabbix agent/perf_counter_en["\Memory\Free System Page Table Entries"],5m)<{$MEM.PAGE_TABLE_CRIT.MIN} | WARNING 📢 | Windows: Free system page table entries
Windows: The Memory Pages/sec is too high | The Memory Pages/sec value has exceeded {$MEM.PAGE_SEC.CRIT.MAX} for the last 5 minutes. A value greater than 1,000, caused by excessive paging, may indicate a memory leak. | min(/Windows by Zabbix agent/perf_counter_en["\Memory\Pages/sec"],5m)>{$MEM.PAGE_SEC.CRIT.MAX} | WARNING 📢 | Windows: Memory pages per second
Windows: CPU interrupt time is too high | The CPU Interrupt Time in the last 5 minutes exceeds {$CPU.INTERRUPT.CRIT.MAX}%. The Processor Information\% Interrupt Time counter is the time the processor spends receiving and servicing hardware interrupts during sample intervals. This value is an indirect indicator of the activity of devices that generate interrupts, such as the system clock, the mouse, disk drivers, data communication lines, network interface cards, and other peripheral devices. This is an easy way to identify a potential hardware failure. This should never be higher than 20%. | min(/Windows by Zabbix agent/perf_counter_en["\Processor Information(_total)\% Interrupt Time"],5m)>{$CPU.INTERRUPT.CRIT.MAX} | WARNING 📢 | Windows: CPU interrupt time
Windows: CPU privileged time is too high | The CPU privileged time in the last 5 minutes exceeds {$CPU.PRIV.CRIT.MAX}%. | min(/Windows by Zabbix agent/perf_counter_en["\Processor Information(_total)\% Privileged Time"],5m)>{$CPU.PRIV.CRIT.MAX} | WARNING 📢 | Windows: CPU privileged time
Windows: High CPU utilization | The CPU utilization is too high. The system might be slow to respond. | min(/Windows by Zabbix agent/system.cpu.util,5m)>{$CPU.UTIL.CRIT} | WARNING 📢 | Windows: CPU utilization
Windows: System name has changed | The name of the system has changed. Acknowledge to close the problem manually. | change(/Windows by Zabbix agent/system.hostname) and length(last(/Windows by Zabbix agent/system.hostname))>0 | INFO 🔔 | Windows: System name
Windows: System time is out of sync | The host's system time is different from the Zabbix server time. | fuzzytime(/Windows by Zabbix agent/system.localtime,{$SYSTEM.FUZZYTIME.MAX})=0 | WARNING 📢 | Windows: System local time
Windows: Operating system description has changed | The description of the operating system has changed. Possible reasons are that the system has been updated or replaced. Acknowledge to close the problem manually. | change(/Windows by Zabbix agent/system.sw.os) and length(last(/Windows by Zabbix agent/system.sw.os))>0 | INFO 🔔 | Windows: Operating system
Windows: Host has been restarted | The device uptime is less than 10 minutes. | last(/Windows by Zabbix agent/system.uptime)<10m | WARNING 📢 | Windows: Uptime
Windows: High memory utilization | The system is running out of free memory. | min(/Windows by Zabbix agent/vm.memory.util,5m)>{$MEMORY.UTIL.MAX} | AVERAGE ⚠ | Windows: Memory utilization
Windows: Zabbix agent is not available | For passive-only agents, host availability is used with {$AGENT.TIMEOUT} as the time threshold. | max(/Windows by Zabbix agent/zabbix[host,agent,available],{$AGENT.TIMEOUT})=0 | AVERAGE ⚠ | Windows: Zabbix agent availability
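
Several triggers above wrap the item in min(...,5m) or max(...,5m) before comparing it with a macro. The effect is that a single spike does not fire the trigger: min(...)>X holds only when every sample in the window is above X, and max(...)<X only when every sample is below X. The following Python sketch illustrates that reasoning; the helper names and sample values are hypothetical, and treating an empty window as "no problem" is an assumption of the sketch.

```python
from typing import Sequence


def all_above(samples: Sequence[float], threshold: float) -> bool:
    """Equivalent of min(window) > threshold: every sample exceeds the threshold."""
    return len(samples) > 0 and min(samples) > threshold


def all_below(samples: Sequence[float], threshold: float) -> bool:
    """Equivalent of max(window) < threshold: every sample is under the threshold."""
    return len(samples) > 0 and max(samples) < threshold


# Hypothetical CPU utilization samples from a 5-minute window, {$CPU.UTIL.CRIT} = 90.
sustained = [92.0, 95.5, 91.2]
spike = [92.0, 88.0, 97.0]
print(all_above(sustained, 90), all_above(spike, 90))

# Hypothetical free page table entry counts, {$MEM.PAGE_TABLE_CRIT.MIN} = 5000.
print(all_below([4200, 4800, 4999], 5000))
```

The sustained load fires, while the window containing a dip to 88 does not.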

Discovery rule №1

Name | Description | Type | Interval | Key and additional info
Network interfaces discovery | Discovery of installed network interfaces. | DEPENDENT | 0 | net.if.discovery

Item prototypes

Name | Description | Type | Interval | Key and additional info
Interface {#IFNAME}({#IFALIAS}): Inbound packets discarded | The number of incoming packets dropped on the network interface. | - | 3m | net.if.in["{#IFGUID}",dropped]
Interface {#IFNAME}({#IFALIAS}): Inbound packets with errors | The number of incoming packets with errors on the network interface. | - | 3m | net.if.in["{#IFGUID}",errors]
Interface {#IFNAME}({#IFALIAS}): Bits received | Incoming traffic on the network interface. | - | 3m | net.if.in["{#IFGUID}"]
Interface {#IFNAME}({#IFALIAS}): Outbound packets discarded | The number of outgoing packets dropped on the network interface. | - | 3m | net.if.out["{#IFGUID}",dropped]
Interface {#IFNAME}({#IFALIAS}): Outbound packets with errors | The number of outgoing packets with errors on the network interface. | - | 3m | net.if.out["{#IFGUID}",errors]
Interface {#IFNAME}({#IFALIAS}): Bits sent | Outgoing traffic on the network interface. | - | 3m | net.if.out["{#IFGUID}"]
Interface {#IFNAME}({#IFALIAS}): Speed | Estimated bandwidth of the network interface, if any. | DEPENDENT | - | net.if.speed["{#IFGUID}"]
Interface {#IFNAME}({#IFALIAS}): Operational status | The operational status of the network interface. | DEPENDENT | - | net.if.status["{#IFGUID}"]
Interface {#IFNAME}({#IFALIAS}): Interface type | The type of the network interface. | DEPENDENT | - | net.if.type["{#IFGUID}"]

Trigger prototypes

Name | Description | Expression | Priority | Dependencies
Interface {#IFNAME}({#IFALIAS}): Link down | This trigger expression works as follows: 1. It can be triggered if the operational status is down. 2. {$IFCONTROL:"{#IFNAME}"}=1 - a user can redefine the context macro to 0, which marks this interface as not important; no new trigger will fire if this interface goes down. 3. {TEMPLATE_NAME:METRIC.diff()}=1 - the trigger fires only if the operational status was up (1) at some point before, so it does not fire for interfaces that have always been down. WARNING: if closed manually, it will not fire again on the next poll because of .diff. | {$IFCONTROL:"{#IFNAME}"}=1 and last(/Windows by Zabbix agent/net.if.status["{#IFGUID}"])<>2 and (last(/Windows by Zabbix agent/net.if.status["{#IFGUID}"],#1)<>last(/Windows by Zabbix agent/net.if.status["{#IFGUID}"],#2)) | AVERAGE ⚠ | Interface {#IFNAME}({#IFALIAS}): Operational status
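
The three conditions of the Link down expression can be read as a small boolean function. The Python sketch below is only an illustration: `link_down` is a hypothetical name, and it assumes the status item carries the NetConnectionStatus value from Win32_NetworkAdapter, where 2 means "Connected".

```python
CONNECTED = 2  # Assumption: NetConnectionStatus value for "Connected".


def link_down(ifcontrol: int, current: int, previous: int) -> bool:
    """Mirror the trigger expression for one interface.

    ifcontrol -- value of {$IFCONTROL:"{#IFNAME}"}; 0 silences the interface
    current   -- latest status sample (last(...,#1))
    previous  -- the sample before it (last(...,#2))
    """
    return (
        ifcontrol == 1              # interface is marked as important
        and current != CONNECTED    # it is not connected right now
        and current != previous     # and the status has just changed
    )


print(link_down(1, 7, 2))  # was connected, now status 7: fires
print(link_down(1, 7, 7))  # has been in status 7 already: does not fire
print(link_down(0, 7, 2))  # silenced through the context macro: does not fire
```

The third condition is what keeps interfaces that are permanently disconnected from raising a problem on every poll.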

Discovery rule №2

Name | Description | Type | Interval | Key and additional info
Physical disks discovery | Discovery of installed physical disks. | - | 1h | perf_instance_en.discovery[PhysicalDisk]

Item prototypes

Name | Description | Type | Interval | Key and additional info
{#DEVNAME}: Disk utilization by idle time | The percentage of elapsed time that the selected disk drive was busy servicing read or write requests, based on idle time. | - | - | perf_counter_en["\PhysicalDisk({#DEVNAME})\% Idle Time",60]
{#DEVNAME}: Average disk read queue length | Average disk read queue: the number of read requests outstanding on the disk at the time the performance data is collected. | - | - | perf_counter_en["\PhysicalDisk({#DEVNAME})\Avg. Disk Read Queue Length",60]
{#DEVNAME}: Disk read request avg waiting time | The average time for read requests issued to the device to be served. This includes the time spent by the requests in queue and the time spent servicing them. | - | - | perf_counter_en["\PhysicalDisk({#DEVNAME})\Avg. Disk sec/Read",60]
{#DEVNAME}: Disk write request avg waiting time | The average time for write requests issued to the device to be served. This includes the time spent by the requests in queue and the time spent servicing them. | - | - | perf_counter_en["\PhysicalDisk({#DEVNAME})\Avg. Disk sec/Write",60]
{#DEVNAME}: Average disk write queue length | Average disk write queue: the number of write requests outstanding on the disk at the time the performance data is collected. | - | - | perf_counter_en["\PhysicalDisk({#DEVNAME})\Avg. Disk Write Queue Length",60]
{#DEVNAME}: Disk average queue size (avgqu-sz) | The current average disk queue: the number of requests outstanding on the disk while the performance data is being collected. | - | - | perf_counter_en["\PhysicalDisk({#DEVNAME})\Current Disk Queue Length",60]
{#DEVNAME}: Disk read rate | Rate of read operations on the disk. | - | - | perf_counter_en["\PhysicalDisk({#DEVNAME})\Disk Reads/sec",60]
{#DEVNAME}: Disk write rate | Rate of write operations on the disk. | - | - | perf_counter_en["\PhysicalDisk({#DEVNAME})\Disk Writes/sec",60]

Trigger prototypes

Name | Description | Expression | Priority | Dependencies
{#DEVNAME}: Disk is overloaded | The disk appears to be under heavy load. | min(/Windows by Zabbix agent/perf_counter_en["\PhysicalDisk({#DEVNAME})\% Idle Time",60],15m)>{$VFS.DEV.UTIL.MAX.WARN} | WARNING 📢 | {#DEVNAME}: Disk utilization by idle time
{#DEVNAME}: Disk read request responses are too high | This trigger might indicate saturation of the disk {#DEVNAME}. | min(/Windows by Zabbix agent/perf_counter_en["\PhysicalDisk({#DEVNAME})\Avg. Disk sec/Read",60],15m) > {$VFS.DEV.READ.AWAIT.WARN:"{#DEVNAME}"} | WARNING 📢 | {#DEVNAME}: Disk read request avg waiting time
{#DEVNAME}: Disk write request responses are too high | This trigger might indicate saturation of the disk {#DEVNAME}. | min(/Windows by Zabbix agent/perf_counter_en["\PhysicalDisk({#DEVNAME})\Avg. Disk sec/Write",60],15m) > {$VFS.DEV.WRITE.AWAIT.WARN:"{#DEVNAME}"} | WARNING 📢 | {#DEVNAME}: Disk write request avg waiting time

Discovery rule №3

Name | Description | Type | Interval | Key and additional info
Windows services discovery | Discovery of Windows services of different types as defined in the template's macros. | - | 1h | service.discovery

Item prototypes

Name | Description | Type | Interval | Key and additional info
State of service "{#SERVICE.NAME}" ({#SERVICE.DISPLAYNAME}) | - | - | - | service.info["{#SERVICE.NAME}",state]
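
The state parameter of service.info returns a numeric code rather than a name. The sketch below lists the codes as documented for the Zabbix agent (0 means running) and shows one way a check over several consecutive samples might be expressed. The helper names are hypothetical, and requiring a full window of samples before reporting a problem is an assumption of the sketch.

```python
from typing import Sequence

# State codes returned by service.info[<service>,state].
SERVICE_STATES = {
    0: "running",
    1: "paused",
    2: "start pending",
    3: "pause pending",
    4: "continue pending",
    5: "stop pending",
    6: "stopped",
    7: "unknown",
    255: "no such service",
}


def describe_state(code: int) -> str:
    """Translate a numeric state code into a readable name."""
    return SERVICE_STATES.get(code, f"unrecognized ({code})")


def not_running_for_last(states: Sequence[int], n: int = 3) -> bool:
    """True when none of the last n samples is 0 (running).

    Because codes are non-negative, min(window) != 0 means no sample was 0.
    """
    recent = list(states)[-n:]
    return len(recent) == n and min(recent) != 0


print(describe_state(6))
print(not_running_for_last([0, 6, 6, 6]))
```

Checking several samples instead of one avoids raising a problem while a service is briefly restarting.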

Trigger prototypes

Name | Description | Expression | Priority | Dependencies
"{#SERVICE.NAME}" ({#SERVICE.DISPLAYNAME}) is not running | The service has been in a state other than "Running" for the last three checks. | min(/Windows by Zabbix agent/service.info["{#SERVICE.NAME}",state],#3)<>0 | AVERAGE ⚠ | State of service "{#SERVICE.NAME}" ({#SERVICE.DISPLAYNAME})

Discovery rule №4

Name | Description | Type | Interval | Key and additional info
Mounted filesystem discovery | Discovery of file systems of different types. | DEPENDENT | 0 | vfs.fs.dependent.discovery

Item prototypes

Name | Description | Type | Interval | Key and additional info
{#FSLABEL}({#FSNAME}): Space utilization | Space utilization expressed in % for {#FSNAME}. | DEPENDENT | - | vfs.fs.dependent.size[{#FSNAME},pused]
{#FSLABEL}({#FSNAME}): Total space | The total space expressed in bytes. | DEPENDENT | - | vfs.fs.dependent.size[{#FSNAME},total]
{#FSLABEL}({#FSNAME}): Used space | Used storage expressed in bytes. | DEPENDENT | - | vfs.fs.dependent.size[{#FSNAME},used]
{#FSLABEL}({#FSNAME}): Get filesystem data | - | DEPENDENT | - | vfs.fs.dependent[{#FSNAME},data]

Trigger prototypes

Name | Description | Expression | Priority | Dependencies
{#FSLABEL}({#FSNAME}): Disk space is critically low | Two conditions should match: 1. The first condition - utilization of the space should be above {$VFS.FS.PUSED.MAX.CRIT:"{#FSNAME}"}. 2. The second condition should be one of the following: - the disk free space is less than {$VFS.FS.FREE.MIN.CRIT:"{#FSNAME}"}; - the disk will be full in less than 24 hours. | last(/Windows by Zabbix agent/vfs.fs.dependent.size[{#FSNAME},pused])>{$VFS.FS.PUSED.MAX.CRIT:"{#FSNAME}"} and ((last(/Windows by Zabbix agent/vfs.fs.dependent.size[{#FSNAME},total])-last(/Windows by Zabbix agent/vfs.fs.dependent.size[{#FSNAME},used]))<{$VFS.FS.FREE.MIN.CRIT:"{#FSNAME}"} or timeleft(/Windows by Zabbix agent/vfs.fs.dependent.size[{#FSNAME},pused],1h,100)<1d) | AVERAGE ⚠ | -
{#FSLABEL}({#FSNAME}): Disk space is low | Two conditions should match: 1. The first condition - utilization of the space should be above {$VFS.FS.PUSED.MAX.WARN:"{#FSNAME}"}. 2. The second condition should be one of the following: - the disk free space is less than {$VFS.FS.FREE.MIN.WARN:"{#FSNAME}"}; - the disk will be full in less than 24 hours. | last(/Windows by Zabbix agent/vfs.fs.dependent.size[{#FSNAME},pused])>{$VFS.FS.PUSED.MAX.WARN:"{#FSNAME}"} and ((last(/Windows by Zabbix agent/vfs.fs.dependent.size[{#FSNAME},total])-last(/Windows by Zabbix agent/vfs.fs.dependent.size[{#FSNAME},used]))<{$VFS.FS.FREE.MIN.WARN:"{#FSNAME}"} or timeleft(/Windows by Zabbix agent/vfs.fs.dependent.size[{#FSNAME},pused],1h,100)<1d) | WARNING 📢 | -
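
The two disk space triggers share one shape: a percentage threshold that must be exceeded, combined with either an absolute free-space floor or a short predicted time until the disk is full. The Python sketch below mirrors that logic under stated assumptions: the G suffix is read as 1024^3 bytes, and the result of timeleft() is passed in as a ready-made number of seconds rather than computed. `parse_size` and `disk_space_problem` are hypothetical helper names.

```python
# Size suffixes as powers of 1024 (assumption for this sketch).
SUFFIXES = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3, "T": 1024 ** 4}
ONE_DAY_S = 24 * 60 * 60


def parse_size(value: str) -> float:
    """Convert a macro value such as '5G' into bytes."""
    value = value.strip()
    if value and value[-1] in SUFFIXES:
        return float(value[:-1]) * SUFFIXES[value[-1]]
    return float(value)


def disk_space_problem(pused: float, total: int, used: int,
                       pused_max: float, free_min: str,
                       timeleft_s: float) -> bool:
    """Mirror the trigger: pused > max AND (free < min OR full within a day)."""
    free = total - used
    return pused > pused_max and (
        free < parse_size(free_min) or timeleft_s < ONE_DAY_S
    )


GIB = 1024 ** 3
# Hypothetical 100 GiB volume with 92 GiB used and no imminent fill-up predicted.
args = dict(pused=92.0, total=100 * GIB, used=92 * GIB, timeleft_s=10 ** 9)
print(disk_space_problem(pused_max=90, free_min="5G", **args))   # critical check
print(disk_space_problem(pused_max=80, free_min="10G", **args))  # warning check
```

In this hypothetical case the warning check holds (8 GiB free is below 10G) while the critical one does not (8 GiB free is above 5G), which shows why a large volume at 92% does not necessarily raise the higher-severity problem.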