AD DS Health and Performance
Macros used
| Name | Value |
|---|---|
| {$ADDB_PATH} | c:\windows\ntds\ntds.dit |
| {$ADLOG_PATH} | c:\windows\ntds\edb.log |
| {$ADSYSVOL_PATH} | c:\windows\SYSVOL |
| {$DNS.TOTAL.RECEIVED.MAX.WARN} | 1000 |
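
The macros above appear inside item keys and trigger expressions further down (for example `vfs.file.size["{$ADDB_PATH}"]`). The following is a minimal sketch of how such a substitution resolves, using the default values from the table. The helper name `expand_macros` is hypothetical and this is not Zabbix's own implementation; it ignores features such as macro context and host-level overrides.

```python
import re

# Default macro values copied from the table above.
MACROS = {
    "{$ADDB_PATH}": r"c:\windows\ntds\ntds.dit",
    "{$ADLOG_PATH}": r"c:\windows\ntds\edb.log",
    "{$ADSYSVOL_PATH}": r"c:\windows\SYSVOL",
    "{$DNS.TOTAL.RECEIVED.MAX.WARN}": "1000",
}

# Matches user macros of the form {$NAME}, where NAME uses A-Z, 0-9, "_" and ".".
_MACRO_RE = re.compile(r"\{\$[A-Z0-9_.]+\}")


def expand_macros(text, macros=MACROS):
    """Replace each known {$MACRO} with its value; leave unknown macros untouched."""
    # A function is used as the replacement so that backslashes in Windows
    # paths are inserted literally instead of being read as regex escapes.
    return _MACRO_RE.sub(lambda m: macros.get(m.group(0), m.group(0)), text)


print(expand_macros('vfs.file.size["{$ADDB_PATH}"]'))
# prints: vfs.file.size["c:\windows\ntds\ntds.dit"]
```

Overriding a macro on a host (for instance when the NTDS database lives on another volume) changes every item and trigger that references it, which is why the paths are kept as macros rather than hard-coded in the keys.
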
Items collected
| Name | Description | Type | Interval | Key and additional info |
|---|---|---|---|---|
| Active Directory Web Services Events | - | ZABBIX_ACTIVE | 5m | eventlog[Active Directory Web Services,,"Warning|Error|Critical"] |
| DFS Replication Events | - | ZABBIX_ACTIVE | 5m | eventlog[DFS Replication,,"Warning|Error|Critical"] |
| Directory Service Events | - | ZABBIX_ACTIVE | 5m | eventlog[Directory Service,,"Warning|Error|Critical"] |
| DNS Server Events | - | ZABBIX_ACTIVE | 5m | eventlog[DNS Server,,"Warning|Error|Critical"] |
| Microsoft-Windows-Backup Events | - | ZABBIX_ACTIVE | 5m | eventlog[Microsoft-Windows-Backup,,"Information|Error"] |
| Windows Time Service Events | - | ZABBIX_ACTIVE | 5m | eventlog[System,"Time-Service","Warning|Error|Critical"] |
| NETLOGON Events | - | ZABBIX_ACTIVE | 5m | eventlog[System,,"Error","NETLOGON",^5723$] |
| LogFile (Netlogon) | - | ZABBIX_ACTIVE | 5m | log[c:\windows\debug\netlogon.log,"NO_CLIENT_SITE",,,skip,,,] |
| MaxConcurrentApi | https://learn.microsoft.com/en-us/troubleshoot/windows-server/windows-security/performance-tuning-ntlm-authentication-maxconcurrentapi | CALCULATED | 5m | MaxConcurrentApi |
| LDAP Port is running | - | SIMPLE | - | net.tcp.service[ldap] |
| I/O Database Reads/sec | perf_counter_en[\Database ==> Instances(lsass/NTDSA)\I/O Database Reads/sec] | - | - | perf_counter_en[\Database ==> Instances(lsass/NTDSA)\I/O Database Reads/sec] |
| I/O Database Reads Average Latency | perf_counter_en[\Database ==> Instances(lsass/NTDSA)\I/O Database Reads Average Latency] | - | - | perf_counter_en[\Database ==> Instances(lsass/NTDSA)\I/O Database Reads Average Latency] |
| I/O Log Writes/sec | perf_counter_en[\Database ==> Instances(lsass/NTDSA)\I/O Log Writes/sec] | - | - | perf_counter_en[\Database ==> Instances(lsass/NTDSA)\I/O Log Writes/sec] |
| I/O Log Writes Average Latency | perf_counter_en[\Database ==> Instances(lsass/NTDSA)\I/O Log Writes Average Latency] | - | - | perf_counter_en[\Database ==> Instances(lsass/NTDSA)\I/O Log Writes Average Latency] |
| DFS Replicated Folders (Conflict Space In Use) | - | - | 5m | perf_counter_en[\DFS Replicated Folders(*)\Conflict Space In Use] |
| DFS Replicated Folders (RDC Bytes Received) | - | - | 5m | perf_counter_en[\DFS Replicated Folders(*)\RDC Bytes Received] |
| DFS Replicated Folders (Size of Files Received) | - | - | 5m | perf_counter_en[\DFS Replicated Folders(*)\Size of Files Received] |
| DFS Replicated Folders (Total Files Received) | - | - | 5m | perf_counter_en[\DFS Replicated Folders(*)\Total Files Received] |
| DNS Total Query Received/sec | - | - | - | perf_counter_en[\DNS\Total Query Received/sec] |
| DNS Total Response Sent/sec | - | - | - | perf_counter_en[\DNS\Total Response Sent/sec] |
| DNS UDP Query Received/sec | - | - | - | perf_counter_en[\DNS\UDP Query Received/sec] |
| DNS UDP Response Sent/sec | - | - | - | perf_counter_en[\DNS\UDP Response Sent/sec] |
| Netlogon Average Semaphore Hold Time | perf_counter_en[\Netlogon(_Total)\Average Semaphore Hold Time] | - | - | perf_counter_en[\Netlogon(_Total)\Average Semaphore Hold Time] |
| Netlogon Semaphore Acquires | perf_counter_en[\Netlogon(_Total)\Semaphore Acquires] | - | - | perf_counter_en[\Netlogon(_Total)\Semaphore Acquires] |
| Netlogon Semaphore Timeouts | perf_counter_en[\Netlogon(_Total)\Semaphore Timeouts] | - | - | perf_counter_en[\Netlogon(_Total)\Semaphore Timeouts] |
| DRA Inbound Bytes Total/sec | perf_counter_en[\NTDS\DRA Inbound Bytes Total/sec] | - | - | perf_counter_en[\NTDS\DRA Inbound Bytes Total/sec] |
| DRA Inbound Object Updates Remaining in Packet | perf_counter_en[\NTDS\DRA Inbound Object Updates Remaining in Packet] | - | - | perf_counter_en[\NTDS\DRA Inbound Object Updates Remaining in Packet] |
| DRA Outbound Bytes Total/sec | perf_counter_en[\NTDS\DRA Outbound Bytes Total/sec] | - | - | perf_counter_en[\NTDS\DRA Outbound Bytes Total/sec] |
| DRA Pending Replication Synchronizations | perf_counter_en[\NTDS\DRA Pending Replication Synchronizations] | - | - | perf_counter_en[\NTDS\DRA Pending Replication Synchronizations] |
| LDAP Active Threads | perf_counter_en[\NTDS\LDAP Active Threads] | - | - | perf_counter_en[\NTDS\LDAP Active Threads] |
| LDAP Client Sessions | - | - | - | perf_counter_en[\NTDS\LDAP Client Sessions] |
| LDAP New Connections/sec | - | - | - | perf_counter_en[\NTDS\LDAP New Connections/sec] |
| LDAP New SSL Connections/sec | - | - | - | perf_counter_en[\NTDS\LDAP New SSL Connections/sec] |
| LDAP Searches/sec | perf_counter_en[\NTDS\LDAP Searches/sec] | - | - | perf_counter_en[\NTDS\LDAP Searches/sec] |
| LDAP Writes/sec | - | - | - | perf_counter_en[\NTDS\LDAP Writes/sec] |
| LSASS Processor Time | % Processor Time of the LSASS process: perf_counter_en[\Process(lsass)\% Processor Time] | - | - | perf_counter_en[\Process(lsass)\% Processor Time] |
| Kerberos Authentications | perf_counter_en[\Security system-wide statistics\Kerberos Authentications] | - | - | perf_counter_en[\Security system-wide statistics\Kerberos Authentications] |
| NTLM Authentications | perf_counter_en[\Security system-wide statistics\NTLM Authentications] | - | - | perf_counter_en[\Security system-wide statistics\NTLM Authentications] |
| State of service "ADWS" (Active Directory Web Services) | - | - | - | service.info[ADWS,state] |
| State of service "DFSR" (DFS Replication) | - | - | - | service.info[DFSR,state] |
| State of service "DNS" (DNS Server) | - | - | - | service.info[DNS,state] |
| State of service "Dnscache" (DNS Client) | - | - | - | service.info[Dnscache,state] |
| State of service "IsmServ" (Intersite Messaging) | - | - | - | service.info[IsmServ,state] |
| State of service "Kdc" (Kerberos Key Distribution Center) | - | - | - | service.info[Kdc,state] |
| State of service "LanmanServer" (Server) | - | - | - | service.info[LanmanServer,state] |
| State of service "LanmanWorkstation" (Workstation) | - | - | - | service.info[LanmanWorkstation,state] |
| State of service "Netlogon" (Netlogon) | - | - | - | service.info[Netlogon,state] |
| State of service "NTDS" (Active Directory Domain Services) | - | - | - | service.info[NTDS,state] |
| State of service "RpcSs" (Remote Procedure Call (RPC)) | - | - | - | service.info[RpcSs,state] |
| State of service "SamSs" (Security Accounts Manager) | - | - | - | service.info[SamSs,state] |
| State of service "W32Time" (Windows Time) | - | - | - | service.info[W32Time,state] |
| SYSVOL Size | - | - | 12h | vfs.dir.size["{$ADSYSVOL_PATH}"] |
| Database Size | - | - | 1h | vfs.file.size["{$ADDB_PATH}"] |
| Log File Size | - | - | 1h | vfs.file.size["{$ADLOG_PATH}"] |
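
The MaxConcurrentApi item is a calculated item, and its formula is not reproduced in the table above. As a rough sketch, the sizing formula described in the linked Microsoft article can be written as below; it combines the three Netlogon semaphore counters collected by this template. The function name, parameter names, and sample numbers are hypothetical, and the template's actual calculated-item expression may differ.

```python
def suggested_max_concurrent_api(acquires, timeouts, avg_hold_time_s, window_s):
    """Estimate the number of concurrent Netlogon API threads needed.

    Sketch of the sizing formula from the linked Microsoft article:
        (semaphore_acquires + semaphore_timeouts)
            * average_semaphore_hold_time / time_collection_length

    acquires        -- Semaphore Acquires observed during the window
    timeouts        -- Semaphore Timeouts observed during the window
    avg_hold_time_s -- Average Semaphore Hold Time, in seconds
    window_s        -- length of the collection window, in seconds
    """
    if window_s <= 0:
        raise ValueError("window_s must be positive")
    return (acquires + timeouts) * avg_hold_time_s / window_s


# Hypothetical sample: 900 acquires and 100 timeouts in 100 seconds,
# with each call holding the semaphore for 0.5 s on average.
print(suggested_max_concurrent_api(900, 100, 0.5, 100))  # prints 5.0
```

Because the acquire and timeout counters are cumulative, the inputs should be the differences between two readings taken `window_s` seconds apart, not the raw counter values.
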
Triggers
| Name | Description | Expression | Priority | Dependencies |
|---|---|---|---|---|
| Active Directory Web Services Error on {HOST.NAME} | - | logseverity(/AD DS Health and Performance/eventlog[Active Directory Web Services,,"Warning|Error|Critical"])>1 and nodata(/AD DS Health and Performance/eventlog[Active Directory Web Services,,"Warning|Error|Critical"],1800s)=0 | - | Active Directory Web Services Events |
| DFS Replication Events Error on {HOST.NAME} | - | logseverity(/AD DS Health and Performance/eventlog[DFS Replication,,"Warning|Error|Critical"])>1 and nodata(/AD DS Health and Performance/eventlog[DFS Replication,,"Warning|Error|Critical"],1800s)=0 | - | DFS Replication Events |
| AD database corrupted on {HOST.NAME} | - | logseverity(/AD DS Health and Performance/eventlog[Directory Service,,"Warning|Error|Critical"])>1 and nodata(/AD DS Health and Performance/eventlog[Directory Service,,"Warning|Error|Critical"],1800s)=0 and logeventid(/AD DS Health and Performance/eventlog[Directory Service,,"Warning|Error|Critical"],,"467")=1 | AVERAGE ⚠ | Directory Service Events |
| Directory Service Events Error on {HOST.NAME} | - | logseverity(/AD DS Health and Performance/eventlog[Directory Service,,"Warning|Error|Critical"])>1 and nodata(/AD DS Health and Performance/eventlog[Directory Service,,"Warning|Error|Critical"],1800s)=0 | - | Directory Service Events |
| It has been too long since {HOST.NAME} replicated | If a domain controller has not replicated with its partner for longer than a tombstone lifetime, it is possible that a lingering object problem exists on one or both domain controllers. The tombstone lifetime in an Active Directory forest determines how long a deleted object (called a "tombstone") is retained in Active Directory Domain Services (AD DS). The tombstone lifetime is determined by the value of the tombstoneLifetime attribute on the Directory Service object in the configuration directory partition. | logseverity(/AD DS Health and Performance/eventlog[Directory Service,,"Warning|Error|Critical"])>1 and nodata(/AD DS Health and Performance/eventlog[Directory Service,,"Warning|Error|Critical"],1800s)=0 and logeventid(/AD DS Health and Performance/eventlog[Directory Service,,"Warning|Error|Critical"],,"2042")=1 | HIGH ⛔ | Directory Service Events |
| DNS Server Events Error on {HOST.NAME} | - | logseverity(/AD DS Health and Performance/eventlog[DNS Server,,"Warning|Error|Critical"])>1 and nodata(/AD DS Health and Performance/eventlog[DNS Server,,"Warning|Error|Critical"],1800s)=0 | - | DNS Server Events |
| The backup operation has failed on {HOST.NAME} | - | logseverity(/AD DS Health and Performance/eventlog[Microsoft-Windows-Backup,,"Information|Error"])>1 and nodata(/AD DS Health and Performance/eventlog[Microsoft-Windows-Backup,,"Information|Error"],1800s)=0 and logeventid(/AD DS Health and Performance/eventlog[Microsoft-Windows-Backup,,"Information|Error"],,"5")=1 | HIGH ⛔ | Microsoft-Windows-Backup Events |
| The backup operation has finished successfully on {HOST.NAME} | - | logseverity(/AD DS Health and Performance/eventlog[Microsoft-Windows-Backup,,"Information|Error"])>1 and nodata(/AD DS Health and Performance/eventlog[Microsoft-Windows-Backup,,"Information|Error"],1800s)=0 and logeventid(/AD DS Health and Performance/eventlog[Microsoft-Windows-Backup,,"Information|Error"],,"4")=1 | INFO 🔔 | Microsoft-Windows-Backup Events |
| Netlogon Error on {HOST.NAME} | - | logseverity(/AD DS Health and Performance/eventlog[System,,"Error","NETLOGON",^5723$])>1 and nodata(/AD DS Health and Performance/eventlog[System,,"Error","NETLOGON",^5723$],1800s)=0 | - | NETLOGON Events |
| Active Directory Missing IP Range < - > Site Allocations | - | nodata(/AD DS Health and Performance/log[c:\windows\debug\netlogon.log,"NO_CLIENT_SITE",,,skip,,,],3600)=0 | WARNING 📢 | LogFile (Netlogon) |
| LDAP service is down on {HOST.NAME} | - | max(/AD DS Health and Performance/net.tcp.service[ldap],#3)=0 | AVERAGE ⚠ | LDAP Port is running |
| I/O Database Reads/sec > 10 on {HOST.NAME} | - | min(/AD DS Health and Performance/perf_counter_en[\Database ==> Instances(lsass/NTDSA)\I/O Database Reads/sec],5m)>10 | WARNING 📢 | I/O Database Reads/sec |
| I/O Database Reads Average Latency > 15ms on {HOST.NAME} | - | min(/AD DS Health and Performance/perf_counter_en[\Database ==> Instances(lsass/NTDSA)\I/O Database Reads Average Latency],5m)>15 | WARNING 📢 | I/O Database Reads Average Latency |
| I/O Log Writes Average Latency > 10ms on {HOST.NAME} | - | min(/AD DS Health and Performance/perf_counter_en[\Database ==> Instances(lsass/NTDSA)\I/O Log Writes Average Latency],5m)>10 | WARNING 📢 | I/O Log Writes Average Latency |
| DNS Total Query Received/sec >5000 | - | max(/AD DS Health and Performance/perf_counter_en[\DNS\Total Query Received/sec],#3)>5000 | WARNING 📢 | DNS Total Query Received/sec |
| DNS Total Query Received/sec is too high | - | max(/AD DS Health and Performance/perf_counter_en[\DNS\Total Query Received/sec],#3)>{$DNS.TOTAL.RECEIVED.MAX.WARN} | AVERAGE ⚠ | DNS Total Query Received/sec |
| Average Semaphore Hold Time > 1s on {HOST.NAME} | - | min(/AD DS Health and Performance/perf_counter_en[\Netlogon(_Total)\Average Semaphore Hold Time],15m)>1 | WARNING 📢 | Netlogon Average Semaphore Hold Time |
| "ADWS" (Active Directory Web Services) is not running | The service has been in a state other than "Running" for the last three checks. | min(/AD DS Health and Performance/service.info[ADWS,state],#3)<>0 | AVERAGE ⚠ | State of service "ADWS" (Active Directory Web Services) |
| "DFSR" (DFS Replication) is not running | The service has been in a state other than "Running" for the last three checks. | min(/AD DS Health and Performance/service.info[DFSR,state],#3)<>0 | AVERAGE ⚠ | State of service "DFSR" (DFS Replication) |
| "DNS" (DNS Server) is not running | The service has been in a state other than "Running" for the last three checks. | min(/AD DS Health and Performance/service.info[DNS,state],#3)<>0 | AVERAGE ⚠ | State of service "DNS" (DNS Server) |
| "Dnscache" (DNS Client) is not running | The service has been in a state other than "Running" for the last three checks. | min(/AD DS Health and Performance/service.info[Dnscache,state],#3)<>0 | AVERAGE ⚠ | State of service "Dnscache" (DNS Client) |
| "IsmServ" (Intersite Messaging) is not running | The service has been in a state other than "Running" for the last three checks. | min(/AD DS Health and Performance/service.info[IsmServ,state],#3)<>0 | AVERAGE ⚠ | State of service "IsmServ" (Intersite Messaging) |
| "Kdc" (Kerberos Key Distribution Center) is not running | The service has been in a state other than "Running" for the last three checks. | min(/AD DS Health and Performance/service.info[Kdc,state],#3)<>0 | AVERAGE ⚠ | State of service "Kdc" (Kerberos Key Distribution Center) |
| "LanmanServer" (Server) is not running | The service has been in a state other than "Running" for the last three checks. | min(/AD DS Health and Performance/service.info[LanmanServer,state],#3)<>0 | AVERAGE ⚠ | State of service "LanmanServer" (Server) |
| "LanmanWorkstation" (Workstation) is not running | The service has been in a state other than "Running" for the last three checks. | min(/AD DS Health and Performance/service.info[LanmanWorkstation,state],#3)<>0 | AVERAGE ⚠ | State of service "LanmanWorkstation" (Workstation) |
| "Netlogon" (Netlogon) is not running | The service has been in a state other than "Running" for the last three checks. | min(/AD DS Health and Performance/service.info[Netlogon,state],#3)<>0 | AVERAGE ⚠ | State of service "Netlogon" (Netlogon) |
| "NTDS" (Active Directory Domain Services) is not running | The service has been in a state other than "Running" for the last three checks. | min(/AD DS Health and Performance/service.info[NTDS,state],#3)<>0 | AVERAGE ⚠ | State of service "NTDS" (Active Directory Domain Services) |
| "RpcSs" (Remote Procedure Call (RPC)) is not running | The service has been in a state other than "Running" for the last three checks. | min(/AD DS Health and Performance/service.info[RpcSs,state],#3)<>0 | AVERAGE ⚠ | State of service "RpcSs" (Remote Procedure Call (RPC)) |
| "SamSs" (Security Accounts Manager) is not running | The service has been in a state other than "Running" for the last three checks. | min(/AD DS Health and Performance/service.info[SamSs,state],#3)<>0 | AVERAGE ⚠ | State of service "SamSs" (Security Accounts Manager) |
| "W32Time" (Windows Time) is not running | The service has been in a state other than "Running" for the last three checks. | min(/AD DS Health and Performance/service.info[W32Time,state],#3)<>0 | AVERAGE ⚠ | State of service "W32Time" (Windows Time) |
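
Most expressions in the table above follow one of three patterns. The sketch below restates each pattern as a plain Python predicate to make the intent easier to read. It is an illustration only, not how the Zabbix server evaluates triggers: the function names are hypothetical, and requiring exactly three samples before firing is a simplifying assumption of this sketch.

```python
# service.info[<name>,state] reports 0 when the service is running;
# any other value (for example 6 = stopped) means it is not.
SERVICE_RUNNING = 0
# logseverity() reports 1 for Information; Warning, Error and Critical are higher.
SEVERITY_INFORMATION = 1


def service_not_running(states):
    """Pattern: min(service.info[...,state],#3)<>0

    States are never negative, so a non-zero minimum means none of the
    last three checks saw the service running.
    """
    last3 = states[-3:]
    return len(last3) == 3 and min(last3) != SERVICE_RUNNING


def ldap_down(checks):
    """Pattern: max(net.tcp.service[ldap],#3)=0 (1 = up, 0 = down)

    A maximum of zero means all of the last three checks failed.
    """
    last3 = checks[-3:]
    return len(last3) == 3 and max(last3) == 0


def eventlog_problem(last_severity, seconds_since_last_value, window_s=1800):
    """Pattern: logseverity(...)>1 and nodata(...,1800s)=0

    Fires when the most recent event is more severe than Information and
    it arrived within the window, so the problem clears once no new
    matching event has been received for window_s seconds.
    """
    received_recently = seconds_since_last_value < window_s
    return last_severity > SEVERITY_INFORMATION and received_recently


print(service_not_running([0, 6, 6, 6]))  # prints True
print(service_not_running([6, 0, 6]))     # prints False
print(ldap_down([1, 0, 0, 0]))            # prints True
print(eventlog_problem(4, 120))           # prints True
print(eventlog_problem(4, 3600))          # prints False
```

Evaluating over the last three values rather than the latest one keeps a single missed check or a brief service restart from raising a problem.
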