mscluster collector

The MSCluster_Cluster class is a dynamic WMI class that represents a cluster.

Metric name prefix mscluster
Classes MSCluster_Cluster, MSCluster_Network, MSCluster_Node, MSCluster_Resource, MSCluster_ResourceGroup
Enabled by default? No

Flags

--collectors.mscluster.enabled

Comma-separated list of collectors to use, for example: --collectors.mscluster.enabled=cluster,network,node,resource,resourcegroup. Matching is case-sensitive.
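Because matching is case-sensitive, a misspelled or capitalized name is easy to miss. The following is a minimal sketch of a pre-flight check that builds the flag value; the helper name `build_enabled_flag` is hypothetical, and the list of valid names is assumed to be the five lower-case names from the example above (with `resourcegroup` spelled as in the metric prefix).

```python
# Assumption: these are the only valid sub-collector names, taken from the
# flag example in this document. They are not read from the exporter itself.
SUB_COLLECTORS = ("cluster", "network", "node", "resource", "resourcegroup")


def build_enabled_flag(names):
    """Return the --collectors.mscluster.enabled argument for the given names.

    Hypothetical helper: raises ValueError for any name that does not match
    exactly (the comparison is case-sensitive, like the flag itself).
    """
    unknown = [n for n in names if n not in SUB_COLLECTORS]
    if unknown:
        raise ValueError("unknown sub-collector(s): " + ", ".join(unknown))
    return "--collectors.mscluster.enabled=" + ",".join(names)


if __name__ == "__main__":
    print(build_enabled_flag(["cluster", "node"]))
```

Passing `["Cluster"]` raises `ValueError` in this sketch, which mirrors the case-sensitive matching described above.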

Metrics

Cluster

Name Description Type Labels
mscluster_cluster_AddEvictDelay Provides access to the cluster's AddEvictDelay property, which is the number of seconds that a new node is delayed after an eviction of another node. gauge name
mscluster_cluster_AdminAccessPoint The type of the cluster administrative access point. gauge name
mscluster_cluster_AutoAssignNodeSite Determines whether or not the cluster will attempt to automatically assign nodes to sites based on networks and Active Directory Site information. gauge name
mscluster_cluster_AutoBalancerLevel Determines the level of aggressiveness of AutoBalancer. gauge name
mscluster_cluster_AutoBalancerMode Determines whether or not the auto balancer is enabled. gauge name
mscluster_cluster_BackupInProgress Indicates whether a backup is in progress. gauge name
mscluster_cluster_BlockCacheSize CSV BlockCache Size in MB. gauge name
mscluster_cluster_ClusSvcHangTimeout Controls how long the cluster network driver waits between Failover Cluster Service heartbeats before it determines that the Failover Cluster Service has stopped responding. gauge name
mscluster_cluster_ClusSvcRegroupOpeningTimeout Controls how long a node will wait on other nodes in the opening stage before deciding that they failed. gauge name
mscluster_cluster_ClusSvcRegroupPruningTimeout Controls how long the membership leader will wait to reach full connectivity between cluster nodes. gauge name
mscluster_cluster_ClusSvcRegroupStageTimeout Controls how long a node will wait on other nodes in a membership stage before deciding that they failed. gauge name
mscluster_cluster_ClusSvcRegroupTickInMilliseconds Controls how frequently the membership algorithm is sending periodic membership messages. gauge name
mscluster_cluster_ClusterEnforcedAntiAffinity Enables or disables hard enforcement of group anti-affinity classes. gauge name
mscluster_cluster_ClusterFunctionalLevel The functional level the cluster is currently running in. gauge name
mscluster_cluster_ClusterGroupWaitDelay Maximum time in seconds that a group waits for its preferred node to come online during cluster startup before coming online on a different node. gauge name
mscluster_cluster_ClusterLogLevel Controls the level of cluster logging. gauge name
mscluster_cluster_ClusterLogSize Controls the maximum size of the cluster log files on each of the nodes. gauge name
mscluster_cluster_ClusterUpgradeVersion Specifies the upgrade version the cluster is currently running in. gauge name
mscluster_cluster_CrossSiteDelay Controls how long the cluster network driver waits in milliseconds between sending Cluster Service heartbeats across sites. gauge name
mscluster_cluster_CrossSiteThreshold Controls how many Cluster Service heartbeats can be missed across sites before it determines that Cluster Service has stopped responding. gauge name
mscluster_cluster_CrossSubnetDelay Controls how long the cluster network driver waits in milliseconds between sending Cluster Service heartbeats across subnets. gauge name
mscluster_cluster_CrossSubnetThreshold Controls how many Cluster Service heartbeats can be missed across subnets before it determines that Cluster Service has stopped responding. gauge name
mscluster_cluster_CsvBalancer Whether automatic balancing for CSV is enabled. gauge name
mscluster_cluster_DatabaseReadWriteMode Sets the database read and write mode. gauge name
mscluster_cluster_DefaultNetworkRole Provides access to the cluster's DefaultNetworkRole property. gauge name
mscluster_cluster_DetectedCloudPlatform gauge name
mscluster_cluster_DetectManagedEvents gauge name
mscluster_cluster_DetectManagedEventsThreshold gauge name
mscluster_cluster_DisableGroupPreferredOwnerRandomization gauge name
mscluster_cluster_DrainOnShutdown Whether to drain the node when cluster service is being stopped. gauge name
mscluster_cluster_DynamicQuorumEnabled Allows cluster service to adjust node weights as needed to increase availability. gauge name
mscluster_cluster_EnableSharedVolumes Enables or disables cluster shared volumes on this cluster. gauge name
mscluster_cluster_FixQuorum Provides access to the cluster's FixQuorum property, which specifies if the cluster is in a fix quorum state. gauge name
mscluster_cluster_GracePeriodEnabled Whether the node grace period feature of this cluster is enabled. gauge name
mscluster_cluster_GracePeriodTimeout The grace period timeout in milliseconds. gauge name
mscluster_cluster_GroupDependencyTimeout The timeout after which a group will be brought online despite unsatisfied dependencies. gauge name
mscluster_cluster_HangRecoveryAction Controls the action to take if the user-mode processes have stopped responding. gauge name
mscluster_cluster_IgnorePersistentStateOnStartup Provides access to the cluster's IgnorePersistentStateOnStartup property, which specifies whether the cluster will bring online groups that were online when the cluster was shut down. gauge name
mscluster_cluster_LogResourceControls Controls the logging of resource controls. gauge name
mscluster_cluster_LowerQuorumPriorityNodeId Specifies the Node ID that has a lower priority when voting for quorum is performed. If the quorum vote is split 50/50%, the specified node's vote would be ignored to break the tie. If this is not set then the cluster will pick a node at random to break the tie. gauge name
mscluster_cluster_MaxNumberOfNodes Indicates the maximum number of nodes that may participate in the Cluster. gauge name
mscluster_cluster_MessageBufferLength The maximum unacknowledged message count for GEM. gauge name
mscluster_cluster_MinimumNeverPreemptPriority Groups with this priority or higher cannot be preempted. gauge name
mscluster_cluster_MinimumPreemptorPriority Minimum priority a cluster group must have to be able to preempt another group. gauge name
mscluster_cluster_NetftIPSecEnabled Whether IPSec is enabled for cluster internal traffic. gauge name
mscluster_cluster_PlacementOptions Various option flags to modify default placement behavior. gauge name
mscluster_cluster_PlumbAllCrossSubnetRoutes Plumbs all possible cross subnet routes to all nodes. gauge name
mscluster_cluster_PreventQuorum Whether the cluster will ignore group persistent state on startup. gauge name
mscluster_cluster_QuarantineDuration The quarantine period timeout in milliseconds. gauge name
mscluster_cluster_QuarantineThreshold Number of node failures before it will be quarantined. gauge name
mscluster_cluster_QuorumArbitrationTimeMax Controls the maximum time necessary to decide the Quorum owner node. gauge name
mscluster_cluster_QuorumArbitrationTimeMin Controls the minimum time necessary to decide the Quorum owner node. gauge name
mscluster_cluster_QuorumLogFileSize This property is obsolete. gauge name
mscluster_cluster_QuorumTypeValue Get the current quorum type value. -1: Unknown; 1: Node; 2: FileShareWitness; 3: Storage; 4: None gauge name
mscluster_cluster_RequestReplyTimeout Controls the request reply time-out period. gauge name
mscluster_cluster_ResiliencyDefaultPeriod The default resiliency period, in seconds, for the cluster. gauge name
mscluster_cluster_ResiliencyLevel The resiliency level for the cluster. gauge name
mscluster_cluster_ResourceDllDeadlockPeriod This property is obsolete. gauge name
mscluster_cluster_RootMemoryReserved Controls the amount of memory reserved for the parent partition on all cluster nodes. gauge name
mscluster_cluster_RouteHistoryLength The history length for routes to help finding network issues. gauge name
mscluster_cluster_S2DBusTypes Bus types for storage spaces direct. gauge name
mscluster_cluster_S2DCacheDesiredState Desired state of the storage spaces direct cache. gauge name
mscluster_cluster_S2DCacheFlashReservePercent Percentage of allocated flash space to utilize when caching. gauge name
mscluster_cluster_S2DCachePageSizeKBytes Page size in KB used by S2D cache. gauge name
mscluster_cluster_S2DEnabled Whether direct attached storage (DAS) is enabled. gauge name
mscluster_cluster_S2DIOLatencyThreshold The I/O latency threshold for storage spaces direct. gauge name
mscluster_cluster_S2DOptimizations Optimization flags for storage spaces direct. gauge name
mscluster_cluster_SameSubnetDelay Controls how long the cluster network driver waits in milliseconds between sending Cluster Service heartbeats on the same subnet. gauge name
mscluster_cluster_SameSubnetThreshold Controls how many Cluster Service heartbeats can be missed on the same subnet before it determines that Cluster Service has stopped responding. gauge name
mscluster_cluster_SecurityLevel Controls the level of security that should apply to intracluster messages. 0: Clear Text; 1: Sign; 2: Encrypt gauge name
mscluster_cluster_SecurityLevelForStorage gauge name
mscluster_cluster_SharedVolumeVssWriterOperationTimeout CSV VSS Writer operation timeout in seconds. gauge name
mscluster_cluster_ShutdownTimeoutInMinutes The maximum time in minutes allowed for cluster resources to come offline during cluster service shutdown. gauge name
mscluster_cluster_UseClientAccessNetworksForSharedVolumes Whether the use of client access networks for cluster shared volumes feature of this cluster is enabled. 0: Disabled; 1: Enabled; 2: Auto gauge name
mscluster_cluster_WitnessDatabaseWriteTimeout Controls the maximum time in seconds that a cluster database write to a witness can take before the write is abandoned. gauge name
mscluster_cluster_WitnessDynamicWeight The weight of the configured witness. gauge name
mscluster_cluster_WitnessRestartInterval Controls the witness restart interval. gauge name

Network

Name Description Type Labels
mscluster_network_Characteristics Provides the characteristics of the network. The cluster defines characteristics only for resources. For a description of these characteristics, see CLUSCTL_RESOURCE_GET_CHARACTERISTICS. gauge name
mscluster_network_Flags Provides access to the flags set for the network. The cluster defines flags only for resources. For a description of these flags, see CLUSCTL_RESOURCE_GET_FLAGS. gauge name
mscluster_network_Metric The metric of a cluster network (networks with lower values are used first). If this value is set, then the AutoMetric property is set to false. gauge name
mscluster_network_Role Provides access to the network's Role property. The Role property describes the role of the network in the cluster. 0: None; 1: Cluster; 2: Client; 3: Both gauge name
mscluster_network_State Provides the current state of the network. -1: Unknown; 0: Unavailable; 1: Down; 2: Partitioned; 3: Up gauge name
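The Role and State gauges carry numeric codes, so a consumer has to map them back to names. A minimal sketch of that mapping, using the values listed in the table above and assuming that Unknown is -1, as for the other state metrics. The class and function names are hypothetical and not part of the exporter.

```python
from enum import IntEnum


class NetworkRole(IntEnum):
    """Codes listed for mscluster_network_Role."""
    NONE = 0
    CLUSTER = 1
    CLIENT = 2
    BOTH = 3


class NetworkState(IntEnum):
    """Codes listed for mscluster_network_State (Unknown assumed to be -1)."""
    UNKNOWN = -1
    UNAVAILABLE = 0
    DOWN = 1
    PARTITIONED = 2
    UP = 3


def describe_network(role_value, state_value):
    """Turn the two gauge values (floats in Prometheus) into readable text.

    Raises ValueError if a value is not one of the documented codes.
    """
    role = NetworkRole(int(role_value))
    state = NetworkState(int(state_value))
    return f"role={role.name.lower()} state={state.name.lower()}"


if __name__ == "__main__":
    print(describe_network(3.0, 3.0))
```

The same pattern applies to the other enumerated gauges in this collector, such as the node, resource, and resource group states.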

Node

Name Description Type Labels
mscluster_node_BuildNumber Provides access to the node's BuildNumber property. gauge name
mscluster_node_Characteristics Provides access to the characteristics set for the node. For a list of possible characteristics, see CLUSCTL_NODE_GET_CHARACTERISTICS. gauge name
mscluster_node_DetectedCloudPlatform The cloud platform detected for the node. gauge name
mscluster_node_DynamicWeight The dynamic vote weight of the node adjusted by dynamic quorum feature. gauge name
mscluster_node_Flags Provides access to the flags set for the node. For a list of possible flags, see CLUSCTL_NODE_GET_FLAGS. gauge name
mscluster_node_MajorVersion Provides access to the node's MajorVersion property, which specifies the major portion of the Windows version installed. gauge name
mscluster_node_MinorVersion Provides access to the node's MinorVersion property, which specifies the minor portion of the Windows version installed. gauge name
mscluster_node_NeedsPreventQuorum Whether the cluster service on that node should be started with prevent quorum flag. gauge name
mscluster_node_NodeDrainStatus The current node drain status of a node. 0: Not Initiated; 1: In Progress; 2: Completed; 3: Failed gauge name
mscluster_node_NodeHighestVersion Provides access to the node's NodeHighestVersion property, which specifies the highest possible version of the cluster service with which the node can join or communicate. gauge name
mscluster_node_NodeLowestVersion Provides access to the node's NodeLowestVersion property, which specifies the lowest possible version of the cluster service with which the node can join or communicate. gauge name
mscluster_node_NodeWeight The vote weight of the node. gauge name
mscluster_node_State Returns the current state of a node. -1: Unknown; 0: Up; 1: Down; 2: Paused; 3: Joining gauge name
mscluster_node_StatusInformation The isolation or quarantine status of the node. gauge name

Resource

Name Description Type Labels
mscluster_resource_Characteristics Provides the characteristics of the object. The cluster defines characteristics only for resources. For a description of these characteristics, see CLUSCTL_RESOURCE_GET_CHARACTERISTICS. gauge type, owner_group, name
mscluster_resource_DeadlockTimeout Indicates the length of time to wait, in milliseconds, before declaring a deadlock in any call into a resource. gauge type, owner_group, name
mscluster_resource_EmbeddedFailureAction The time, in milliseconds, that a resource should remain in a failed state before the Cluster service attempts to restart it. gauge type, owner_group, name
mscluster_resource_Flags Provides access to the flags set for the object. The cluster defines flags only for resources. For a description of these flags, see CLUSCTL_RESOURCE_GET_FLAGS. gauge type, owner_group, name
mscluster_resource_IsAlivePollInterval Provides access to the resource's IsAlivePollInterval property, which is the recommended interval in milliseconds at which the Cluster Service should poll the resource to determine whether it is operational. If the property is set to 0xFFFFFFFF, the Cluster Service uses the IsAlivePollInterval property for the resource type associated with the resource. gauge type, owner_group, name
mscluster_resource_LooksAlivePollInterval Provides access to the resource's LooksAlivePollInterval property, which is the recommended interval in milliseconds at which the Cluster Service should poll the resource to determine whether it appears operational. If the property is set to 0xFFFFFFFF, the Cluster Service uses the LooksAlivePollInterval property for the resource type associated with the resource. gauge type, owner_group, name
mscluster_resource_MonitorProcessId Provides the process ID of the resource host service that is currently hosting the resource. gauge type, owner_group, name
mscluster_resource_OwnerNode The node hosting the resource. gauge type, owner_group, node_name, name
mscluster_resource_PendingTimeout Provides access to the resource's PendingTimeout property. If a resource cannot be brought online or taken offline in the number of milliseconds specified by the PendingTimeout property, the resource is forcibly terminated. gauge type, owner_group, name
mscluster_resource_ResourceClass Gets or sets the resource class of a resource. 0: Unknown; 1: Storage; 2: Network; 32768: User gauge type, owner_group, name
mscluster_resource_RestartAction Provides access to the resource's RestartAction property, which is the action to be taken by the Cluster Service if the resource fails. gauge type, owner_group, name
mscluster_resource_RestartDelay Indicates the time delay before a failed resource is restarted. gauge type, owner_group, name
mscluster_resource_RestartPeriod Provides access to the resource's RestartPeriod property, which is the interval of time, in milliseconds, during which a specified number of restart attempts can be made on a nonresponsive resource. gauge type, owner_group, name
mscluster_resource_RestartThreshold Provides access to the resource's RestartThreshold property, which is the maximum number of restart attempts that can be made on a resource within an interval defined by the RestartPeriod property before the Cluster Service initiates the action specified by the RestartAction property. gauge type, owner_group, name
mscluster_resource_RetryPeriodOnFailure Provides access to the resource's RetryPeriodOnFailure property, which is the interval of time (in milliseconds) that a resource should remain in a failed state before the Cluster service attempts to restart it. gauge type, owner_group, name
mscluster_resource_State The current state of the resource. -1: Unknown; 0: Inherited; 1: Initializing; 2: Online; 3: Offline; 4: Failed; 128: Pending; 129: Online Pending; 130: Offline Pending gauge type, owner_group, name
mscluster_resource_Subclass Provides the list of references to nodes that can be the owner of this resource. gauge type, owner_group, name
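Two of the resource metrics need interpretation rather than a plain lookup: the poll intervals use 0xFFFFFFFF as a sentinel meaning "use the resource type's value", and the State gauge has three pending codes (128, 129, 130). A minimal sketch of both checks follows. The function names are hypothetical, and the resource type's interval is an assumed input that the caller must obtain elsewhere, since it is not among the metrics listed here.

```python
# Sentinel from the IsAlivePollInterval / LooksAlivePollInterval descriptions.
USE_TYPE_DEFAULT = 0xFFFFFFFF

# Pending codes from the mscluster_resource_State description:
# 128 Pending, 129 Online Pending, 130 Offline Pending.
PENDING_STATES = {128, 129, 130}


def effective_poll_interval(resource_value, resource_type_value):
    """Return the interval in milliseconds that actually applies.

    resource_type_value is a hypothetical input: the interval configured on
    the resource type, which this collector does not export.
    """
    if int(resource_value) == USE_TYPE_DEFAULT:
        return int(resource_type_value)
    return int(resource_value)


def is_pending(state_value):
    """True if the resource state gauge holds one of the pending codes."""
    return int(state_value) in PENDING_STATES


if __name__ == "__main__":
    print(effective_poll_interval(4294967295.0, 60000))
    print(is_pending(129))
```

Gauge values arrive as floats, so the sketch converts them to integers before comparing.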

ResourceGroup

Name Description Type Labels
mscluster_resourcegroup_AutoFailbackType Provides access to the group's AutoFailbackType property. gauge name
mscluster_resourcegroup_Characteristics Provides the characteristics of the group. The cluster defines characteristics only for resources. For a description of these characteristics, see CLUSCTL_RESOURCE_GET_CHARACTERISTICS. gauge name
mscluster_resourcegroup_ColdStartSetting Indicates whether a group can start after a cluster cold start. gauge name
mscluster_resourcegroup_DefaultOwner Number of the last node the resource group was activated on or explicitly moved to. gauge name
mscluster_resourcegroup_FailbackWindowEnd The FailbackWindowEnd property provides the latest time that the group can be moved back to the node identified as its preferred node. gauge name
mscluster_resourcegroup_FailbackWindowStart The FailbackWindowStart property provides the earliest time (that is, local time as kept by the cluster) that the group can be moved back to the node identified as its preferred node. gauge name
mscluster_resourcegroup_FailoverPeriod The FailoverPeriod property specifies a number of hours during which a maximum number of failover attempts, specified by the FailoverThreshold property, can occur. gauge name
mscluster_resourcegroup_FailoverThreshold The FailoverThreshold property specifies the maximum number of failover attempts. gauge name
mscluster_resourcegroup_Flags Provides access to the flags set for the group. The cluster defines flags only for resources. For a description of these flags, see CLUSCTL_RESOURCE_GET_FLAGS. gauge name
mscluster_resourcegroup_GroupType The Type of the resource group. gauge name
mscluster_resourcegroup_OwnerNode The node hosting the resource group. gauge node_name, name
mscluster_resourcegroup_Priority Priority value of the resource group. gauge name
mscluster_resourcegroup_ResiliencyPeriod The resiliency period for this group, in seconds. gauge name
mscluster_resourcegroup_State The current state of the resource group. -1: Unknown; 0: Online; 1: Offline; 2: Failed; 3: Partial Online; 4: Pending gauge name
mscluster_resourcegroup_UpdateDomain gauge name

Example metric

Query all cluster resources owned by node1

windows_mscluster_resource_owner_node{node_name="node1"}

Useful queries

Count the number of Network Name cluster resources

count(windows_mscluster_resource_state{type="Network Name"})
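For a quick check without a Prometheus server, the same count can be approximated directly from the exporter's text output. A minimal sketch, assuming the output is in the Prometheus text exposition format. The sample lines, resource names, and the helper `count_series` are made up for illustration.

```python
import re

# Hypothetical sample of exporter output; names and values are invented.
SAMPLE = """\
windows_mscluster_resource_state{name="Cluster Name",owner_group="Cluster Group",type="Network Name"} 2
windows_mscluster_resource_state{name="Cluster IP Address",owner_group="Cluster Group",type="IP Address"} 2
windows_mscluster_resource_state{name="SQL Network Name",owner_group="SQL",type="Network Name"} 3
"""

# One sample line: metric name, label set in braces, then the value.
LINE = re.compile(r'^(?P<metric>[a-zA-Z_:][a-zA-Z0-9_:]*)\{(?P<labels>.*)\}\s+(?P<value>\S+)$')
# One label pair inside the braces, allowing escaped characters in the value.
LABEL = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def count_series(text, metric, **wanted):
    """Count series of `metric` whose labels equal all of `wanted`."""
    total = 0
    for line in text.splitlines():
        m = LINE.match(line)
        if not m or m.group("metric") != metric:
            continue
        labels = dict(LABEL.findall(m.group("labels")))
        if all(labels.get(key) == value for key, value in wanted.items()):
            total += 1
    return total


if __name__ == "__main__":
    print(count_series(SAMPLE, "windows_mscluster_resource_state", type="Network Name"))
```

Like `count()` in the query above, this counts matching series regardless of their value.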

Alerting examples

This collector does not yet have alerting examples. We would appreciate your help adding them!