draft-kaestle-monitoring-plugins-interface-03.txt | draft-kaestle-monitoring-plugins-interface-04.txt | |||
---|---|---|---|---|
Network Working Group L. Kästle | Network Working Group L. Kästle | |||
Internet-Draft The Monitoring Plugins Project | Internet-Draft The Monitoring Plugins Project | |||
Intended status: Informational 24 September 2023 | Intended status: Informational 22 March 2024 | |||
Expires: 27 March 2024 | Expires: 23 September 2024 | |||
The Monitoring Plugins Interface | The Monitoring Plugins Interface | |||
draft-kaestle-monitoring-plugins-interface-03 | draft-kaestle-monitoring-plugins-interface-04 | |||
Abstract | Abstract | |||
This document aims to document the Monitoring Plugin Interface, a | This document aims to document the Monitoring Plugin Interface, a | |||
standard more or less strictly implemented by different network | standard more or less strictly implemented by different network | |||
monitoring solutions. Implementers and Users of network monitoring | monitoring solutions. Implementers and Users of network monitoring | |||
solutions, monitoring plugins and libraries can use this as a | solutions, monitoring plugins and libraries can use this as a | |||
reference point as to how these programs interface with each other. | reference point as to how these programs interface with each other. | |||
About This Document | About This Document | |||
skipping to change at page 1, line 45 ¶ | skipping to change at page 1, line 45 ¶ | |||
Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
Task Force (IETF). Note that other groups may also distribute | Task Force (IETF). Note that other groups may also distribute | |||
working documents as Internet-Drafts. The list of current Internet- | working documents as Internet-Drafts. The list of current Internet- | |||
Drafts is at https://datatracker.ietf.org/drafts/current/. | Drafts is at https://datatracker.ietf.org/drafts/current/. | |||
Internet-Drafts are draft documents valid for a maximum of six months | Internet-Drafts are draft documents valid for a maximum of six months | |||
and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
This Internet-Draft will expire on 27 March 2024. | This Internet-Draft will expire on 23 September 2024. | |||
Copyright Notice | Copyright Notice | |||
Copyright (c) 2023 IETF Trust and the persons identified as the | Copyright (c) 2024 IETF Trust and the persons identified as the | |||
document authors. All rights reserved. | document authors. All rights reserved. | |||
This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
Provisions Relating to IETF Documents (https://trustee.ietf.org/ | Provisions Relating to IETF Documents (https://trustee.ietf.org/ | |||
license-info) in effect on the date of publication of this document. | license-info) in effect on the date of publication of this document. | |||
Please review these documents carefully, as they describe your rights | Please review these documents carefully, as they describe your rights | |||
and restrictions with respect to this document. | and restrictions with respect to this document. | |||
Table of Contents | Table of Contents | |||
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 | 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 | |||
2. Conventions and Definitions . . . . . . . . . . . . . . . . . 3 | 1.1. Wording, Context and Scope . . . . . . . . . . . . . . . 4 | |||
2.1. Range expressions . . . . . . . . . . . . . . . . . . . . 3 | 1.1.1. Wording . . . . . . . . . . . . . . . . . . . . . . . 4 | |||
2.1.1. Examples . . . . . . . . . . . . . . . . . . . . . . 4 | 1.1.2. Scope . . . . . . . . . . . . . . . . . . . . . . . . 6 | |||
3. The basic Monitoring Plugin usage . . . . . . . . . . . . . . 4 | 2. Conventions and Definitions . . . . . . . . . . . . . . . . . 6 | |||
4. Input Parameters for a Monitoring Plugin . . . . . . . . . . 4 | 2.1. Range expressions . . . . . . . . . . . . . . . . . . . . 6 | |||
4.1. Examples . . . . . . . . . . . . . . . . . . . . . . . . 9 | 2.1.1. Examples . . . . . . . . . . . . . . . . . . . . . . 8 | |||
5. Output of a Monitoring Plugin . . . . . . . . . . . . . . . . 10 | 3. Input Parameters for a Monitoring Plugin . . . . . . . . . . 8 | |||
5.1. Exit Code . . . . . . . . . . . . . . . . . . . . . . . . 10 | 3.1. Examples . . . . . . . . . . . . . . . . . . . . . . . . 12 | |||
5.2. Textual Output . . . . . . . . . . . . . . . . . . . . . 12 | 4. Output of a Monitoring Plugin . . . . . . . . . . . . . . . . 14 | |||
5.2.1. Human readable output . . . . . . . . . . . . . . . . 12 | 4.1. Exit Code . . . . . . . . . . . . . . . . . . . . . . . . 14 | |||
5.2.2. Performance data . . . . . . . . . . . . . . . . . . 13 | 4.2. Textual Output . . . . . . . . . . . . . . . . . . . . . 16 | |||
6. Implementation Status . . . . . . . . . . . . . . . . . . . . 14 | 4.2.1. Free form output . . . . . . . . . . . . . . . . . . 17 | |||
7. Security Considerations . . . . . . . . . . . . . . . . . . . 15 | 4.2.2. Performance data . . . . . . . . . . . . . . . . . . 18 | |||
8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 15 | 5. Implementation Status . . . . . . . . . . . . . . . . . . . . 19 | |||
9. Normative References . . . . . . . . . . . . . . . . . . . . 15 | 6. Security Considerations . . . . . . . . . . . . . . . . . . . 19 | |||
Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . 15 | 7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 20 | |||
Author's Address . . . . . . . . . . . . . . . . . . . . . . . . 15 | 8. Normative References . . . . . . . . . . . . . . . . . . . . 20 | |||
Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . 20 | ||||
Author's Address . . . . . . . . . . . . . . . . . . . . . . . . 20 | ||||
1. Introduction | 1. Introduction | |||
With the emergence of NetSaint/Nagios at the latest, these system and | Maintaining computer networks and providing services to machines, | |||
their successors/forks have relied on a loose group of programs | networks and humans is a complex task. Building infrastructures from | |||
the start commonly demands huge know-how in different technologies, | ||||
but maintaining them afterwards is in many cases even more | ||||
demanding. Complex systems can fail in a multitude of different ways, | ||||
where the original problem might lead to symptoms which are often not | ||||
immediately obvious and likely occur in a different place relative to | ||||
the problem. | ||||
To ensure continuous and reliable operation the state and the | ||||
functionality have to be monitored permanently with appropriate tools. | ||||
These monitoring tools allow an operator, often a system and/or | ||||
network administrator, to read the state of the whole system (or | ||||
specific subsystems) and detect problems when they occur. | ||||
The purpose of monitoring tools is therefore to determine the state | ||||
of the systems in question and detect possible anomalies or problems, | ||||
measure performance-related metrics, process that data, produce a | ||||
human-readable representation and take action according to certain | ||||
rules, for example by notifying a system administrator. A system | ||||
implementing most or all of those tasks is called a "monitoring | ||||
system" in the following. | ||||
With the emergence of NetSaint/Nagios at the latest, a group of | ||||
network monitoring systems have relied on a loose group of programs | ||||
called "Monitoring Plugins" to do the lower level task of actually | called "Monitoring Plugins" to do the lower level task of actually | |||
determining the state of a particular entity or conduct measurements | determining the state of a particular entity or conduct measurements | |||
of certain values. | of certain values. | |||
This document shall help users and especially developers of those | The same interface was implemented by several different monitoring | |||
programs as a basis on how they should be implemented, how they | solutions, including, without claiming completeness, Icinga, Icinga2, | |||
should work and how they should behave. It encourages the | Shinken, Naemon, Centreon and Opsview. | |||
standardization of libraries, Monitoring Plugins and Monitoring | ||||
Systems, to reduce the cognitive load on users, administrators and | On the other side of this interface there are hundreds of individual | |||
developers, if they work with different implementations. | plugins developed by different people in different languages for a | |||
multitude of purposes. | ||||
Examples for these are: | ||||
* The monitoring plugins https://www.monitoring-plugins.org/ | ||||
* The nagios plugins https://nagios-plugins.org/ | ||||
* Several monitoring plugins by Consol https://labs.consol.de/de/ | ||||
nagios/ | ||||
* Several monitoring plugins by Linuxfabrik | ||||
https://github.com/Linuxfabrik/monitoring-plugins | ||||
* Several monitoring plugins by Davide Madrisan | ||||
https://github.com/madrisan/nagios-plugins-linux | ||||
This document shall serve administrators of those monitoring systems | ||||
and especially developers of these monitoring plugins and monitoring | ||||
systems as a basis on how this interface should be implemented, how the | ||||
plugins should work and how they should behave. It encourages the | ||||
standardization of libraries, monitoring plugins and monitoring | ||||
systems, to reduce the cognitive load on administrators and | ||||
developers, when they work with different implementations. | ||||
+--------------------+ | ||||
| Visualisation tool +------------+ | ||||
+--------------------+ | | ||||
| | ||||
+-----------------+ exec +-------------------+ | ||||
| Monitoring tool +--------+ Monitoring Plugin | | ||||
+-----------------+ +-------------------+ | ||||
+--------------------+ | | ||||
| Notification tool +------------+ | ||||
+--------------------+ | ||||
This document aims to be as general as possible and not to assume a | This document aims to be as general as possible and not to assume a | |||
special implementation detail, e.g. the programming language, the | special implementation detail, e.g. the programming language, the | |||
install mechanism or the monitoring system which executes the | install mechanism or the monitoring system which executes the | |||
Monitoring Plugin. | monitoring plugin. | |||
1.1. Wording, Context and Scope | ||||
1.1.1. Wording | ||||
1.1.1.1. Monitoring system | ||||
A _monitoring system_ is a collection of software components which | ||||
serve the purpose of providing the system administrator of a | ||||
particular system with an overview of the whole system. This ideally | ||||
includes all of the devices, machines and components and their state | ||||
as well as insights on particular components. | ||||
Most of the systems mentioned here (for example Icinga, Naemon and | ||||
Nagios) also provide a functionality to send notifications to the | ||||
system administrator when something goes wrong, e.g. a particular | ||||
component does not respond anymore or a certain threshold is | ||||
exceeded. | ||||
For the purpose of this document a monitoring system is just "the | ||||
thing that executes a monitoring plugin". | ||||
1.1.1.2. Monitoring plugin | ||||
A monitoring plugin is a standalone executable, which is executed by | ||||
the monitoring system to conduct one or multiple tests on its | ||||
behalf. | ||||
The monitoring plugin does not rely on functionality provided by the | ||||
monitoring system and is not a builtin of the monitoring system or | ||||
linked against certain components of the monitoring system. | ||||
Therefore it can also be executed manually and independently of a | ||||
particular monitoring system. | ||||
The monitoring plugin can therefore be implemented independently of | ||||
the monitoring system; it does not necessarily share | ||||
dependencies, the programming language, the distribution mechanism | ||||
or other components with the monitoring system. | ||||
The monitoring plugin MAY accept parameters in the form of command | ||||
line arguments, environment variables or configuration files (the | ||||
location of which MAY in turn be given on the command line or via | ||||
environment variable). | ||||
The monitoring plugin then proceeds to execute its duty and returns | ||||
the result to the Monitoring System. Part of the process of | ||||
returning the result is the termination of the execution of the | ||||
Monitoring Plugin itself. | ||||
The execution of a monitoring plugin is typically short lived (in the | ||||
order of seconds or milliseconds) and, while some implementations | ||||
store some state on non-volatile memory to be able to reason over | ||||
several execution cycles, this is not considered best practice. A | ||||
monitoring plugin cannot depend on any information other than that which | ||||
was given by the calling monitoring system, since the monitoring | ||||
system might execute the plugin from a different system in the next | ||||
cycle or switch between multiple systems. | ||||
A reasonable approach to thinking about monitoring plugins is to | ||||
picture a "snapshot" of a current state, each execution independent | ||||
from the others and wholly dependent on the input parameters and the | ||||
"thing" that is to be monitored. | ||||
This "thing" which is to be monitored is not, in principle, | ||||
restricted to any specific aspect of IT systems, apart from the | ||||
restrictions above and the general concept. Examples of areas which | ||||
are difficult to cover with this approach are statistical analyses | ||||
of time series data or event monitoring, such as log monitoring. | ||||
However, querying systems which collect and process this | ||||
kind of data would be a valid indirection. | ||||
A popular example for this behaviour is the extraction of bandwidth | ||||
usage on most switches. Commonly this is only exposed as counters | ||||
for each network interface, one for outgoing and one for incoming | ||||
bytes. The absolute value of bytes is practically useless without | ||||
knowing the value which was read previously and the time difference | ||||
between the probes. In this scenario, different workarounds are | ||||
possible, if the device itself does not provide the rate values: | ||||
* The monitoring plugin queries the values several times during its | ||||
execution cycle with a known time difference between the queries, | ||||
which allows rate calculation in this (typically) short time | ||||
frame. Effectively nothing is known about the time between | ||||
execution cycles. | ||||
* The monitoring system executes the monitoring plugin with the data | ||||
and date of the last execution as parameter, which allows for | ||||
proper rate calculation. | ||||
* A system queries the devices regularly for the absolute values and | ||||
stores them. The monitoring plugin then queries this system for | ||||
the collected values (and timestamps) or directly for a | ||||
statistical analysis. | ||||
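The rate calculation that all three workarounds rely on can be sketched as follows. This is a minimal illustration, not part of the interface: the function name counter_rate, its parameters and the single wrap-around assumption are hypothetical and chosen only for this example.

```python
def counter_rate(prev_value, prev_time, curr_value, curr_time, counter_bits=64):
    """Return the average rate (units per second) between two counter readings.

    prev_value/curr_value are absolute counter values (e.g. bytes),
    prev_time/curr_time are timestamps in seconds.
    """
    elapsed = curr_time - prev_time
    if elapsed <= 0:
        raise ValueError("readings must be taken at increasing times")
    delta = curr_value - prev_value
    if delta < 0:
        # Assumption: the counter wrapped around exactly once between readings.
        delta += 2 ** counter_bits
    return delta / elapsed
```

Whichever component stores the previous reading (the plugin itself, the monitoring system or a separate collector) only changes where prev_value and prev_time come from, not the calculation.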
1.1.2. Scope | ||||
The scope of this document is limited to the interaction of a | ||||
monitoring system and a monitoring plugin, meaning the interface | ||||
connecting them. | ||||
It does not attempt to describe the inner workings of a specific | ||||
implementation of either monitoring system or monitoring plugin. | ||||
2. Conventions and Definitions | 2. Conventions and Definitions | |||
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | |||
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and | "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and | |||
"OPTIONAL" in this document are to be interpreted as described in | "OPTIONAL" in this document are to be interpreted as described in | |||
BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all | BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all | |||
capitals, as shown here. | capitals, as shown here. | |||
2.1. Range expressions | 2.1. Range expressions | |||
In many cases thresholds for metrics mark a certain range of values | In many cases thresholds for metrics mark a certain range of values | |||
where the values is considered to be good or bad if it is inside or | where the values are considered to be good or bad depending on | |||
outside. While for significant number of metrics a upper (e.g. load | whether they are inside or outside of this range. For a significant | |||
on unixoid systems) or lower (e.g. effective throughput, free space | number of metrics an upper (e.g. load on unixoid systems) or lower | |||
in memory or storage) border might suffice, for some it does not, for | (e.g. effective throughput, free space in memory or storage) border | |||
example a temperature value from a temperature sensor should be | might suffice, for some it does not, for example a temperature value | |||
within a certain range (e.g. between 10℃ and 45℃). | from a temperature sensor should be within a certain range (e.g. | |||
between 10℃ and 45℃). | ||||
Regarding input parameters this might be handled with options like -- | Regarding input parameters this might be handled with options like -- | |||
critical-upper-temperature and --critical-lower-temperature, but this | critical-upper-temperature and --critical-lower-temperature, but this | |||
creates a problem in the performance data output, if only scalar | creates a problem in the performance data output, if only scalar | |||
values could be used. To resolve this situation the _Range | values could be used. To resolve this situation the _Range | |||
expression_ format was introduced, with the following definition: | expression_ format was introduced, with the following definition: | |||
[@][start:][end] | range-expression = [direction-switch] bounds | |||
bounds = (lower-bound / upper-bound) / lower-bound upper-bound | ||||
direction-switch = "@" | ||||
lower-bound = NUMERAL ":" | ||||
upper-bound = NUMERAL | ||||
NUMERAL = ["-"] 1*DIGIT [ "." 1*DIGIT ] ; numerical value, either integer or floating point | ||||
where: | where: | |||
1. At least start or end MUST be provided. | 1. At least start or end MUST be provided. | |||
2. start <= end | 2. start <= end | |||
3. If start == 0, then start can be omitted. | 3. If start == 0, then start can be omitted. | |||
4. If end is omitted, it has the "value" of positive infinity. | 4. If end is omitted, it has the "value" of positive infinity. | |||
skipping to change at page 4, line 29 ¶ | skipping to change at page 8, line 25 ¶ | |||
+------------------+-----------------------------------------------+ | +------------------+-----------------------------------------------+ | |||
| 10:20 | < 10 or > 20, (outside the range of {10 .. | | | 10:20 | < 10 or > 20, (outside the range of {10 .. | | |||
| | 20}) | | | | 20}) | | |||
+------------------+-----------------------------------------------+ | +------------------+-----------------------------------------------+ | |||
| @10:20 | ≥ 10 and ≤ 20, (inside the range of {10 .. | | | @10:20 | ≥ 10 and ≤ 20, (inside the range of {10 .. | | |||
| | 20}) | | | | 20}) | | |||
+------------------+-----------------------------------------------+ | +------------------+-----------------------------------------------+ | |||
Table 1 | Table 1 | |||
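As an illustration of the rules and the table above, a range expression could be parsed and evaluated roughly like this. It is a sketch under assumptions: the names parse_range and alerts are hypothetical, and only the syntax of the newer grammar (optional "@", optional lower bound followed by ":", optional upper bound) is handled.

```python
import math
import re

# NUMERAL as in the grammar: optional minus sign, digits, optional fraction.
_NUMERAL = r"-?\d+(?:\.\d+)?"
_RANGE_RE = re.compile(rf"(@)?(?:({_NUMERAL}):)?({_NUMERAL})?")


def parse_range(expr):
    """Parse a range expression into (inside, start, end)."""
    match = _RANGE_RE.fullmatch(expr)
    if match is None or (match.group(2) is None and match.group(3) is None):
        raise ValueError(f"invalid range expression: {expr!r}")
    inside = match.group(1) is not None
    # An omitted start means 0, an omitted end means positive infinity.
    start = float(match.group(2)) if match.group(2) is not None else 0.0
    end = float(match.group(3)) if match.group(3) is not None else math.inf
    if start > end:
        raise ValueError("start must not be greater than end")
    return inside, start, end


def alerts(value, expr):
    """Return True if value triggers the threshold described by expr."""
    inside, start, end = parse_range(expr)
    in_range = start <= value <= end
    # Without "@" the alert fires outside the range, with "@" inside it.
    return in_range if inside else not in_range
```

For example, alerts(25, "10:20") is true because 25 lies outside the range, while alerts(15, "@10:20") is true because 15 lies inside it.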
3. The basic Monitoring Plugin usage | 3. Input Parameters for a Monitoring Plugin | |||
A Monitoring System executes a Monitoring Plugin. The Monitoring | ||||
Plugin MAY accept parameters in the form of command line arguments, | ||||
environment variables or a configuration file (the location of which | ||||
MAY in turn be given on the command line or via environment | ||||
variable). The Monitoring Plugin then proceeds to execute its duty | ||||
and returns the result to the Monitoring System. Part of the process | ||||
of returning the result is the termination of the execution of the | ||||
Monitoring Plugin itself. | ||||
4. Input Parameters for a Monitoring Plugin | ||||
A Monitoring Plugin MUST expect input parameters as arguments during | A Monitoring Plugin MUST expect input parameters as arguments during | |||
execution, if any are needed/expected at all. It MAY accept these | execution, if any are needed/expected at all. It MAY accept these | |||
parameters given as _environment variables_ and it MAY accept them in | parameters given as _environment variables_ and it MAY accept them in | |||
a configuration file (with a default path or a path given via | a configuration file (with a default path or a path given via | |||
arguments or _environment variables_). | arguments or _environment variables_). | |||
In general positional arguments are strongly discouraged. | In general positional arguments are strongly discouraged. | |||
Some arguments MUST have this predetermined meaning, if they are | Some arguments MUST have this predetermined meaning, if they are | |||
skipping to change at page 9, line 5 ¶ | skipping to change at page 12, line 35 ¶ | |||
| | |plugins, but | | | | | | | |plugins, but | | | | | |||
| | |might be used | | | | | | | |might be used | | | | | |||
| | |to always | | | | | | | |to always | | | | | |||
| | |ignore errors | | | | | | | |ignore errors | | | | | |||
| | |(e.g. to just | | | | | | | |(e.g. to just | | | | | |||
| | |collect data). | | | | | | | |collect data). | | | | | |||
+----------+---------+---------------+-------------+--------+-----------+ | +----------+---------+---------------+-------------+--------+-----------+ | |||
Table 2 | Table 2 | |||
4.1. Examples | 3.1. Examples | |||
For the execution with --help: | For the execution with --help: | |||
$ my_check_plugin --help | $ my_check_plugin --help | |||
the output might look like this: | the output might look like this: | |||
my_check_plugin version 3.1.4 | my_check_plugin version 3.1.4 | |||
Licensed under the AGPLv1. | Licensed under the AGPLv1. | |||
Repository: git.example.com/jdoe/my_check_plugin | Repository: git.example.com/jdoe/my_check_plugin | |||
skipping to change at page 10, line 4 ¶ | skipping to change at page 13, line 44 ¶ | |||
purpose, lists the options in an easily readable way and even gives | purpose, lists the options in an easily readable way and even gives | |||
some examples. | some examples. | |||
For the execution with --version | For the execution with --version | |||
$ my_check_plugin --version | $ my_check_plugin --version | |||
the output might be a bit shorter: | the output might be a bit shorter: | |||
my_check_plugin version 3.1.4 | my_check_plugin version 3.1.4 | |||
or even: | or even: | |||
3.1.4 | 3.1.4 | |||
where both show the necessary information. | where both show the necessary information. | |||
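A plugin written in Python could provide --help and --version with the standard argparse module, as in the following sketch. The plugin name and version number are taken from the example above; the description text and the threshold option (borrowed from the range expression discussion) are placeholders for illustration only.

```python
import argparse


def build_parser():
    """Build the command line parser for a hypothetical plugin."""
    parser = argparse.ArgumentParser(
        prog="my_check_plugin",
        description="Hypothetical example check (placeholder description).",
    )
    # argparse adds -h/--help automatically; --version is added explicitly.
    parser.add_argument(
        "--version",
        action="version",
        version="%(prog)s version 3.1.4",
    )
    # Illustrative option only; real plugins define their own thresholds.
    parser.add_argument(
        "--critical-upper-temperature",
        type=float,
        help="upper temperature limit for the critical state",
    )
    return parser
```

Calling build_parser().parse_args() in the plugin's entry point then handles both options before any check logic runs.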
5. Output of a Monitoring Plugin | 4. Output of a Monitoring Plugin | |||
The output of a Monitoring Plugin consists of two parts on the first | The output of a Monitoring Plugin consists of two parts on the first | |||
level, the _Exit Code_ and output in textual form on _stdout_. | level, the _Exit Code_ and output in textual form on _stdout_. | |||
5.1. Exit Code | 4.1. Exit Code | |||
The Monitoring Plugin MUST make use of the _Exit Code_ as a method to | The Monitoring Plugin MUST make use of the _Exit Code_ as a method to | |||
communicate a result to the Monitoring System. Since the _Exit Code_ | communicate a result to the Monitoring System. Since the _Exit Code_ | |||
is more or less standardized over different systems as an integer | is more or less standardized over different systems as an integer | |||
number with a width of or greater than 8bit, the following mapping is | number with a width of or greater than 8bit, the following mapping is | |||
used: | used: | |||
+=============+==========+=========================================+ | +=============+==========+=========================================+ | |||
| Exit Code | Meaning | Meaning (extended) | | | Exit Code | Meaning | Meaning (extended) | | |||
| (numerical) | (short) | | | | (numerical) | (short) | | | |||
skipping to change at page 12, line 5 ¶ | skipping to change at page 16, line 5 ¶ | |||
| | | reliable statement about it. | | | | | reliable statement about it. | | |||
+-------------+----------+-----------------------------------------+ | +-------------+----------+-----------------------------------------+ | |||
| 4-125 | reserved | | | | 4-125 | reserved | | | |||
| | for | | | | | for | | | |||
| | future | | | | | future | | | |||
| | use | | | | | use | | | |||
+-------------+----------+-----------------------------------------+ | +-------------+----------+-----------------------------------------+ | |||
Table 3 | Table 3 | |||
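The use of the exit code can be sketched as follows. The table rows for the individual states are not reproduced in this excerpt, so the sketch assumes the mapping commonly used by existing monitoring plugins (0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN); the names State and run are hypothetical.

```python
import enum


class State(enum.IntEnum):
    # Assumed mapping, following the widespread plugin convention.
    OK = 0
    WARNING = 1
    CRITICAL = 2
    UNKNOWN = 3


def run(check):
    """Run a check callable and return the exit code to terminate with.

    check() is expected to return a (State, message) tuple. Any
    unexpected error is reported as UNKNOWN, since no reliable
    statement about the monitored entity can be made in that case.
    """
    try:
        state, message = check()
    except Exception as exc:
        state, message = State.UNKNOWN, f"check failed: {exc}"
    print(message)
    return int(state)

# In a real plugin the result would be passed to the operating system:
#     import sys
#     sys.exit(run(my_check))
```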
5.2. Textual Output | 4.2. Textual Output | |||
The original purpose of the output on _stdout_ was to provide human | The textual output should consist of printable characters; the end of this | |||
readable information for the user of the Monitoring System, a way for | output is marked by EOF. There is no length limitation per se, but | |||
the Monitoring Plugin to communicate further details on what | a limit of 512 KiB would be reasonable and should not be exceeded | |||
happened. This purpose still exists, but was expanded with the, so | to avoid influencing the performance of the monitoring system; also, | |||
called, performance data to allow the machine readable communication | some systems might limit the output arbitrarily. | |||
of measured values for further processing in the Monitoring System, | ||||
e.g. for the creation of diagrams. | ||||
Therefore the further explanation is split into _human readable | The original purpose of the output on _stdout_ was to provide | |||
output_ and _performance data_. | information for the user of the Monitoring System in a free text | |||
form; a way for the Monitoring Plugin to communicate further details | ||||
on what happened and what the current state is. This purpose still | ||||
exists, but was expanded with the, so called, performance data to | ||||
allow the machine readable communication of measured values for | ||||
further processing in the Monitoring System, e.g. for the creation of | ||||
diagrams. | ||||
5.2.1. Human readable output | Therefore the further explanation is split into _free form output_ | |||
and _performance data_. The general schema is the following: | ||||
output = free-form-part *1( separator performance-data ) | ||||
separator = "|" | ||||
unicode-without-separator = %x0A-7B / %x7D-10FFFF ; UTF-8 encoded unicode code point without "|" | ||||
free-form-part = *unicode-without-separator | ||||
labelchar = %x0A-1f / %x21-10FFFF ; UTF-8 encoded unicode code point without " " | ||||
labelstring-with-space = labelchar ( " " / labelchar ) labelchar ; no spaces at beginning or end | ||||
label = 1*labelchar / "'" labelstring-with-space "'" ; if the label contains spaces, surround it with ' | ||||
UOM = 1*CHAR ; unit of measurement, a common unit specifier like "B" for Bytes or "s" for seconds | ||||
warning-value = range-expression | ||||
critical-value = range-expression | ||||
min-value = NUMERAL | ||||
max-value = NUMERAL | ||||
performance-data-value = label "=" NUMERAL *1( UOM ) *1( ";" *1warning-value *1( ";" *1critical-value *1( ";" *1min-value *1( ";" max-value )))) | ||||
performance-data = *performance-data-value *( " " performance-data-value ) | ||||
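To make the schema more concrete, the following sketch assembles an output string from a free form message and a list of performance data values. The function names and parameters are hypothetical, the label checks follow the usual constraints (no "=" and no single quotes, quoting when spaces are present), and no whitespace is emitted around the separator because the grammar does not provide for any.

```python
def _numeral(value):
    """Render a number without scientific notation."""
    if isinstance(value, int):
        return str(value)
    # Fixed-point formatting, then strip superfluous zeros.
    return f"{value:.6f}".rstrip("0").rstrip(".")


def format_perfdata(label, value, uom="", warn="", crit="",
                    minimum=None, maximum=None):
    """Format one value as label=value[UOM][;warn[;crit[;min[;max]]]]."""
    if not label.strip() or "=" in label or "'" in label:
        raise ValueError(f"invalid label: {label!r}")
    if " " in label:
        label = f"'{label}'"
    fields = [
        f"{label}={_numeral(value)}{uom}",
        warn,   # range expression or empty
        crit,   # range expression or empty
        "" if minimum is None else _numeral(minimum),
        "" if maximum is None else _numeral(maximum),
    ]
    # Trailing empty fields are optional and therefore dropped.
    while fields[-1] == "":
        fields.pop()
    return ";".join(fields)


def build_output(message, perfdata=()):
    """Join the free form part and the performance data with "|"."""
    if "|" in message:
        raise ValueError("free form part must not contain '|'")
    if not perfdata:
        return message
    return message + "|" + " ".join(perfdata)
```

A call such as build_output("Load is OK", [format_perfdata("load1", 0.5, warn="2", crit="4")]) yields a single string that a monitoring system can split at the first "|".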
4.2.1. Free form output | ||||
This part of the output should give a user information about the | This part of the output should give a user information about the | |||
state of the test and, in the case of problems, ideally hint what the | state of the test and, in the case of problems, ideally hint what the | |||
origin of the problem might be or what the symptoms are. If the test | origin of the problem might be or what the symptoms are. If the test | |||
relies on numeric values, this might be displayed to give a user | relies on numeric values, this might be displayed to give a user | |||
more information about the specific problem. It might consist of one | more information about the specific problem. It might consist of one | |||
or more lines of printable symbols. | or more lines (separated by CRLF or LF) of unicode symbols. | |||
Considering the age and implementation of current systems, | ||||
restricting the output to US-ASCII characters is a safe choice. | ||||
Although no strict guidelines for creating this part of the output | Although no strict guidelines for creating this part of the output | |||
can really be given, a developer should keep a potential user in | can really be given due to its free form character, a developer | |||
mind. It might, for example, be OK to put the output in a single | should keep a potential reader in mind. It might, for example, be OK | |||
line if only one or two items of a similar type (think: | to put the output in a single line if only one or two items | |||
multiple file systems, multiple sensors, etc.) are present, but not | of a similar type (think: multiple file systems, multiple sensors, | |||
if there are 10 or 100, although this might present a valid use case. If | etc.) are present, but not if there are 10 or 100, although this might | |||
several different items exist in the output of the | present a valid use case. If several different items | |||
Monitoring Plugin they probably SHOULD be given their own line in the | exist in the output of the Monitoring Plugin they probably SHOULD be | |||
output. | given their own line in the output. | |||
5.2.1.1. Examples | The free form part is not intended for deep diagnostics, logging or | |||
too detailed reports, therefore it should be kept rather short. | ||||
4.2.1.1. Examples | ||||
Remaining space on filesystem "/" is OK | Remaining space on filesystem "/" is OK | |||
Sensor temperature is within thresholds | Sensor temperature is within thresholds | |||
Available Memory is too low | Available Memory is too low | |||
Sensor temperature exceeds thresholds | Sensor temperature exceeds thresholds | |||
are OK, but | are OK, but | |||
Remaining space on filesystem "/" is OK ( 62GiB / 128GiB ) | Remaining space on filesystem "/" is OK ( 62GiB / 128GiB ) | |||
Sensor temperature is within thresholds ( 42°C ) | Sensor temperature is within thresholds ( 42°C ) | |||
Available Memory is too low ( 126MiB / 32GiB ) | Available Memory is too low ( 126MiB / 32GiB ) | |||
Sensor temperature exceeds thresholds ( 78°C > 70°C ) | Sensor temperature exceeds thresholds ( 78°C > 70°C ) | |||
are better. | are better. | |||
5.2.2. Performance data | 4.2.2. Performance data | |||
In addition to the human readable part the output can contain machine | ||||
readable measurement values. These data points are separated from | ||||
the human readable part by the "|" symbol which is in effect until | ||||
the end of the output. The performance data then MUST consist of | ||||
space (ASCII 0x20) separated single values, these MUST have the | ||||
following format: | ||||
[']label[']=value[UOM][;warn[;crit[;min[;max]]]] | In addition to the free form part the output can contain machine | |||
readable measurement values. | ||||
with the following definitions: | In addition to the format definition earlier, the following contains | |||
some constraints and best practices: | ||||
1. label MUST consist of at least on non-space character, but can | 1. label MUST consist of at least one non-space character, but can | |||
otherwise contain any printable characters except for the equals | otherwise contain any printable characters except for the equals | |||
sign (=) or single quotes ('). If it contains spaces, it must be | sign (=) or single quotes ('). If it contains spaces, it must be | |||
surrounded by single quotes | surrounded by single quotes | |||
2. value is a numerical value, might be either an integer or a | 2. value is a numerical value, might be either an integer or a | |||
floating point number. Using floating point numbers if the value | floating point number. Using floating point numbers if the value | |||
is really discrete SHOULD be avoided. The representation of a | is really discrete SHOULD be avoided. The representation of a | |||
floating point number SHOULD NOT use the "scientific notation" | floating point number SHOULD NOT use the "scientific notation" | |||
(e.g. 6.02e23 or -3e-45), since some systems might not be able to | (e.g. 6.02e23 or -3e-45), since some systems might not be able to | |||
parse them correctly. Values with a base other than 10 SHOULD be | parse them correctly. Values with a base other than 10 SHOULD be | |||
skipping to change at page 14, line 21 ¶ | skipping to change at page 19, line 5 ¶ | |||
since there are obviously only integer numbers included. | since there are obviously only integer numbers included. | |||
2. The UOM for time is s, meaning seconds, SI-Prefixes (e.g. | 2. The UOM for time is s, meaning seconds, SI-Prefixes (e.g. | |||
ms for milliseconds) are allowed if necessary or useful. | ms for milliseconds) are allowed if necessary or useful. | |||
3. In general, SI units and SI prefixes MAY be used as UOM if | 3. In general, SI units and SI prefixes MAY be used as UOM if | |||
applicable, but the Monitoring System may not understand | applicable, but the Monitoring System may not understand | |||
them correctly (mostly in uncommon cases); in those cases | them correctly (mostly in uncommon cases); in those cases | |||
appropriate workarounds MAY be applied on the side of the | appropriate workarounds MAY be applied on the side of the | |||
Monitoring Plugin. Since the values are not intended to | Monitoring Plugin. Since the values are not intended to | |||
be human readable normalized units are recommended (e.g. | be human readable, normalized units are recommended (e.g. | |||
overall_power=14000000000W instead of overall_power=14GW) | overall_power=14000000000W instead of overall_power=14GW) | |||
4. warn and crit are the threshold values for this | 4. warn and crit are the threshold values for this | |||
measurement, which may have been given by the user as | measurement, which may have been given by the user as | |||
input, may be hardcoded in the Monitoring Plugin or may be | input, may be hardcoded in the Monitoring Plugin or may be | |||
retrieved from a file or a device or somewhere else during | retrieved from a file or a device or somewhere else during | |||
the execution of the Monitoring Plugin. The unit used | the execution of the Monitoring Plugin. The unit used | |||
MUST be the same as for _value_. These values are not | MUST be the same as for _value_. These values are not | |||
simple numbers, but range expressions (Section 2.1). | simple numbers, but range expressions (Section 2.1). | |||
5. min and max are the minimum and maximum values, respectively, | 5. min and max are the minimum and maximum values, respectively, | |||
that the value can take. The unit MUST be the same as | that the value can take. The unit MUST be the same as | |||
for value. These values can be omitted, if the value is a | for value. These values can be omitted, if the value is a | |||
percentage value, since min and max are always 0 and 100 | percentage value, since min and max are always 0 and 100 | |||
in this case. | in this case. | |||
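As a rough illustration of the rules above, the following Python sketch renders a single performance data item in the form [']label[']=value[UOM][;warn[;crit[;min[;max]]]]. It is an assumption-laden sketch, not a reference implementation: the function names are hypothetical, range expressions for warn and crit are passed through as opaque strings, and leaving skipped fields empty between semicolons is assumed to be acceptable to the Monitoring System.

```python
from decimal import Decimal


def format_number(value) -> str:
    """Render a number in plain base-10 notation, never scientific."""
    text = format(Decimal(str(value)), "f")
    if "." in text:
        # Drop insignificant trailing zeros, e.g. "14000000000.0".
        text = text.rstrip("0").rstrip(".")
    return text


def format_label(label: str) -> str:
    """Validate a label and quote it if it contains spaces."""
    if not label.strip() or "=" in label or "'" in label:
        raise ValueError(f"invalid perfdata label: {label!r}")
    return f"'{label}'" if " " in label else label


def format_perfdata(label, value, uom="", warn=None, crit=None,
                    minimum=None, maximum=None) -> str:
    """Build one perfdata item; unset trailing fields are omitted."""
    fields = [warn, crit, minimum, maximum]
    while fields and fields[-1] is None:
        fields.pop()
    rendered = []
    for field in fields:
        if field is None:
            rendered.append("")  # assumption: empty slot for a skipped field
        elif isinstance(field, str):
            rendered.append(field)  # range expression, passed through as-is
        else:
            rendered.append(format_number(field))
    head = f"{format_label(label)}={format_number(value)}{uom}"
    return ";".join([head] + rendered)


if __name__ == "__main__":
    items = [
        format_perfdata("overall_power", 14e9, "W"),
        format_perfdata("response time", 0.25, "s", warn=1, crit=2),
        format_perfdata("users", 7, minimum=0, maximum=50),
    ]
    # Single items are separated by a space (ASCII 0x20).
    print(" ".join(items))
```

Going through Decimal avoids the scientific notation that a float such as 14e9 or 6.02e23 might otherwise produce when converted to text, which matches the recommendation to emit normalized base-10 values.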
6. Implementation Status | 5. Implementation Status | |||
The interface mentioned here is implemented by several network | The interface mentioned here is implemented by several network | |||
monitoring systems. A non-exhaustive list of these systems includes: | monitoring systems. A non-exhaustive list of these systems includes: | |||
* Icinga 2 | * Icinga 2 | |||
* Naemon | * Naemon | |||
* Nagios | * Nagios | |||
The other side of the interface is implemented by several different | The other side of the interface is implemented by several different | |||
projects, again in a non-exhaustive list: | projects, again in a non-exhaustive list: | |||
* The Monitoring Plugins Project | * The Monitoring Plugins Project | |||
* The Nagios Plugins Project | * The Nagios Plugins Project | |||
* The Linuxfabrik Monitoring Plugins | * The Linuxfabrik Monitoring Plugins | |||
* Madrisan Nagios Plugins | * Madrisan Nagios Plugins | |||
7. Security Considerations | 6. Security Considerations | |||
Special security considerations are hard to define regarding this | Special security considerations are hard to define regarding this | |||
topic. Regarding the implementation of this interface, the usual | topic. Regarding the implementation of this interface, the usual | |||
programming security considerations should apply (e.g. sanitize | programming security considerations should apply (e.g. sanitize | |||
inputs), but the risks and problems regarding security are dependent | inputs), but the risks and problems regarding security are dependent | |||
on the specific implementation and usage. | on the specific implementation and usage. | |||
8. IANA Considerations | 7. IANA Considerations | |||
This document has no IANA actions. | This document has no IANA actions. | |||
9. Normative References | 8. Normative References | |||
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | |||
Requirement Levels", BCP 14, RFC 2119, | Requirement Levels", BCP 14, RFC 2119, | |||
DOI 10.17487/RFC2119, March 1997, | DOI 10.17487/RFC2119, March 1997, | |||
<https://www.rfc-editor.org/rfc/rfc2119>. | <https://www.rfc-editor.org/rfc/rfc2119>. | |||
[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC | [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC | |||
2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, | 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, | |||
May 2017, <https://www.rfc-editor.org/rfc/rfc8174>. | May 2017, <https://www.rfc-editor.org/rfc/rfc8174>. | |||
End of changes. 33 change blocks. | ||||
90 lines changed or deleted | 286 lines changed or added | |||