.TH STEVEDORE 8 "October 11, 2020" "TALLYMAN" "Tallyman User Reference"
.SH NAME
stevedore \- container state collector and SNMP agent daemon
.SH SYNOPSIS
.na
.nh
\fBstevedore\fR\
 [\fB\-Fsd\fR]\
 [\fB\-f\fR \fIFILE\fR]\
 [\fB\-\-config\-file=\fIFILE\fR]\
 [\fB\-\-foreground\fR]\
 [\fB\-\-single\fR]\
 [\fB\-\-debug\fR]
.sp
\fBstevedore\fR\
 \fB\-?\fR |\
 \fB\-\-help\fR |\
 \fB\-\-config\-help\fR |\
 \fB\-V\fR |\
 \fB\-\-version\fR
.ad
.hy
.SH DESCRIPTION
Monitoring the health state of a collection of Docker containers is
based on the premise that each container is responsible for a certain
.IR service ,
which is assigned an identifier (\fISID\fR). In contrast to
container IDs, service IDs are not necessarily unique for each
container. It is quite OK (and even common) for several containers to
have the same SID. This can happen, for example, if one runs a distributed
database server, with one container running the master server and the rest
running its slaves.
.PP
Each container is supposed to run the
.BR tallyman (1)
command as part of its
.B HEALTHCHECK
configuration. This tool takes as its argument the command line that
does the actual checking, collects its result and sends it to the
\fBstevedore\fR daemon, which acts as a collector (see
.BR tallyman (1)
for details).
.PP
The purpose of \fBstevedore\fR is two-fold. First, it provides a
RESTful service that collects health check reports from multiple
containers. Second, it acts as an SNMP subagent, delivering the
collected information.
.SH CONFIGURATION
The program reads its configuration from the file \fB/etc/stevedore.conf\fR
(the exact location can differ depending on how the package was
configured; if unsure, examine the output of
.BR "stevedore \-\-help" ).
The file must exist and be readable.
.PP
The configuration consists of statements. Each statement begins with
a keyword, followed by one or more arguments and is terminated with a
semicolon. Arguments containing whitespace or special characters
.RB ( { ,
.BR } ,
or
.BR ; )
must be quoted.
.PP
Whitespace characters (space, tab and
newline) are ignored except as they serve to separate tokens. Comments
can be introduced by \fB#\fR or \fB//\fR, in which case they extend
to the end of the physical line, or enclosed between
.BR "/* " and " */" ,
in which case they can occupy multiple lines. Comments may appear
anywhere whitespace may appear in the configuration file.
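.PP
The lexical rules above can be illustrated with the following sketch.
The address and the service name are hypothetical, and it is assumed
that such values need no quoting because they contain none of the
special characters listed above:
.EX
# A comment that extends to the end of the line.
// Another single-line comment.
/* A comment that
   occupies several lines. */
listen 127.0.0.1:8990;
service            /* newlines merely separate tokens, */
    web;           /* so a statement may span lines    */
.EE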
.PP
.SS Statements
The following statements can appear in the configuration file:
.TP
.BI "listen " IP : PORT
Listen on this IP address and port. Default is \fB0.0.0.0:8990\fR,
i.e. all available IP addresses, port 8990.
.TP
.BI "pidfile " FILE
Store PID of the daemon process in \fIFILE\fR. If this statement is
not supplied, no pidfile will be used.
.TP
.BI "user " UID
Run as this user. \fIUID\fR is either the user login name or a numeric
UID prefixed with a plus sign.
.TP
.BI "group " GID
Run with the privileges of this group. \fIGID\fR is either the group name or
a numeric GID prefixed with a plus sign. In the absence of this
statement, the primary group of the \fIUID\fR specified with the \fBuser\fR
statement will be used. Auxiliary groups of \fIUID\fR are always honored.
.TP
.BI "service " SID
Define service to monitor. This is actually the only statement that
must be present in the configuration file. It informs \fBstevedore\fR
that it will be receiving updates about service ID \fISID\fR and
instructs it to create SNMP OIDs for reporting the state of this
service.
.sp
There should be as many \fBservice\fR statements as there are services
to monitor.
.TP
.BI "instance-state-ttl " SECONDS
Sets the time during which the state of the instance (container) is
retained in cache. If no update arrives during the specified number of
seconds, the container is marked as \fBexpired\fR. Default is 30
seconds.
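.PP
Taken together, these statements might be combined as in the sketch
below. All values (address, file name, user, group and service names)
are hypothetical and must be adapted to the actual installation; only
the \fBservice\fR statements are mandatory:
.EX
listen 127.0.0.1:8990;
pidfile /var/run/stevedore.pid;
user nobody;
group nogroup;

# Two services to monitor; several containers may share one SID.
service web;
service db;

# Mark a container as expired after 60 seconds without updates.
instance\-state\-ttl 60;
.EE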
.SS Hostproc notification
.B Hostproc
is an SNMP agent that provides detailed information about processes
running on a host.  The agent features extensive aggregation capabilities
that allow the system administrator to obtain various metrics
for a group of processes.  \fBStevedore\fR can communicate with
\fBhostproc\fR and define process groups for each configured
service.
.PP
Process groups are updated by sending a specially formatted SNMP SET
request to the IP address of the host running the \fBhostproc\fR
agent.  Normally, this is the same host where \fBstevedore\fR is
installed, but that is not required.  The notification is configured
using the following statements:
.TP
.BI "hostproc\-server " HOST
Sets the hostname or IP address of the server running
\fBhostproc\fR.  An optional port can be specified by following the
argument with a colon and the port number.
.TP
.BI "snmp-client-config " FILE
Sets the name of the SNMP client configuration file used for
\fBhostproc\fR notification.  \fIFILE\fR should be in the format
described in
.BR snmp.conf (5).
.sp
In the absence of this statement, the system-wide
.B snmp.conf
will be read.
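.PP
For example, the following sketch directs notifications to a
\fBhostproc\fR agent reachable on the local host. The address, the
port number and the file name are hypothetical:
.EX
hostproc\-server 127.0.0.1:161;
snmp\-client\-config /etc/stevedore/snmp.conf;
.EE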
.SS Syslog configuration
Unless the program is started in foreground mode (see the \fB\-F\fR
option), its logging output goes to syslog facility \fBdaemon\fR. The
syslog configuration can be changed using the following
.IR "block statement" :
.EX
syslog {
  facility NAME;
  tag STRING;
}
.EE
.PP
The substatements are:
.TP
.BI "facility " NAME
Set syslog facility. \fINAME\fR is one of:
.BR user ,
.BR daemon ,
.BR auth ,
.BR authpriv ,
.BR mail ,
.BR cron ,
.B local0
through
.B local7
(case-insensitive), or a decimal facility number.
.TP
.BI "tag " STRING
Tag syslog messages with this string, instead of the program name.
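.PP
As an illustration, the sketch below sends messages to the facility
\fBlocal3\fR and marks them with a custom tag. Both values are chosen
arbitrarily:
.EX
syslog {
  facility local3;
  tag containers;
}
.EE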
.SH OPTIONS
.TP
\fB\-f\fR, \fB\-\-config\-file=\fIFILE\fR
Read configuration from \fIFILE\fR.
.TP
\fB\-\-config\-help\fR
Describe configuration file syntax and variables.
.TP
\fB\-F\fR, \fB\-\-foreground\fR
By default, \fBstevedore\fR disconnects itself from the controlling
terminal and runs as a daemon. This option disables this behavior,
instructing it to remain in the foreground and print its diagnostic
messages on standard error, instead of using the syslog interface. Use
it for debugging.
.TP
\fB\-s\fR, \fB\-\-single\fR
By default, the program runs in two-process mode: there is a top-level
sentinel process that starts a single worker process and restarts it
if it exits on error or signal. The purpose of this design is to catch
and recover from possible bugs.
.sp
This option instructs \fBstevedore\fR to start the worker process
directly.
.TP
\fB\-d\fR, \fB\-\-debug\fR
Increase debug verbosity.
.TP
\fB\-?\fR, \fB\-\-help\fR
Display a short usage summary.
.TP
\fB\-V\fR, \fB\-\-version\fR
Display program version and licensing information and exit.
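.PP
For example, when debugging a new configuration, the options above can
be combined to keep the program in the foreground as a single process
with increased verbosity. The configuration file name is hypothetical:
.EX
stevedore \-F \-s \-d \-f ./stevedore.conf
.EE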
.SH MIB
The MIB is kept in the file \fBTALLYMAN-MIB.txt\fR, which is normally
installed in the location where \fBnet-snmp\fR tools expect to find
their MIBs.
.PP
The following OIDs are defined:
.TP
.B servicesUpTime.0
Total uptime of the Stevedore server.
.TP
.B servicesTotal.0
Total number of configured services.
.TP
.B servicesRunning.0
Number of running services, i.e. services that have at least one running
container.
.TP
.B serviceTable
This branch provides a conceptual table of services with the
corresponding statistics. It is indexed by \fBserviceIndex\fR.  Each row
has the following elements:
.RS
.TP
.B serviceName
Name of the service.
.TP
.B serviceInstances
Number of running instances (containers) in this service.
.RE
.TP
.B instanceTable
This branch provides a conceptual table of instances and is indexed by
\fBinstanceIndex\fR. Each row has the following OIDs:
.RS
.TP
.B instanceName
Hostname of the instance.
.TP
.B instanceService
Service name (ID) of the instance.
.TP
.B instanceState
State of the instance. Possible values are:
.BR stopped ,
.BR running ,
.BR expired ,
and
.BR error .
.TP
.B instanceTimeStamp
Time of the last successful probe.
.TP
.B instanceErrorMessage
Error message associated with this instance if \fBinstanceState\fR is
\fBerror\fR.
.RE
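.PP
As an illustration, these objects could be queried with the
\fBnet-snmp\fR command line tools roughly as shown below. This is a
sketch only: it assumes that the MIB module is named
\fBTALLYMAN-MIB\fR (after its file name), that the master agent
listens on \fBlocalhost\fR, and that it accepts SNMPv2c requests with
the community \fBpublic\fR. None of these details are specified in
this manual.
.EX
snmpget \-v 2c \-c public localhost TALLYMAN\-MIB::servicesRunning.0
snmpwalk \-v 2c \-c public localhost TALLYMAN\-MIB::instanceTable
.EE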
.SH "SEE ALSO"
.BR tallyman (1),
.BR snmp.conf (5),
.BR hostproc (8),
or
.BR http://puszcza.gnu.org.ua/software/hostproc .
.SH AUTHORS
Sergey Poznyakoff
.SH "BUG REPORTS"
Report bugs to <gray@gnu.org>.
.SH COPYRIGHT
Copyright \(co 2018\-2020 Sergey Poznyakoff
.br
.na
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
.br
.ad
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
.\" Local variables:
.\" eval: (add-hook 'write-file-hooks 'time-stamp)
.\" time-stamp-start: ".TH [A-Z_][A-Z0-9_.\\-]* [0-9] \""
.\" time-stamp-format: "%:B %:d, %:y"
.\" time-stamp-end: "\""
.\" time-stamp-line-limit: 20
.\" end:
