* Overview

Tallyman is a tool for monitoring the health status of Docker
containers and reporting it via SNMP.

* The Package

The package provides two executable files:

** tallyman

  A helper program to be run as HEALTHCHECK CMD within containers

** stevedore

  SNMP agent for serving the collected statistics

In addition, the file TALLYMAN-MIB.txt contains the Management
Information Base for monitoring container status.

* Container Configuration

Each container is assumed to be responsible for a certain "service".
Each service is assigned a name. Multiple containers can run the same
service (for example, you can have several database containers).

Containers are configured to run tallyman as their healthcheck
command. The utility takes two or more arguments. The first argument
is the name of the service the container is responsible for. The
remaining arguments supply the name of the actual health-checking
program and its command line arguments. Tallyman runs this command,
collects its standard error and standard output, packs them along
with the program exit code into a JSON packet, and sends this packet
to a predefined address using an HTTP POST request. It then exits with
the same code as the health-checking program it ran. To the container,
the effect of running tallyman is the same as if it ran the
health-checking program itself: exit code, standard error and
standard output are all preserved. At the same time, they are copied
to the collector listening on the predefined address outside the
container. This collector is the "stevedore" program, described below.
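
The wrapping behaviour described above can be illustrated with a short
Python sketch. It is not tallyman's implementation: the function names
and the JSON field names ("service", "status", "stdout", "stderr") are
hypothetical, chosen only for illustration, and the actual packet
layout may differ.

```python
import json
import subprocess
import sys
import urllib.request


def run_check(service, command):
    """Run the health-check command and build a report packet.

    Returns (exit_code, packet), where packet is a JSON string.
    The field names are hypothetical, not tallyman's wire format.
    """
    proc = subprocess.run(command, capture_output=True, text=True)
    packet = json.dumps({
        "service": service,
        "status": proc.returncode,
        "stdout": proc.stdout,
        "stderr": proc.stderr,
    })
    # Re-emit the captured streams, so that the caller sees the same
    # output as if it had run the health-checking program directly.
    sys.stdout.write(proc.stdout)
    sys.stderr.write(proc.stderr)
    return proc.returncode, packet


def send_report(url, packet, timeout=5):
    """POST the packet to the collector at the given (assumed) URL."""
    request = urllib.request.Request(
        url,
        data=packet.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

A wrapper built this way would call run_check, pass the packet to
send_report, and finally exit with the returned code, so that the
container's health status is decided by the wrapped program alone.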

Suppose, for example, that you run several database containers with
MySQL and name the corresponding service "DB". You could then specify
the following statement in the Dockerfile for creating these
containers:

  HEALTHCHECK CMD /sbin/tallyman DB mysqladmin ping

* Stevedore: the Collector Daemon

Stevedore performs two important tasks. First, it collects health
reports coming from various containers and stores them in its cache.
Secondly, it acts as a subagent of the snmpd daemon, serving these
data on request.

By default, stevedore listens on port 8990 on all available
interfaces. Tallyman, in turn, sends its report to port 8990 on the
gateway address of the container it runs in. This means that as long
as you have only one docker farm, you don't need to explicitly
configure the IP address or port on either side.
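
To illustrate what "the gateway address of the container" means, here
is a minimal Python sketch that extracts the default gateway from text
in the format of /proc/net/route on Linux. This is only one possible
way to find the gateway and is not necessarily how tallyman does it;
the function names are hypothetical.

```python
import socket
import struct

RTF_GATEWAY = 0x2  # route flag: the entry uses a gateway


def default_gateway(route_table):
    """Return the default gateway from /proc/net/route text, or None."""
    for line in route_table.splitlines()[1:]:   # skip the header line
        fields = line.split()
        if len(fields) < 4:
            continue
        destination, gateway, flags = fields[1], fields[2], fields[3]
        if destination == "00000000" and int(flags, 16) & RTF_GATEWAY:
            # Addresses are hex-encoded 32-bit values, little-endian
            return socket.inet_ntoa(struct.pack("<L", int(gateway, 16)))
    return None


def collector_address(route_table, port=8990):
    """Pair the default gateway with the default stevedore port."""
    gateway = default_gateway(route_table)
    return None if gateway is None else (gateway, port)
```

Inside a container, the function could be fed the actual routing
table, for example: collector_address(open("/proc/net/route").read()).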

If you have several servers running docker containers, you can supply
the address of the collector to tallyman using the -s (--server)
option.

Stevedore reads its configuration from the file /etc/stevedore.conf.
The configuration consists of statements. Each statement begins with
a keyword, followed by one or more arguments, and is terminated with a
semicolon. Whitespace characters (horizontal space, tabulation and
newline) are ignored except as they serve to separate tokens. Comments
can be introduced by '#' or '//', in which case they extend to the end
of the physical line, or enclosed between '/*' and '*/', in which case
they can occupy multiple lines. For a detailed discussion of the
available keywords, see stevedore(3) or run

   stevedore --config-help

which will output a succinct summary. Here we will mention only the
most important (and actually, the only required) statement:

   service NAME ;

This statement informs stevedore that it will be receiving updates
about the service NAME. There must be a separate "service" statement
for each service you are planning to monitor.

Continuing the example from the previous section, after configuring
the HEALTHCHECK in the container setup, you will need to add the line

   service DB;

to your /etc/stevedore.conf and restart the daemon.
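
The lexical rules described in this section (semicolon-terminated
statements, insignificant whitespace, and the three comment styles)
can be illustrated with a small Python sketch. It is not stevedore's
parser: the function name is hypothetical, quoted strings are not
handled, and no m4 preprocessing is performed.

```python
import re

# '/* ... */' comments may span several lines; '#' and '//' comments
# extend to the end of the physical line.
_COMMENT = re.compile(r"/\*.*?\*/|(?:#|//)[^\n]*", re.DOTALL)


def parse_statements(text):
    """Split configuration text into statements (lists of tokens)."""
    text = _COMMENT.sub(" ", text)
    chunks = text.split(";")
    # Whatever follows the last semicolon is an unterminated statement
    if chunks[-1].split():
        raise ValueError("statement not terminated with a semicolon")
    statements = []
    for chunk in chunks[:-1]:
        tokens = chunk.split()      # whitespace only separates tokens
        if tokens:
            statements.append(tokens)
    return statements
```

For instance, the text "service DB; # database" yields a single
statement consisting of the tokens "service" and "DB".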

* Configuring snmpd

Add the following statement to the /etc/snmp/snmpd.conf file:

   master agentx

Depending on the privileges with which stevedore is run, you might
need to add the agentXPerms statement to fix up ownership and
permissions of the agentx socket. Please refer to the snmpd(8) and
snmpd.conf(5) documentation for details.

* How to Build

Prerequisites:

 - Net-SNMP            <http://www.net-snmp.org>
 - libmicrohttpd       <https://www.gnu.org/software/libmicrohttpd>

Usual incantations apply:

   ./configure
   make
   make install

The last command normally requires root privileges.

Please refer to the file INSTALL for details about common options to
configure. Apart from the ones discussed there, the following two are
of interest:

** --without-preprocessor

By default, stevedore uses m4(1) to preprocess its configuration file
prior to parsing. Use this option to disable preprocessing.

** --with-mibdir=DIR

Where to install the TALLYMAN-MIB.txt file. Default is

  $(datarootdir)/snmp/mibs

where $(datarootdir) is the directory for read-only architecture-independent 
data.

* Note to the Packagers

It is convenient to split the package into two installable packages:
tallyman, to be used inside containers, and stevedore, to be used on
the host server.


* Copyright information:

Copyright (C) 2018 Sergey Poznyakoff

   Permission is granted to anyone to make or distribute verbatim copies
   of this document as received, in any medium, provided that the
   copyright notice and this permission notice are preserved,
   thus giving the recipient permission to redistribute in turn.

   Permission is granted to distribute modified versions
   of this document, or of portions of it,
   under the above conditions, provided also that they
   carry prominent notices stating who last changed them.


Local Variables:
mode: outline
paragraph-separate: "[ 	]*$"
version-control: never
End: