author Sergey Poznyakoff <gray@gnu.org.ua> 2007-05-01 07:59:58 +0000
committer Sergey Poznyakoff <gray@gnu.org.ua> 2007-05-01 07:59:58 +0000
commit 732036d9d146940806635402ce7a208fa847e12f
tree ddf53ac4e9d65429e86479f6d918983dda74aac1
parent 4dd5e7bb0306f5225b8e307dacde98807d3676bb
git-svn-id: file:///svnroot/mailfromd/branches/alpha_3_1_91_berkeley_txn@1401 7a8a7f39-df28-0410-adc6-e0d955640f24
-rw-r--r-- ChangeLog 8
-rw-r--r-- NEWS 28
-rw-r--r-- doc/mailfromd.texi 566
-rw-r--r-- doc/mtasim.texi 8
-rw-r--r-- gacopyz/gacopyz.h 7
-rw-r--r-- gacopyz/log.c 20
-rw-r--r-- gacopyz/smfi.c 14
-rw-r--r-- src/bi_db.m4 24
-rw-r--r-- src/bi_io.m4 2
-rw-r--r-- src/cache.c 29
-rw-r--r-- src/dnscache.c 5
-rw-r--r-- src/engine.c 12
-rw-r--r-- src/gram.y 132
-rw-r--r-- src/lex.l 2
-rw-r--r-- src/mailfromd.h 6
-rw-r--r-- src/main.c 22
-rw-r--r-- src/prog.c 12
-rw-r--r-- src/rate.c 1
18 files changed, 718 insertions, 180 deletions
diff --git a/ChangeLog b/ChangeLog
index 3ac6ab6d..6cd84276 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,6 +1,14 @@
+2007-05-01 Sergey Poznyakoff <gray@gnu.org.ua>
+
+ * src/lex.l, src/engine.c, src/dnscache.c, src/gram.y,
+ src/mailfromd.h, src/cache.c, src/prog.c, src/bi_io.m4,
+ src/main.c, src/rate.c, src/bi_db.m4, doc/mailfromd.texi,
+ doc/mtasim.texi, gacopyz/smfi.c, gacopyz/gacopyz.h, gacopyz/log.c,
+ NEWS: Port r1400 from trunk
+
2007-04-25 Sergey Poznyakoff <gray@gnu.org.ua>
Synchronize with the trunk
2007-04-18 Sergey Poznyakoff <gray@gnu.org.ua>
diff --git a/NEWS b/NEWS
index ee23f495..0bd3c443 100644
--- a/NEWS
+++ b/NEWS
@@ -1,7 +1,7 @@
-Mailfromd NEWS -- history of user-visible changes. 2007-04-25
+Mailfromd NEWS -- history of user-visible changes. 2007-04-27
Copyright (C) 2005, 2006, 2007 Sergey Poznyakoff
See the end of file for copying conditions.
Please send mailfromd bug reports to <bug-mailfromd@gnu.org.ua>
@@ -14,12 +14,38 @@ Version 3.1.92, SVN
mailfromd filter scripts. It supports stdio (-bs) and daemon (-bd)
modes, has GNU readline support and `expect' facility, which makes it
useful in automated test cases.
See the documentation, chapter `mtasim'.
+* `begin'/`end' handlers
+
+ The `begin' and `end' special handlers may be used to
+supply startup and cleanup code for the filter program.
+
+ The `begin' special handler is executed once for each
+SMTP session, after the connection has been established but
+before the first milter handler has been called. Similarly, an
+`end' handler is executed exactly once, after the connection has
+been closed. Neither handler takes any arguments.
+
+ See the documentation, section `begin/end'.
+
+* Cache control
+
+ Use function `db_set_active' to enable or disable a given cache
+database. E.g.
+
+ # Disable DNS cache:
+ db_set_active("dns", 0)
+ # Enable it back again:
+ db_set_active("dns", 1)
+
+Similarly, the function `db_get_active' returns a number indicating
+whether the given cache database is used or not.
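+
+ As a sketch, assuming that `db_get_active' returns a non-zero
+number when the database is in use, the two functions can be
+combined as follows:
+
+ # Disable the DNS cache only if it is currently enabled
+ # (the non-zero return value is an assumption):
+ if db_get_active("dns") != 0
+   db_set_active("dns", 0)
+ fi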
+
Version 3.1.91, 2007-04-23
* Non-blocking syslog
This version is shipped with non-blocking syslog implementation by
diff --git a/doc/mailfromd.texi b/doc/mailfromd.texi
index 6ba94777..741499b1 100644
--- a/doc/mailfromd.texi
+++ b/doc/mailfromd.texi
@@ -90,13 +90,13 @@ documents @command{mailfromd} Version @value{VERSION}.
* Intro:: Introduction to Mailfromd.
* Building:: Building the Package.
* Tutorial:: Mailfromd Tutorial.
* MFL:: The Mail Filtering Language.
* Mailfromd Configuration:: Configuring @command{mailfromd}.
* Sendmail Configuration:: Configuring Sendmail to use @command{mailfromd}.
-* mtasim:: MTA simulator.
+* mtasim:: An @acronym{MTA} simulator.
* Reporting Bugs:: How to Report a Bug.
Appendices
* Gacopyz::
* Time and Date Formats::
@@ -136,12 +136,13 @@ Tutorial
* Greylisting::
* Local Account Verification::
* Databases::
* Testing Filter Scripts::
* Logging and Debugging::
* Runtime errors::
+* Cautions::
Databases
* Database Formats::
* Basic Database Operations::
* Database Maintenance::
@@ -154,13 +155,14 @@ Mail Filtering Language
* Literals::
* Here Documents::
* Sendmail Macros::
* Constants::
* Variables::
* Back references::
-* Handlers::
+* Handlers::
+* begin/end::
* Functions:: Functions.
* Expressions:: Expressions.
* Statements::
* Conditionals:: Conditional Statements.
* Loops:: Loop Statements.
* Exceptions:: Exceptional Conditions and their Handling.
@@ -380,20 +382,20 @@ C: QUIT
always discarded.
@cindex strict address verification
The described method of address verification is called
@dfn{standard} method throughout this document. @command{Mailfromd}
also implements a method we call @dfn{strict}. When using strict
-method, @command{mailfromd} first resolves IP address of sender
+method, @command{mailfromd} first resolves @acronym{IP} address of sender
machine to a fully qualified domain name. Then it obtains MX records
for this machine, and then proceeds with probing as described above.
So, the difference between the two methods is in the set of MX
records that are being probed: standard method queries MXs based on
the sender email domain, strict method works with MXs for the sender
-IP address.
+@acronym{IP} address.
The strict method makes it possible to cut off a much larger amount of spam,
although it does have many drawbacks. Returning to our example above,
consider the following situation: @samp{<jsmith@@somedomain.net>} is a
perfectly normal address, but it is being used by a spammer from some
other domain, say @samp{otherdomain.com}. Standard method is not able
@@ -405,13 +407,13 @@ it depends entirely on how you will instruct it to act in this case,
but the general practice is to return temporary failure, which will
urge the remote party to retry sending his/her message later.
@cindex caching @acronym{DNS} requests
After receiving a definite answer, @command{mailfromd} will
cache it in its database, so that next time your @acronym{MTA} receives a
-message from that address (or from the sender IP/email address pair,
+message from that address (or from the sender @acronym{IP}/email address pair,
for strict method), it will not waste its time trying to reach MX
servers again. The records remain in the cache database for a certain
time, after which they are discarded.
@node Limitations
@section Limitations of Sender Address Verification
@@ -433,13 +435,13 @@ feasible for you. However, you may experiment with various
often. This drawback can to a certain extent be eliminated by raising
the expiration timeout in your cache database.
@item When verifying the remote address, no attempt to actually
deliver the message is made. If @acronym{MTA} accepts the address,
@command{mailfromd} assumes it is OK. However in reality, mail for a
-remote address can bounce @emph{after} the nearest MTA accepts the
+remote address can bounce @emph{after} the nearest @acronym{MTA} accepts the
recipient address.
This drawback can often be avoided by combining sender address
verification with greylisting (@pxref{Greylisting}).
@item If the remote server rejects the address, no attempt is being
@@ -990,13 +992,13 @@ more information about the database compaction.
@node Tutorial, MFL, Building, Top
@chapter Tutorial
This chapter contains a tutorial introduction, guiding you
through various @command{mailfromd} configurations, starting from the
-simplest ones and proceeding up to the most advanced forms. It omits
+simplest ones and proceeding up to more advanced forms. It omits
most complicated details, concentrating mainly on the
common practical tasks.
If you are familiar with @command{mailfromd} you can skip this
chapter and go directly to the next one, which contains detailed
discussion of the configuration language and @command{mailfromd}
@@ -1017,12 +1019,13 @@ interaction with the Mail Transport Agent.
* Greylisting::
* Local Account Verification::
* Databases::
* Testing Filter Scripts::
* Logging and Debugging::
* Runtime errors::
+* Cautions::
@end menu
@node Start Up
@section Start Up
@cindex @acronym{MTA}
@@ -1079,28 +1082,42 @@ state handlers}, or subroutines to be executed in various
@acronym{SMTP} states. Each interaction state can
be supplied its own handling procedure. A missing procedure implies
@code{continue} response code.
@cindex milter state handler, described
@cindex handler, described
+@cindex connect, handler
+@cindex helo, handler
+@cindex envfrom, handler
+@cindex envrcpt, handler
+@cindex header, handler
+@cindex eoh, handler
+@cindex body, handler
+@cindex eom, handler
+@cindex begin, special handler
+@cindex end, special handler
@anchor{handler names}
A filter script can define up to eight @dfn{milter state handlers},
called after the names of milter states: @samp{connect}, @samp{helo},
@samp{envfrom}, @samp{envrcpt}, @samp{header}, @samp{eoh},
-@samp{body}, and @samp{eom}. The diagram below shows the control flow
-when processing an @acronym{SMTP} transaction. Lines marked with
-@code{C:} show @acronym{SMTP} commands issued by the remote machine
-(the @dfn{client}), those marked with @samp{@result{}} show called handlers
+@samp{body}, and @samp{eom}. Two special handlers are available for
+initialization and clean-up purposes: @samp{begin} is called before
+the processing starts, and @samp{end} is called after it is finished.
+The diagram below shows the control flow when processing an
+@acronym{SMTP} transaction. Lines marked with @code{C:} show
+@acronym{SMTP} commands issued by the remote machine (the
+@dfn{client}), those marked with @samp{@result{}} show called handlers
with their arguments. An @r{[R]} appearing at the right end of a line
indicates that this part of the transaction can be repeated any number
of times:
@float Figure, milter-control-flow
@caption{Mailfromd Control Flow}
@smallexample
@group
+@result{} begin
@result{} connect(@var{hostname}, @var{family}, @var{port}, @samp{IP address})
C: HELO @var{domain}
helo(@var{domain})
for each message transaction
do
C: MAIL FROM @var{sender}
@@ -1121,13 +1138,14 @@ do
@result{} * at most @var{len} bytes and call:}
@result{} */
@result{} body(@var{blk}, @var{len})
C: .
@result{} eom
- done
+ done
+@result{} end
@end group
@end smallexample
@end float
This control flow is maintained for as long as each called handler
returns @code{continue} (@pxref{Actions}). Otherwise, if
@@ -1298,12 +1316,13 @@ can be invoked elsewhere as many times as needed.
All functions have a @dfn{definition} that introduces types and
names of the formal parameters and the result type, if the function is
to return a meaningful value (function definitions in @acronym{MFL}
are discussed in detail in @pxref{User-defined, User-Defined Functions}).
+@anchor{funcall}
@cindex function calls
A function is invoked using a special construct, @dfn{function
call}:
@smallexample
@var{name} (@var{arg-list})
@@ -1329,13 +1348,13 @@ hostname($client_addr)
hostname $client_addr
@end smallexample
@noindent
However, such syntax creates several ambiguities, so use it sparingly
if at all. We recommend always using parentheses when calling a
-function. @FIXME{explain why.}
+function. @xref{Cautions}, for a detailed analysis of this syntax.
When a function does not deliver a result, it should only be called
as a statement.
Functions may be recursive, even mutually recursive.
@@ -1352,13 +1371,13 @@ functions are always available, no preparatory work is needed before
calling them. In contrast, the library functions are defined in
@dfn{modules}, special @acronym{MFL} source files grouping functions
designed for a particular task. In order to access a library
function, you must first @dfn{require} a module it is defined in.
This is done using @code{#require} statement. For example, the
function @code{hostname} looks up in the @acronym{DNS} the name
-corresponding to the IP address specified as its argument. This
+corresponding to the @acronym{IP} address specified as its argument. This
function is defined in module @file{dns.mf}, so before calling it you
must require this module:
@smallexample
#require dns
@end smallexample
@@ -1374,15 +1393,15 @@ module on disk and loads it if it is available.
@section Domain Name System
Site administrators often do not wish to accept mail from hosts that
do not have a proper reverse delegation in the Domain Name System.
In the previous section we introduced the library function
@code{hostname}, that looks up in the @acronym{DNS} the name corresponding to
-the IP address specified as its argument. If there is no
+the @acronym{IP} address specified as its argument. If there is no
corresponding name, the function returns its argument unchanged. This
-can be used to test if the IP was resolved, as illustrated in the
+can be used to test if the @acronym{IP} was resolved, as illustrated in the
example below:
@smallexample
@group
#require dns
@@ -1396,13 +1415,13 @@ done
@end smallexample
The @code{#require dns} statement loads the module @file{dns.mf},
after which the definition of @code{hostname} becomes available.
An orthogonal function, @code{resolve}, which resolves the symbolic
-name to the corresponding IP address is provided in the same
+name to the corresponding @acronym{IP} address is provided in the same
@file{dns.mf} module.
@node Checking Sender Address
@section Checking Sender Address
The main purpose of @command{mailfromd} is verification of the
@@ -1483,16 +1502,16 @@ do
fi
done
@end group
@end smallexample
@node SMTP Timeouts
-@section SMTP Timeouts
+@section @acronym{SMTP} Timeouts
When using polling functions, it is important to take into account
-possible delay, which can occur in SMTP transactions. Most often
+possible delay, which can occur in @acronym{SMTP} transactions. Most often
such delays are due to low network bandwidth, but sometimes remote
sites impose them willingly, as a spam-fighting measure@footnote{My
private opinion is that such practice is completely lame.}
@cindex Timeouts, defined
@command{Mailfromd} polling functions implement three distinct
@@ -1625,14 +1644,14 @@ done
@end smallexample
@FIXME{Proposed by Jan:
Another way to avoid infinite looping caused by endless recursive
triggering of @code{on poll}, is to accept relaying of all email originating
-from the local IP cluster with (trusted) clients and SMTP servers,
-provided that the server running mailfromd falls within this IP range:
+from the local @acronym{IP} cluster with (trusted) clients and @acronym{SMTP} servers,
+provided that the server running mailfromd falls within this @acronym{IP} range:
@smallexample
@group
prog envfrom
do
if $f == ""
@@ -1651,18 +1670,18 @@ do
fi
done
@end group
@end smallexample
Here, triggering the @code{on poll} statement with more than 1 recursion
-is avoided for all local emails, originating from non-local IPs
-(outside of CIDR range 193.232.0.0/16) - when such an email arrives,
+is avoided for all local emails, originating from non-local @acronym{IP}s
+(outside of @acronym{CIDR} range 193.232.0.0/16) - when such an email arrives,
handler execution falls through to @code{on poll}, which will cause
the server to connect back to itself for local email verification, but
this time, the @code{on poll} check will be skipped, as the server's
-own IP address will be caught by @code{match_cidr} statement.
+own @acronym{IP} address will be caught by @code{match_cidr} statement.
This method has particular advantage over the previous one: it
does not rely on sendmail's relay-domains control, which can be,
alone, too wide for sane relaying control.
}
@@ -1913,13 +1932,13 @@ interval, which is an integer number. For example, the number
(messages per hour).
Wherever the @code{rate} function is called, it recomputes and
updates the rate record for the given @var{key}, and returns its
value, converted to messages per interval. For example, the following
code limits the mail sending rate for each @samp{email
-address}-@samp{IP} combination to 180 per hour. If the rate value is
+address}-@samp{@acronym{IP}} combination to 180 per hour. If the rate value is
exceeded, the sender is returned a temporary failure response:
@smallexample
@group
prog envfrom
do
@@ -1948,13 +1967,13 @@ intervals are discussed in @ref{time interval specification}.
@node Greylisting
@section Greylisting
Greylisting is a simple method of defending against the spam
proposed by Evan Harris. In few words, it consists in recording the
-@samp{sender IP}-@samp{sender email}-@samp{recipient email} triplet of
+@samp{sender @acronym{IP}}-@samp{sender email}-@samp{recipient email} triplet of
mail transactions. Each time the unknown triplet is seen, the
corresponding message is rejected with @code{tempfail} code. If the
mail is legitimate, this will make the originating server retry
the delivery later, at which time the destination will accept it. If,
however, the mail is a spam, it will probably never be retried, so
the users will not be bothered by it. Even if the spammer will retry
@@ -2017,13 +2036,13 @@ do
fi
done
@end group
@end smallexample
In real life you will have to avoid greylisting some messages, in
-particular those coming from the @samp{<>} address and from the IP
+particular those coming from the @samp{<>} address and from the @acronym{IP}
addresses in your relayed domain. It can easily be done using the
techniques described in previous sections and is left as an exercise
to the reader.
@anchor{whitelisting}
@cindex Whitelisting
@@ -2193,13 +2212,13 @@ space. The first field is always the expiration date for this record
in seconds since the Epoch (00:00:00 UTC, January 1, 1970). The
meaning of the rest of the fields depends on the lookup type as
described in the following table:
@table @asis
@item A
- Each field contains the next IP address corresponding to the lookup
+ Each field contains the next @acronym{IP} address corresponding to the lookup
key. Notice, that currently (version @value{VERSION}) there can be at
most one field here, but it may change in the future.
@item PTR
Each field contains a host name corresponding to the lookup
key. Notice, that currently (version @value{VERSION}) there can be at
@@ -2910,12 +2929,110 @@ the error and fix it.
@cindex @code{stack_trace} function, introduced
You can also request a stack trace any place in your code, by
calling the @code{stack_trace} function. This can be useful for
debugging, or in your @code{catch} statements.
+@node Cautions
+@section Warnings about some slippery places in @acronym{MFL}
+
+@quotation
+It seemed like a good idea at the time.
+
+--- Brian Kernighan
+@end quotation
+
+ There are some features of @acronym{MFL} which, when used improperly,
+may lead to subtle, hard-to-identify errors. These are: concatenation
+operation (@pxref{Concatenation}) and passing arguments to
+one-argument functions without parentheses (@pxref{funcall, Function
+call syntax}).
+
+ Since there is no explicit operator for concatenation, it is often
+necessary to ensure that it happens at the right time by using
+parentheses to enclose the items to concatenate. Consider the
+following example:
+
+@smallexample
+echo toupper "some" "thing"
+@end smallexample
+
+ Should it print @samp{SOMETHING} or just @samp{SOMEthing}? The
+correct answer is the former, but it is difficult to deduce unless you
+are well acquainted with the @acronym{MFL} precedence rules
+(@pxref{Precedence}). Therefore, the rule of thumb is: whenever in
+doubt, parenthesize:
+
+@smallexample
+echo toupper("some" "thing") @result{} "SOMETHING"
+echo toupper("some") "thing" @result{} "SOMEthing"
+@end smallexample
+
+ Quoteless literals (@pxref{Literals}) are yet another dangerous
+feature. Like the features mentioned above, it stems from the
+good old days when @acronym{MFL} was small and sweet and using
+literals without quotes indeed ``seemed a good idea at the time.'' It
+ceased to seem so after the introduction of user-defined functions,
+though. Consider the following @emph{entire} program text:
+
+@smallexample
+@group
+prog envfrom
+do
+ if hostname($client_addr) = $client_addr
+ reject
+ fi
+done
+@end group
+@end smallexample
+
+ The intent was obviously to reject any mail if it comes from an
+address without a proper @code{PTR} record (@pxref{hostname
+function}). There is a serious error, however: @code{hostname} is not
+a built-in function as it used to be in previous releases@footnote{Up to
+the version 1.3.91.}, and therefore it needs to be defined or required
+prior to use. Otherwise it is no more than a literal, and the whole
+construct @samp{hostname($client_addr)} is regarded by the @acronym{MFL}
+compiler as a concatenation of the string @samp{hostname} and the
+value of the @samp{client_addr} Sendmail variable. This is easy to see
+using the @option{--dump-tree} option:
+
+@smallexample
+$ @kbd{mailfromd --dump-tree test.mf}
+State handlers:
+---------------
+envfrom:
+COND:
+EQ
+ CONCAT:
+ STRING: "hostname"
+ SYMBOL: @{client_addr@}
+ SYMBOL: @{client_addr@}
+IFTRUE
+ reject
+IFFALSE
+@end smallexample
+
+ In effect, the comparison is always false and @code{reject} is never
+called.
+
+ That is why starting from version 3.0 @command{mailfromd} warns
+about any occurrence of an unquoted identifier. In fact, running
+@option{--lint} on the above program gives:
+
+@smallexample
+$ @kbd{mailfromd --lint test.mf}
+mailfromd: test.mf:3: warning: unquoted identifier `hostname'
+@end smallexample
+
+ Whenever you see such a message, be sure to inspect the source and
+to place quotes around the suspicious string, if it is intended to be
+used as a literal, or to require the corresponding module
+(@pxref{Modules}) (or include the source file directly,
+@pxref{include}), if it is indeed a function name.
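+
+ As a minimal sketch of the second remedy, assuming (as stated in the
+tutorial) that @code{hostname} is defined in the module @file{dns.mf},
+the program above could be corrected by requiring that module first:
+
+@smallexample
+@group
+# @r{Make the definition of hostname available (assumes dns.mf is installed)}
+#require dns
+
+prog envfrom
+do
+  if hostname($client_addr) = $client_addr
+    reject
+  fi
+done
+@end group
+@end smallexample
+
+ With the module loaded, @samp{hostname($client_addr)} should be
+compiled as a function call rather than as a concatenation.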
+
@node MFL, Mailfromd Configuration, Tutorial, Top
@chapter Mail Filtering Language
@cindex MFL
@cindex Mail filtering language
The @dfn{mail filtering language}, or @acronym{MFL}, is a special
@@ -2935,13 +3052,14 @@ amount of white-space characters (i.e. spaces, tabulations or newlines).
* Literals::
* Here Documents::
* Sendmail Macros::
* Constants::
* Variables::
* Back references::
-* Handlers::
+* Handlers::
+* begin/end::
* Functions:: Functions.
* Expressions:: Expressions.
* Statements::
* Conditionals:: Conditional Statements.
* Loops:: Loop Statements.
* Exceptions:: Exceptional Conditions and their Handling.
@@ -3124,13 +3242,13 @@ quotes, at your option).
@item boolean
A boolean value: @code{yes}, @code{true} or @code{t} to indicate
a true value, and @code{no}, @code{false} or @code{nil} to indicate a
false value.
@item address
- An IP address in ``dotted-quad'' notation or a fully-qualified host
+ An @acronym{IP} address in ``dotted-quad'' notation or a fully-qualified host
name.
@item interval
@cindex Time Interval Specification
@anchor{time interval specification}
The @dfn{time interval specification} is a string that defines an
@@ -3197,13 +3315,13 @@ returns temporary failure. The default value is
@value{CONNECT-TIMEOUT}. @xref{SMTP Timeouts}, for the
detailed description.
@end deffn
@deffn {pragma option} initial-response-timeout @var{interval}
@xprindex{initial-response-timeout}
- Sets the time to wait for the initial SMTP response. Default is
+ Sets the time to wait for the initial @acronym{SMTP} response. Default is
@value{INITIAL-RESPONSE-TIMEOUT}. @xref{SMTP Timeouts}, for
the detailed description.
@end deffn
@deffn {pragma option} io-timeout @var{interval}
@deffnx {pragma option} timeout @var{interval}
@@ -3355,13 +3473,13 @@ port type is not yet supported.
@end table
@end deffn
@deffn {pragma option} milter-timeout @var{interval}
@xprindex{milter-timeout}
Set the timeout value for connection between the filter
-and the MTA. Default value is @value{MILTER-TIMEOUT}. You
+and the @acronym{MTA}. Default value is @value{MILTER-TIMEOUT}. You
normally do not need to change this value.
@end deffn
The following options can be used to tune database file locking:
@deffn {pragma option} lock-retry-count @var{number}
@@ -4139,13 +4257,13 @@ function (@pxref{ClamAV}).
qualified domain name of the host where @command{mailfromd} is run.
@xref{Polling}.
@end deftypevr
@deftypevar {Predefined Variable} string last_poll_host
Polling functions (@pxref{Polling functions}) set this variable before
-returning. It contains the host name or IP address of the last polled host.
+returning. It contains the host name or @acronym{IP} address of the last polled host.
@end deftypevar
@deftypevar {Predefined Variable} string last_poll_recv
Polling functions (@pxref{Polling functions}) set this variable before
returning. It contains the last @acronym{SMTP} reply received from
the remote host. In case of multi-line replies, only the first line is
@@ -4289,42 +4407,42 @@ the handler body. Some handlers take arguments, which can be accessed
within the @var{handler-body} using the notation @var{$@var{n}},
where @var{n} is the ordinal number of the argument. Here we describe
the available handlers and their arguments:
@deffn {Handler} connect (string $1, number $2, number $3, string $4)
@subsubheading Invocation
-This handler is called once at the beginning of each SMTP connection.
+This handler is called once at the beginning of each @acronym{SMTP} connection.
@subsubheading Arguments
@enumerate 1
@item @code{string};
-The host name of the message sender, as reported by MTA. Usually it
+The host name of the message sender, as reported by @acronym{MTA}. Usually it
is determined by a reverse lookup on the host address. If the reverse
-lookup fails, @samp{$1} will contain the message sender's IP address
+lookup fails, @samp{$1} will contain the message sender's @acronym{IP} address
enclosed in square brackets (e.g. @samp{[127.0.0.1]}).
@item @code{number};
Socket address family. Include @file{status.mfh} to get symbolic
definitions for the address families. Supported families are:
@cindex FAMILY_STDIO
@cindex FAMILY_UNIX
@cindex FAMILY_INET
@multitable @columnfractions 0.20 .10 0.70
@headitem Constant @tab Value @tab Meaning
-@item FAMILY_STDIO @tab 0 @tab Standard input/output (the MTA is
+@item FAMILY_STDIO @tab 0 @tab Standard input/output (the @acronym{MTA} is
run with @option{-bs} option)
@item FAMILY_UNIX @tab 1 @tab @acronym{UNIX} socket
-@item FAMILY_INET @tab 2 @tab IPv4 protocol
+@item FAMILY_INET @tab 2 @tab @acronym{IP}v4 protocol
@end multitable
@item @code{number};
Port number if @samp{$2} is @samp{FAMILY_INET}.
@item @code{string};
-Remote IP address if @samp{$2} is @samp{FAMILY_INET} or full file name
+Remote @acronym{IP} address if @samp{$2} is @samp{FAMILY_INET} or full file name
of the socket if @samp{$2} is @samp{FAMILY_UNIX}. If @samp{$2} is
@samp{FAMILY_STDIO}, @samp{$4} is an empty string.
@end enumerate
@cindex actions, using in @code{connect} handler
The actions (@pxref{Actions}) appearing in this handler
@@ -4371,15 +4489,15 @@ command, excepting ones listed above, is answered with
@end table
Regarding reply codes, this behavior complies with @acronym{RFC}
2821 (section 3.9), which states:
@quotation
- An SMTP server MUST NOT intentionally close the connection except:
+ An @acronym{SMTP} server @emph{must not} intentionally close the connection except:
@dots{}
- - After detecting the need to shut down the SMTP service and
+ - After detecting the need to shut down the @acronym{SMTP} service and
returning a 421 response code. This response code can be issued
after the server receives any command or, if necessary,
asynchronously from command receipt (on the assumption that the
client will receive it after the next command is issued).
@end quotation
@@ -4396,33 +4514,33 @@ with the action. The patch is in the file
versions of Sendmail up to 8.14.
@end deffn
@deffn {Handler} helo (string $1)
@subsubheading Invocation
-This handler is called whenever the SMTP client sends @code{HELO} or
-@code{EHLO} command. Depending on the actual MTA configuration, it
+This handler is called whenever the @acronym{SMTP} client sends @code{HELO} or
+@code{EHLO} command. Depending on the actual @acronym{MTA} configuration, it
can be called several times or even not at all.
@subsubheading Arguments
@enumerate 1
@item @code{string}; Argument to @code{HELO} (@code{EHLO}) commands.
@end enumerate
@subsubheading Notes
According to @acronym{RFC} 2821, @code{$1} must be the domain name of the
-sending host, or, in case this is not available, its IP address
+sending host, or, in case this is not available, its @acronym{IP} address
enclosed in square brackets. Be careful when taking decisions based
on this value, because in practice many hosts send arbitrary strings.
We recommend using the @code{heloarg_test} function
(@pxref{heloarg_test}) if you wish to analyze this value.
@end deffn
@deffn {Handler} envfrom (string $1, string $2)
@subsubheading Invocation
-Called when the SMTP client sends @code{MAIL FROM} command, i.e. once
+Called when the @acronym{SMTP} client sends @code{MAIL FROM} command, i.e. once
at the beginning of each message.
@subsubheading Arguments
@enumerate 1
@item @code{string}; First argument to the @code{MAIL FROM} command,
i.e. the email address of the sender.
@@ -4456,13 +4574,13 @@ by space character. This argument can be @samp{""}.
When the array type is implemented, @code{$2} will contain
an array of arguments.
@end deffn
@deffn {Handler} header (string $1, string $2)
@subsubheading Invocation
-Called once for each header line received after SMTP @code{DATA} command.
+Called once for each header line received after @acronym{SMTP} @code{DATA} command.
@subsubheading Arguments
@enumerate 1
@item @code{string}; Header field name.
@item @code{string}; Header field value. The content of the header may
include folded white space, i.e., multiple lines with following white
space where lines are separated by LF (ASCII 10). The trailing line
@@ -4523,13 +4641,161 @@ For your reference, the following table shows each handler with its arguments:
@item body @tab Body segment (string) @tab Length of the segment
(numeric) @tab N/A @tab N/A
@item eom @tab N/A @tab N/A @tab N/A @tab N/A
@end multitable
@end float
+@node begin/end
+@section The @samp{begin} and @samp{end} special handlers
+@cindex begin, special handler
+@cindex end, special handler
+@cindex startup handler
+@cindex handler, startup
+@cindex handler, initialization
+@cindex cleanup handler
+@cindex handler, cleanup
+ Apart from the milter handlers described previously, @acronym{MFL}
+defines two special handlers, called @samp{begin} and @samp{end},
+which supply startup and cleanup instructions for the filter program.
+
+ The @samp{begin} special handler is executed once for each
+@acronym{SMTP} session, after the connection has been established but
+before the first milter handler has been called. Similarly, an
+@samp{end} handler is executed exactly once, after the connection has
+been closed. Neither handler takes any arguments.
+
+@kwindex begin
+@kwindex end
+ The two handlers are defined using the following syntax:
+
+@smallexample
+# @r{Begin handler}
+begin
+do
+ @dots{}
+done
+
+# @r{End handler}
+end
+do
+ @dots{}
+done
+@end smallexample
+
+@noindent
+where @samp{@dots{}} represents any @acronym{MFL} statements.
+
+ An @acronym{MFL} program may have multiple @samp{begin} and
+@samp{end} definitions. They can be intermixed with other
+definitions. The compiler combines all @samp{begin}
+statements into a single one, in the order they appear in the
+sources. Similarly, all @samp{end} blocks are concatenated together.
+The resulting @samp{begin} is called once, at the beginning of each
+@acronym{SMTP} session, and @samp{end} is called once at its
+termination.
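+
+ For instance, the following sketch (the messages are purely
+illustrative) defines two @samp{begin} blocks. Under the rule
+described above, they would be combined in source order, so that the
+first @code{echo} runs before the second one:
+
+@smallexample
+@group
+# @r{First block, e.g. in the main script}
+begin
+do
+  echo "first begin block"
+done
+
+# @r{Second block, e.g. in a required module}
+begin
+do
+  echo "second begin block"
+done
+@end group
+@end smallexample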
+
+ Multiple @samp{begin} and @samp{end} handlers are a useful feature
+for writing modules (@pxref{Modules}), because each module can