Version 5.0.93

* NEWS, configure.ac: Raise patchlevel to 93.
* doc/mailfromd.texi: Document new features.
* mfd/tbf_rate.c (tbf_rate_format_struct): Change dbid to `tbf'.
author: Sergey Poznyakoff <gray@gnu.org.ua> 2009-05-04 23:45:51 +0300
committer: Sergey Poznyakoff <gray@gnu.org.ua> 2009-05-04 23:45:51 +0300
commit: 2e9e8087eef5452299cfc839b7b4cc23bb2feb6c (patch)
tree: c6ad298fbc1fa8bded0d37b17dd190bebb31c4b3
parent: 9503ac3090f1013ab501e863d56277cccf301a86 (diff)
4 files changed, 482 insertions, 130 deletions
diff --git a/NEWS b/NEWS
index e607b9ef..ba959685 100644
--- a/NEWS
+++ b/NEWS
@@ -1,14 +1,14 @@
-Mailfromd NEWS -- history of user-visible changes. 2009-05-03
+Mailfromd NEWS -- history of user-visible changes. 2009-05-04
 Copyright (C) 2005, 2006, 2007, 2008, 2009 Sergey Poznyakoff
 See the end of file for copying conditions.
 
 Please send Mailfromd bug reports to <bug-mailfromd@gnu.org.ua>
 
 
-Version 5.0.92, GIT
+Version 5.0.93, GIT
 
 * New pragma `dbprop'
 
 This pragma defines user database properties.  It takes two or three
 arguments:
 
@@ -26,42 +26,42 @@ may appear in any order.
 The new function is provided:
 
   bool tbf_rate(string key, number cost, number interval, number burst_size)
 
 It implements a classical token bucket filter algorithm.  Tokens are
 added to the bucket identified by the `key' at constant rate of 1
-token per `interval' microseconds, to a maximum of `cost' tokens.
+token per `interval' microseconds, to a maximum of `burst_size' tokens.
 If no bucket is found for the specified key, a new bucket is created
 and initialized to contain `burst_size' tokens.
 
 For example:
 
   if not tbf_rate($f "-" ${client_addr}, 1, 10000000, 20)
     tempfail 450 4.7.0 "Mail sending rate exceeded. Try again later"
   fi
 
 This adds a token every 10 seconds with a burst size of 20 and a
 cost of 1.  In other words, it allows to sent up to 20 emails within
-first 10 seconds after sending the very first email from the given
+the first 10 seconds after sending the very first email from the given
 email/host address pair.  After that, that pair is allowed to send
 at most 1 message per 10 seconds.
 
 One of possible implementations for this function is to limit
 the total size of messages tranferred per given amount of time.
 To do so, the tbf_rate must be used in `prog eom'.  The `cost'
-value must contain the number of bytes in an email, (or email bytes
-* number of recipients), the `interval' must be set to deliver the
-number of bytes per second a given user is allowed to send, and
-the `burst_size' must be large enough to accommodate a couple of
-large emails.  E.g.:
+value must contain the number of bytes in an email (or email bytes
+* number of recipients), the `interval' must be set to the number of
+bytes per microsecond a given user is allowed to send, and the
+`burst_size' must be large enough to accommodate a couple of large
+emails.  E.g.:
 
   prog eom
   do
     if not tbf_rate($f "-" ${client_addr},
                     message_size(current_message()),
-                    10000,   # At most 10 kb/sec
+                    10240,   # At most 10 Kb/ms
 		    2000000) 
       tempfail 450 4.7.0 "Data sending rate exceeded. Try again later"
     fi
   done
 
 The `tbf_rate' implementation is contributed by John McEleney and
@@ -70,13 +70,13 @@ Ben McKeegan.
 * Greylisting
 
 A new implementation of the `greylist' function is provided.  In the
 contrast to the traditional implementation, which keeps in the
 database the time when the greylisting was activated for the given
 key, the new one stores the time when the greylisting period is set to
-expire.  This implementation allows to implement the `is_greylisted'
+expire.  This made it possible to implement the `is_greylisted'
 function:
 
   bool is_greylisted(string key)
 
 which returns True if the `key' is currently greylisted, and False
 otherwise.  This implementation is based on the patch by Con
@@ -113,21 +113,21 @@ The threshold argument is made optional in order to provide backward
 compatibility with the prior releases of mailfromd.  Nevertheless, its
 use is strongly encouraged.  To simplify the task, the new function
 `rateok' is provided (see below).
   
 * The rateok function
 
-New library function is provided:
+A new library function is provided:
 
   bool rateok(string key, number sample_span, number threshold; number mincnt)
 
 This is a higher-level interface to the rate function.  This function
-returns True if the mail sending rate for `key', for interval of
-`sample_span' seconds is less than the `threshold'.  Optional `mincnt'
-parameter supplies the minimal number of mails needed to obtain the
-statistics.  It defaults to 4.
+returns True if the mail sending rate for `key', computed for the
+interval of `sample_span' seconds, is less than the `threshold'.
+The optional `mincnt' parameter supplies the minimal number of mails
+needed to obtain the statistics.  It defaults to 4.
 
 An example of rateok usage follows:
 
 #require rateok
 
 prog envfrom
@@ -139,14 +139,14 @@ done
 
 This example limits the rate to 40 mails per minute.
 
 * Rate expiration
 
 In addition to the usual expiration algorithm, the rate records are
-also expired if no mails were received during a time span longer than
-the 2nd argument to the rate (or rateok) function.
+also expired if no mails were received during a time span greater than
+the value of the 2nd argument to the rate (or rateok) function.
 
 * Bugfixes
 ** Second argument to envfrom and envrcpt
 ** write without third argument
 ** sa_format_report_header: fix formatting
 ** Limit use of file descriptors by message capturing eom functions
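
The token bucket filter described in the NEWS entry for `tbf_rate' can be
modelled in a few lines.  The Python sketch below is illustrative only: it
is not the code in mfd/tbf_rate.c.  The `Bucket' class, the dict standing in
for the `tbf' database and the explicit `now' argument are assumptions made
for the example.

```python
from dataclasses import dataclass


@dataclass
class Bucket:
    """Per-key state (hypothetical layout, not mailfromd's record format)."""
    timestamp: int  # time of the most recent token, in microseconds
    tokens: int     # tokens currently held in the bucket


def tbf_rate(buckets, key, cost, interval, burst_size, now):
    """Return True if `cost` tokens can be taken from the bucket for `key`.

    `buckets` is a plain dict standing in for the database.  `now` is the
    current time in microseconds, passed in explicitly so that the sketch
    stays deterministic.
    """
    bucket = buckets.get(key)
    if bucket is None:
        # A bucket that does not exist yet is created full.
        bucket = buckets[key] = Bucket(timestamp=now, tokens=burst_size)
    else:
        # One token is earned per elapsed `interval`, capped at `burst_size`.
        earned = (now - bucket.timestamp) // interval
        if earned > 0:
            bucket.tokens = min(burst_size, bucket.tokens + earned)
            # Advance by whole intervals only, so partial progress is kept.
            bucket.timestamp += earned * interval
    if bucket.tokens < cost:
        return False  # too few tokens: reject and remove nothing
    bucket.tokens -= cost
    return True
```

With cost 1, an interval of 10000000 microseconds and a burst size of 20,
this model accepts the first 20 calls for a key at once and then one call
per 10 seconds, which matches the behaviour the NEWS example describes.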
diff --git a/configure.ac b/configure.ac
index d2d9cb77..2c12ab5c 100644
--- a/configure.ac
+++ b/configure.ac
@@ -11,16 +11,16 @@
 # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 # GNU General Public License for more details.
 #
 # You should have received a copy of the GNU General Public License
 # along with this program.  If not, see <http://www.gnu.org/licenses/>.
 
-AC_PREREQ(2.59)
+AC_PREREQ(2.61)
 m4_define([MF_VERSION_MAJOR], 5)
 m4_define([MF_VERSION_MINOR], 0)
-m4_define([MF_VERSION_PATCH], 92)
+m4_define([MF_VERSION_PATCH], 93)
 AC_INIT([mailfromd],
         MF_VERSION_MAJOR.MF_VERSION_MINOR[]m4_ifdef([MF_VERSION_PATCH],.MF_VERSION_PATCH),
         [bug-mailfromd@gnu.org.ua])
 AC_CONFIG_SRCDIR([mfd/main.c])
 AC_CONFIG_AUX_DIR([build-aux])
 AC_CONFIG_HEADERS([config.h])
diff --git a/doc/mailfromd.texi b/doc/mailfromd.texi
index 8a6d66b5..8276c6f9 100644
--- a/doc/mailfromd.texi
+++ b/doc/mailfromd.texi
@@ -122,27 +122,29 @@ Appendices
 @detailmenu
  --- The Detailed Node Listing ---
 
 Preface
 
 * History::                 Short @command{mailfromd} history. 
-* Conventions::             Typographical conventions.
 * Acknowledgments::         Acknowledgments.
 
 Introduction to @command{mailfromd}
 
+* Conventions::             Typographical conventions.
 * Overview::                Mailfromd at a first glance
 * SAV::                     Principles of Sender Address Verification.
 * Rate Limit::              Controlling Mail Sending Rate.
+* SPF::                     Sender Policy Framework.
 
 Sender Address Verification.
 
 * Limitations::
 
 Building the Package
 
+* 500-510::  Upgrading from 5.0 to 5.1
 * 440-500::  Upgrading from 4.4 to 5.0
 * 43x-440::  Upgrading from 4.3.x to 4.4
 * 420-43x::  Upgrading from 4.2 to 4.3.x
 * 410-420::  Upgrading from 4.1 to 4.2
 * 400-410::  Upgrading from 4.0 to 4.1
 * 31x-400::  Upgrading from 3.1.x to 4.0
@@ -212,12 +214,14 @@ Mail Filtering Language
 Comments
 
 * option::      Pragma option.
 * database::    Pragma database.
 * stacksize::   Pragma stacksize.
 * regex::       Pragma regex.
+* dbprop::      Pragma dbprop.
+* greylist::    Pragma greylist.
 
 Constants
 
 * Built-in constants::
 
 Variables
@@ -248,12 +252,14 @@ Built-in and Library Functions
 * DNS functions::
 * Database functions::
 * I/O functions::
 * System functions::
 * Sieve Interface::
 * Interfaces to Third-Party Programs::
+* Rate limiting functions::
+* Greylisting functions::
 * Special test functions::
 * Mail Sending Functions::
 * NLS Functions::
 * Debugging Functions::
 * Blacklisting Functions::
 * SPF Functions::
@@ -306,14 +312,55 @@ Command Line Options.
 * traces::
 * daemon mode::
 * option summary::
 
 Pmilter multiplexer program.
 
-* pmult invocation::
 * pmult configuration::
+* pmult example::
+* pmult invocation::
+
+Pmult Configuration
+
+* pmult-conf::     Multiplexer Configuration.
+* pmult-macros::   Translating MeTA1 macros.
+* pmult-client::   Pmult Client Configuration.
+* pmult-debug::    Debugging Pmult.
+
+Pies -- a program execution supervisor.
+
+* Pies Configuration File::
+* Component Statement::
+* include-meta1::
+* Global Configuration::
+* Pies Debugging::
+* Configuration Example::
+* Command Line Usage::
+* Pies Invocation::
+
+Component Statement
+
+* Prerequisites::
+* Component Privileges::
+* Resources::
+* Actions Before Startup::
+* Exit Actions::
+* Output Redirectors::
+* Inetd-Style Components::
+* Meta1-Style Components::
+* Component Syntax Summary::
+
+Global Configuration
+
+* Less Useful Statements::
+
+Configuration Example
+
+* Simple Pies::
+* Hairy Pies::
+* Inetd Pies::
 
 @end detailmenu
 @end menu
 
 @node Preface, Intro, Top, Top
 @unnumbered Preface
@@ -355,13 +402,12 @@ Use @dfn{black-}, @dfn{white-} and @dfn{greylisting} techniques.
 @item
 Invoke external programs or other mail filters.
 @end itemize
 
 @menu
 * History::                 Short @command{mailfromd} history. 
-* Conventions::             Typographical conventions.
 * Acknowledgments::         Acknowledgments.
 @end menu
 
 @node History
 @unnumberedsec Short history of @command{mailfromd}. 
 
@@ -424,49 +470,22 @@ beginnings of the @acronym{MFL} module system.  The code generation
 was re-implemented to facilitate the introduction of object files in
 future versions.  Another new features in this release include 
 @acronym{SPF} support and @command{mtasim} utility, an @acronym{MTA}
 simulator designed for testing @command{mailfromd} scripts
 (@pxref{mtasim}).  The test suite in this version was made portable by
 rewriting it in @i{Autotest}.  
-  
-@node Conventions
-@unnumberedsec Typographical conventions
-
-@cindex Texinfo
-  This manual is written using Texinfo, the GNU documentation
-formatting language.  The same set of Texinfo source files is used to
-produce both the printed and online versions of the documentation.
-@ifnotinfo
-Because of this, the typographical conventions
-may be slightly different than in other books you may have read.
-@end ifnotinfo
-@ifinfo
-This section briefly documents the typographical conventions used in
-this manual.
-@end ifinfo
-
-  Examples you would type at the command line are preceded by the common
-shell primary prompt, @samp{$}.  The command itself is printed @kbd{in
-this font}, and the output it produces @samp{in this font}, for
-example:
 
-@smallexample
-$ @kbd{mailfromd --version}
-mailfromd (mailfromd @value{VERSION})
-@end smallexample
-
-In the text, the command names are printed @command{like this},
-command line options are displayed in @option{this font}.  Some
-notions are emphasized @emph{like this}, and if a point needs to be made
-strongly, it is done @strong{this way}.  The first occurrence of
-a new term is usually its @dfn{definition} and appears in the same
-font as the previous occurrence of ``definition'' in this sentence.
-File names are indicated like this: @file{/path/to/ourfile}.
-
-The variable names are represented @var{like this}, keywords and
-fragments of program text are written in @code{this font}.
+  Another big leap forward was the 5.0 release, which appeared on
+December 26, 2008.  It greatly enriched the set of available functions
+(61 new functions were introduced, which amounts to 41% of all the
+functions available in the 5.0 release) and introduced several
+improvements in the MFL itself.  Among others, function aliases and
+optional arguments in user-defined functions were introduced in this
+release.  The new ``run operation mode'' made it possible to execute
+arbitrary MFL functions from the command line.  This release also
+raised the Mailutils version requirements to at least 2.0.
 
 @node Acknowledgments
 @unnumberedsec Acknowledgments
 
   Many people need to be thanked for their assistance in developing
 and debugging @command{mailfromd}.  After S.@: C.@: Johnson, I can say
@@ -500,12 +519,21 @@ comments.  He offered invaluable help in debugging and testing
 @command{mailfromd} on @acronym{FreeBSD} platform.  
 
 @cindex Sergey Afonin
   Sergey Afonin proposed many improvements and new ideas.  He also
 invested a lot of his time in finding bugs and testing bugfixes. 
 
+@cindex John McEleney
+@cindex Ben McKeegan
+  John McEleney and Ben McKeegan contributed the token bucket filter
+implementation (@code{tbf_rate} function, @FIXME-pxref{}).
+  
+@cindex Con Tassios
+  Con Tassios helped to find and fix various bugs and contributed the
+new implementation of the @code{greylist} function (@FIXME-pxref{}).  
+
   The following people (in alphabetical order) provided bug reports
 and helpful comments for various versions of the program:
 @cindex Alan Dobkin
 @cindex Brent Spencer
 @cindex Jeff Ballard
 @cindex Nacho Gonz@'alez L@'opez
@@ -527,18 +555,19 @@ supports @command{Milter} (or @command{Pmilter}) protocol.  It is able
 to filter both incoming and outgoing messages using a filter program,
 written in @dfn{mail filtering language} (@acronym{MFL}).  The daemon
 interfaces with the @acronym{MTA} using @command{Milter} protocol.
 
   The name @command{mailfromd} can be thought of as an abbreviation for
 @samp{@emph{Mail} @emph{F}iltering and @emph{R}untime
-@emph{M}odification}, with an @samp{o} for itself.  Historically, it
-stemmed from the fact that the original implementation was a simple
-filter implementing the @dfn{sender address verification} technique.
-Since then the program has changed dramatically, and now it is
-actually a language translator and run-time evaluator providing a set
-of built-in and library functions for filtering electronic mail. 
+@emph{M}odification @emph{D}aemon}, with an @samp{o} for itself.
+Historically, it stemmed from the fact that the original
+implementation was a simple filter implementing the @dfn{sender
+address verification} technique. Since then the program has changed
+dramatically, and now it is actually a language translator and
+run-time evaluator providing a set of built-in and library functions
+for filtering electronic mail.
 
   The first part of this manual is an overview, describing the features
 @command{mailfromd} offers in general.
 
   The second part is a tutorial, which provides an introduction for
 those who have not used @command{mailfromd} previously.  It moves from
@@ -553,17 +582,56 @@ from time to time.  Each chapter presents everything that needs to be
 said about a specific topic.
 
   The manual assumes that the reader has a good knowledge of the
 @acronym{SMTP} protocol and @command{Sendmail} mail transport system.   
       
 @menu
+* Conventions::             Typographical conventions.
 * Overview::                Mailfromd at a first glance
 * SAV::                     Principles of Sender Address Verification.
 * Rate Limit::              Controlling Mail Sending Rate.
+* SPF::                     Sender Policy Framework.
 @end menu
 
+@node Conventions
+@section Typographical conventions
+
+@cindex Texinfo
+  This manual is written using Texinfo, the GNU documentation
+formatting language.  The same set of Texinfo source files is used to
+produce both the printed and online versions of the documentation.
+@ifnotinfo
+Because of this, the typographical conventions
+may be slightly different than in other books you may have read.
+@end ifnotinfo
+@ifinfo
+This section briefly documents the typographical conventions used in
+this manual.
+@end ifinfo
+
+  Examples you would type at the command line are preceded by the common
+shell primary prompt, @samp{$}.  The command itself is printed @kbd{in
+this font}, and the output it produces @samp{in this font}, for
+example:
+
+@smallexample
+$ @kbd{mailfromd --version}
+mailfromd (mailfromd @value{VERSION})
+@end smallexample
+
+In the text, the command names are printed @command{like this},
+command line options are displayed in @option{this font}.  Some
+notions are emphasized @emph{like this}, and if a point needs to be made
+strongly, it is done @strong{this way}.  The first occurrence of
+a new term is usually its @dfn{definition} and appears in the same
+font as the previous occurrence of ``definition'' in this sentence.
+File names are indicated like this: @file{/path/to/ourfile}.
+
+The variable names are represented @var{like this}, keywords and
+fragments of program text are written in @code{this font}.
+
 @node Overview
 @section Mailfromd at a first glance
 
   In contrast to the most existing milter filters,
 @command{mailfromd} does not implement any default filtering
 policies.  Instead, it depends entirely on a @dfn{filter script},
@@ -728,18 +796,33 @@ doing the forwarding.
 
 @node Rate Limit
 @section Controlling Mail Sending Rate.
 
 @cindex mail sending rate, explained
 @cindex sending rate, explained
-  @dfn{Mail Sending Rate} for a given identity is defined as number of
+  @dfn{Mail Sending Rate} for a given identity is defined as the number of
 messages with this identity received within a predefined interval of
 time.  
 
-  @acronym{MFL} offers a special function @code{rate} (@pxref{rate})
-that computes the sending rate for the given identity.  
+  @acronym{MFL} offers a set of functions for limiting mail sending
+rate (@pxref{Rate limiting functions}), and for controlling broader
+rate aspects, such as data transfer rates (@pxref{TBF}).
+
+@node SPF
+@section SPF
+@cindex @acronym{SPF}
+@cindex Sender Policy Framework
+  @dfn{Sender Policy Framework}, or @acronym{SPF} for short, is an
+extension to the @acronym{SMTP} protocol that helps identify forged
+identities supplied with the @code{MAIL FROM} and @code{HELO}
+commands.  The framework is explained in detail in @acronym{RFC} 4408
+(@uref{http://tools.ietf.org/html/rfc4408}) and on the 
+@uref{http://www.openspf.org/, SPF Project Site}.
+
+  Mailfromd provides a set of functions (@pxref{SPF Functions}) for
+using @acronym{SPF} to control mail flow.  
 
 @node Building, Tutorial, Intro, Top
 @chapter Building the Package
 
 @cindex building @command{mailfromd}
 @cindex @command{mailfromd}, building
@@ -916,13 +999,13 @@ variable.
 @cindex DEFAULT_EXPIRE_RATES_INTERVAL, @command{configure} variable
   There are also two variables that allow to control particular
 expiration intervals: @code{DEFAULT_DNS_NEGATIVE_EXPIRE_INTERVAL} sets
 expiration time for cached negative @acronym{DNS} answers (@pxref{DNS Cache
 Management}) (default 3600 seconds) and
 @code{DEFAULT_EXPIRE_RATES_INTERVAL} sets default expiration time for
-mail rate database (@pxref{rate}).
+mail rate database (@pxref{Rate limiting functions}).
 
 Expiration settings can be changed at run time using
 @samp{#pragma database} statement in the filter script file
 (@pxref{database}).   
 
 @cindex enable-syslog-async, @option{--enable-syslog-async}, @command{configure} option
@@ -1022,23 +1105,29 @@ and mode.
 (@file{@var{sysconfdir}/mailfromd.etc}) and edit it, if necessary.  If
 you are upgrading from an older version of @command{mailfromd}, see 
 the corresponding section below.
 @end enumerate
 
 @menu
+* 500-510::  Upgrading from 5.0 to 5.1
 * 440-500::  Upgrading from 4.4 to 5.0
 * 43x-440::  Upgrading from 4.3.x to 4.4
 * 420-43x::  Upgrading from 4.2 to 4.3.x
 * 410-420::  Upgrading from 4.1 to 4.2
 * 400-410::  Upgrading from 4.0 to 4.1
 * 31x-400::  Upgrading from 3.1.x to 4.0
 * 30x-31x::  Upgrading from 3.0.x to 3.1
 * 2x-30x::   Upgrading from 2.x to 3.0.x
 * 1x-2x::    Upgrading from 1.x to 2.x
 @end menu
 
+@node 500-510
+@section Upgrading from 5.0 to 5.1
+@cindex Upgrading from 5.0 to 5.1
+@WRITEME
+
 @node 440-500
 @section Upgrading from 4.4 to 5.0
 @cindex Upgrading from 4.4 to 5.0
 
   This version of Mailfromd requires
 @uref{http://www.gnu.org/software/mailutils, GNU mailutils} version
@@ -1409,13 +1498,14 @@ done
 
 @noindent
 @xref{Handlers}, for more information about the @code{prog} statement.
 
 @item
 If your code contained any @code{rate} statements, convert them to
-function calls (@pxref{rate}), using the following scheme:
+function calls (@pxref{Rate limiting functions, rate}), using the
+following scheme: 
 
 @smallexample
 @group
 Old statement: if rate @var{key} @var{limit} / @var{expr}
 New statement: if rate(@var{key}, interval("@var{expr}")) > @var{limit}
 @end group
@@ -2359,73 +2449,175 @@ representing the average number of messages per second sent by this
 @code{key} within the last sampling interval.  In the simplest case,
 the sender email address can be used as a @code{key}, however we recommend
 to use a conjunction @var{email}-@var{sender_ip} instead, so the
 actual @var{email} owner won't be blocked by actions of some spammer
 abusing his/her address.
 
-  To control and update sending rates, the @code{rate} function is
-provided.  It takes two mandatory arguments: @code{key}, whose meaning
-is described above, and @code{interval}, or the number of seconds, to
-which the actual sending rate value is converted.  Remember, that it is
-stored internally as a floating point number, and thus cannot be used
-in @command{mailfromd} filters, which operate only on integer numbers.
-To use the rate value, it is first converted to messages per given
-interval, which is an integer number.  For example, the rate
+  Two functions are provided to control and update sending rates.  The
+@code{rateok} function takes three mandatory arguments:
+
+@smallexample
+  bool rateok(string @var{key}, number @var{interval}, number @var{threshold})
+@end smallexample
+
+The meaning of @var{key} is described above.  The @var{interval} is the
+sampling interval, or the number of seconds to which the actual
+sending rate value is converted.  Remember that it is stored
+internally as a floating point number, and thus cannot be directly
+used in @command{mailfromd} filters, which operate only on integer
+numbers.  To use the rate value, it is first converted to messages per
+given interval, which is an integer number.  For example, the rate
 @code{0.138888} brought to 1-hour interval gives @code{500}
 (messages per hour).
 
-  Wherever the @code{rate} function is called, it recomputes and
-updates the rate record for the given @var{key}, and returns its
-value, converted to messages per interval.  For example, the following
-code limits the mail sending rate for each @samp{email
-address}-@samp{@acronym{IP}} combination to 180 per hour.  If the
-actual rate value exceeds this limit, the sender is returned a
+  Wherever the @code{rateok} function is called, it recomputes the
+rate record for the given @var{key}.  If the new rate value converted
+to messages per given @var{interval} is less than @var{threshold},
+the function updates the database and returns @code{True}.  Otherwise it
+returns @code{False} and does not update the database.
+
+  This function must be @dfn{required} prior to use, by placing the
+following statement somewhere at the beginning of your script:
+
+@smallexample
+#require rateok
+@end smallexample
+
+For example, the following code limits the mail sending rate for each
+@samp{email address}-@samp{@acronym{IP}} combination to 180 per hour.
+If the actual rate value exceeds this limit, the sender is returned a
 temporary failure response: 
 
 @smallexample
 @group
+#require rateok
+
 prog envfrom
 do
-  if rate($f "-" $@{client_addr@}, 3600) > 180
+  if not rateok($f "-" $@{client_addr@}, 3600, 180)
     tempfail 450 4.7.0 "Mail sending rate exceeded. Try again later"
   fi
 done
 @end group
 @end smallexample
 
 @noindent
   Notice argument concatenation, used to produce the key.
 
   It is often inconvenient to specify intervals in seconds,
-therefore a special @code{interval} function is provided, which
+therefore a special @code{interval} function is provided.  It 
 converts its argument, which is a textual string representing time
 interval in English, to the corresponding number of seconds.  Using
 this function, the function invocation would be:
 
 @smallexample
-     rate($f "-" $@{client_addr@}, interval("1 hour"))
+     rateok($f "-" $@{client_addr@}, interval("1 hour"), 180)
 @end smallexample
 
   The @code{interval} function is described in @ref{interval}, and time
 intervals are discussed in @ref{time interval specification}. 
 
-  The @code{rate} function begins returning non-zero value as soon as
-it has enough data to compute the rate. By default, it needs at least
-two mails. Since this may lead to a big number of false positives
+  The @code{rateok} function begins computing the rate
+as soon as it has collected enough data.  By default, it needs at least
+four mails.  Since this may lead to a large number of false positives
 (i.e. overestimated rates) at the beginning of sampling interval,
-there is a way to specify a minimum number of samples @code{rate} must
-collect before starting to actually compute rates. This number of
-samples is given as the optional third argument to the function. For
-example, the following call will return 0 unless at least 10 mails
-with the given key value were detected:
+there is a way to specify a minimum number of samples @code{rateok}
+must collect before starting to actually compute rates. This number of
+samples is given as the optional fourth argument to the function.  For
+example, the following call will always return @code{True} for the
+first 10 mails, no matter what the actual rate:
+
+@smallexample
+     rateok($f "-" $@{client_addr@}, interval("1 hour"), 180, 10)
+@end smallexample
+
+@anchor{TBF}
+  The @code{tbf_rate} function allows you to exercise more control over
+the mail rates.  This function implements a @dfn{token bucket filter}
+(@acronym{TBF}) algorithm.
+
+  The token bucket controls when the data can be transmitted based on
+the presence of abstract entities called @dfn{tokens} in a container
+called @dfn{bucket}.  Each token represents some amount of data.  The
+algorithm works as follows:
+
+@itemize @bullet
+@item A token is added to the bucket at a constant rate of 1 token per
+@var{t} microseconds.
+@item A bucket can hold at most @var{m} tokens.  If a token arrives
+when the bucket is full, that token is discarded.
+@item When @var{n} items of data arrive (e.g. @var{n} mails), @var{n}
+tokens are removed from the bucket and the data are accepted.
+@item If fewer than @var{n} tokens are available, no tokens are
+removed from the bucket and the data are not accepted.
+@end itemize
+
+  This algorithm keeps the data traffic at a constant rate of one
+item per @var{t} microseconds, with bursts of up to @var{m} data items.
+Such bursts occur when no data have arrived for @var{m}*@var{t} or
+more microseconds.
+
+  @command{Mailfromd} keeps buckets in a database @samp{tbf}.  Each
+bucket is identified by a unique @dfn{key}.  The @code{tbf_rate}
+function is defined as follows:
 
 @smallexample
-     rate($f "-" $@{client_addr@}, interval("1 hour"), 10)
+ bool tbf_rate(string @var{key}, number @var{n}, number @var{t}, number @var{m})
 @end smallexample
 
-For additional information about @code{rate} function, see @ref{rate}.  
+The @var{key} identifies the bucket to operate upon.  The remaining
+arguments are described above.  The @code{tbf_rate} function returns
+@samp{True} if the algorithm permits accepting the data and
+@samp{False} otherwise.
+
+Depending on how the actual arguments are selected, the @code{tbf_rate}
+function can be used to control various types of flow rates.  For
+example, to control mail sending rate, assign the arguments as
+follows: @var{n} to the number of mails and @var{t} to the control
+interval in microseconds:
+
+@smallexample
+@group
+prog envfrom
+do
+  if not tbf_rate($f "-" $client_addr, 1, 10000000, 20)
+    tempfail 450 4.7.0 "Mail sending rate exceeded. Try again later"
+  fi
+done
+@end group
+@end smallexample
+
+The example above permits sending at most one mail every 10 seconds.
+The burst size is set to 20.
+
+Another use for the @code{tbf_rate} function is to limit the total
+delivered mail size per given interval of time.  To do so, the
+function must be used in the @code{prog eom} handler, because it is the
+only handler where the entire size of the message is known.   The
+@var{n} argument must contain the number of bytes in the email (or
+email bytes * number of recipients), and the @var{t} must be set to
+the number of bytes per microsecond a given user is allowed to send.  The
+@var{m} argument must be large enough to accommodate a couple of 
+large emails.  E.g.:
+
+@smallexample
+@group
+  prog eom
+  do
+    if not tbf_rate($f "-" $client_addr,
+                    message_size(current_message()),
+                    10240*1000000,  # At most 10 kb/sec
+                    10*1024*1024) 
+      tempfail 450 4.7.0 "Data sending rate exceeded. Try again later"
+    fi
+  done
+@end group
+@end smallexample
+
+@xref{Rate limiting functions}, for more information about
+@code{rateok} and @code{tbf_rate} functions.
 
 @node Greylisting
 @section Greylisting
 
   Greylisting is a simple method of defending against the spam
 proposed by Evan Harris.  In few words, it consists in recording the 
@@ -2499,12 +2691,55 @@ done
   In real life you will have to avoid greylisting some messages, in
 particular those coming from the @samp{<>} address and from the @acronym{IP}
 addresses in your relayed domain.  It can easily be done using the
 techniques described in previous sections and is left as an exercise
 to the reader.
 
+@anchor{greylisting types}
+@cindex greylisting types
+@cindex greylisting, traditional
+  @code{Mailfromd} provides two implementations of greylisting
+primitives, which differ in the information stored in the database.
+The one described above is called @dfn{traditional}.  It keeps in the
+database the time when the greylisting was activated for the given
+key, so the @code{greylist} function uses its second argument
+(@code{interval}) and the current timestamp to decide whether the key
+is still greylisted.
+
+@cindex greylisting, Con Tassios type
+@cindex Con Tassios greylisting type
+  The second implementation is named after its inventor,
+@dfn{Con Tassios}.  This implementation stores in the database the
+time when the greylisting period is set to expire, computed by
+@code{greylist} when it is first called for the given key, using the
+formula @samp{current_timestamp + interval}.  Subsequent calls to
+@code{greylist} compare the current timestamp with the one stored in
+the database and ignore their second argument.  This implementation is
+enabled by one of the following pragmas:
+
+@smallexample
+#pragma greylist con-tassios
+@end smallexample
+@noindent
+or
+@smallexample
+#pragma greylist ct
+@end smallexample
+
+  When the Con Tassios implementation is used, yet another function
+becomes available.  The function @code{is_greylisted} returns
+@samp{True} if its argument is greylisted and @samp{False} otherwise.
+It can be used to check for the greylisting status without actually
+updating the database:
+
+@smallexample
+  if is_greylisted($@{client_addr@} "-" $f "-" $@{rcpt_addr@})
+    @dots{}
+  fi
+@end smallexample
+
 @anchor{whitelisting}
 @cindex whitelisting
   One special case is @dfn{whitelisting}, which is often used
 together with greylisting.  To implement it, @command{mailfromd}
 provides the function @code{dbmap}, which takes two mandatory arguments:
 @code{dbmap(@var{file}, @var{key})} (it also allows an optional third
@@ -2726,37 +2961,52 @@ check for expired entries.
 the first field set to @code{success}, and a @dfn{negative expiration}
 period, applied to entries marked as @code{not_found}.
 
 @cindex rate database
 @item rate
 The mail sending rate data, maintained by @code{rate} function
-(@pxref{rate}).  A record consists of the following fields:
+(@pxref{Rate limiting functions}).  A record consists of the following fields:
 
-@enumerate 1
-@item
-Timestamp.  The time when the entry was entered into the database.  It
-is used to check for expired entries.
+@table @asis
+@item timestamp
+The time when the entry was entered into the database.
 
-@item
+@item interval
 Interval during which the rate was measured (seconds).
 
-@item
+@item count
 Number of mails sent during this interval.
+@end table
 
-@item
-Actual mail sending rate.
+@cindex tbf database
+@item tbf
+This database is maintained by the @code{tbf_rate} function (@pxref{TBF}).
+Each record represents a single bucket and consists of the following
+keys:
 
-@item
-Expected rate, i.e. the mail sending rate that would have been achieved if
-this sender had sent an email right now.
-@end enumerate
+@table @asis
+@item timestamp
+Timestamp of most recent token, as a 64-bit unsigned integer
+(microseconds resolution).
+
+@item expirytime
+Estimated time when this bucket expires (seconds since epoch).
+
+@item tokens
+Number of tokens in the bucket (@code{size_t}).
+@end table
 
 @cindex greylist database
 @item greylist
 This database is maintained by @code{greylist} function
 (@pxref{Greylisting}).  Each record holds only the timestamp.
+Its semantics depends on the greylisting implementation in
+use (@pxref{greylisting types}).  In the traditional implementation, it
+is the time when the entry was entered into the database.  In the Con
+Tassios implementation, it is the time when the greylisting period
+expires.
 @end table
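The way a `tbf` record evolves can be sketched as follows. This is an illustrative Python model of a token bucket based on the description in NEWS (one token per `interval` microseconds, at most `burst_size` tokens, new buckets start full), not the code in `mfd/tbf_rate.c`. The `Bucket` class, the `db` dictionary and the `now_us` argument are hypothetical, and the `expirytime` field is left out because its computation is not described here.

```python
from dataclasses import dataclass


@dataclass
class Bucket:
    timestamp: int  # time of the most recent token, in microseconds
    tokens: int     # number of tokens currently in the bucket


def tbf_rate(db, key, cost, interval, burst_size, now_us):
    """Return True if `cost` tokens could be taken from the bucket for `key`."""
    bucket = db.get(key)
    if bucket is None:
        # No bucket yet: create one holding `burst_size` tokens.
        bucket = Bucket(timestamp=now_us, tokens=burst_size)
        db[key] = bucket
    else:
        # Add one token per elapsed `interval`, capped at `burst_size`.
        new_tokens = (now_us - bucket.timestamp) // interval
        if new_tokens > 0:
            bucket.tokens = min(burst_size, bucket.tokens + new_tokens)
            bucket.timestamp += new_tokens * interval
    if bucket.tokens >= cost:
        bucket.tokens -= cost
        return True
    return False
```

Called with a cost of 1, an interval of 10000000 and a burst size of 20, as in the NEWS example, this model accepts 20 requests at once and then one more request every 10 seconds.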
 
 @node Basic Database Operations
 @subsection Basic Database Operations
 
 @cindex database, listing
@@ -3800,13 +4050,13 @@ function}).  There is a serious error, however: @code{hostname} is not
 a built-in function as it used to be in previous releases@footnote{Up to
 the version 1.3.91.}, and therefore it needs to be defined or required
 prior to using.  Otherwise it is no more than a literal, and the whole
 construct @samp{hostname($client_addr)} is regarded by @acronym{MFL}
 compiler as a concatenation of the string @samp{hostname} and the
 value of @samp{client_addr} Sendmail variable.  It is easy to see
-using @option{--dump-tree} option:
+using the @option{--dump-tree} option:
 
 @smallexample
 $ @kbd{mailfromd --dump-tree test.mf}
 State handlers:
 ---------------
 envfrom:
@@ -3883,13 +4133,13 @@ prog envfrom
 do
   echo "X is %x"
 done
 @end smallexample
 
 Does @code{%x} in @code{echo} refer to the variable or to the
-constant?  The correct answer is @samp{to the constant}; when executed this
+constant?  The correct answer is @samp{to the variable}.  When executed this
 code will print @samp{X is X}.  
 
 The reason for such a name clash is entirely artificial.  Indeed, the
 following code does not produce any ambiguity:
 
 @smallexample
@@ -3899,12 +4149,13 @@ string x "X"
 prog envfrom
 do
   echo "X is " x
 done
 @end smallexample
 
+This is because @code{x} alone can refer only to a constant.
 Problems begin when we need to expand a constant in a
 literal string.  The only way to do so is by prefixing its name with a
 @samp{%}, just as if it were variable, and that produces the
 ambiguity.
 
 As of version @value{VERSION}, @command{mailfromd} will always print a
@@ -4112,12 +4363,13 @@ and @samp{#pragma regex} controls the compilation of regular expressions.
 @menu
 * option::      Pragma option.
 * database::    Pragma database.
 * stac